
TPC Spring 2025 Hackathon
6-8 May 2025
- May 6-7: Hackathon hosted by CSC-IT Center for Science, Espoo, Finland (Venue: Clarion Hotel Helsinki)
- May 8: Hackathon continues en route to the Kajaani Data Center (Lumi Supercomputing Facility), followed by a site visit
The hackathon in Espoo will begin at 9am on May 6 and conclude by 4:30pm on May 7. On May 8 the hackathon groups will combine and continue their work on an Internet-connected rail car, beginning at 8am, en route to tour the Lumi supercomputing facility. A reception and tour of the Lumi facility will take place from 3-5pm on May 8th.
Background: The Trillion Parameter Consortium
The Trillion Parameter Consortium (TPC) convenes the community for the purpose of identifying and pursuing collaborations that will accelerate progress on responsibly and safely developing large-scale AI models for scientific discovery while expanding and diversifying the scientific AI community itself. In June 2024 at the European Kick-Off workshop in Barcelona, the groups shown below began to develop plans for collaborative projects. Further progress by TPC-facilitated collaborations and participating scientists was highlighted at the SC’24 TPC Workshop, attended by some 200 participants in November 2024.
This hackathon will convene multiple TPC working groups for hands-on work ranging from key decisions regarding responsibilities and approaches to running tests and benchmarks. Substantive outcomes are expected, as documented in the report from the Fall 2024 TPC Hackathon hosted by Argonne National Laboratory and The University of Chicago in October 2024.
Participation
Multiple working groups have formed during the past 18 months of TPC boot-up, developing collaborative plans that range from data sharing strategies to optimizing training pipelines and developing scientific skills evaluation frameworks. These self-assembled teams plan to meet, using this open call for participants to identify and invite people who can help to accelerate progress. The meetings will be hands-on and in-person, with limited opportunities for remote participation at the discretion of group co-organizers.
Program
The program will begin with a two-day intensive hackathon with four breakout groups, followed by a combined hackathon held on an Internet-connected rail trip to tour the Lumi supercomputing facilities in Kajaani, Finland.
Day 1 (Tuesday, 6-May)
- 09:00-11:00 General Opening with Invited Talks and Detailed Plans and Updates for Hackathon Sessions
- Welcome and Orientation – Per Öster, Kimmo Koski
- Presentation (10 min talk): Data Pipes Hacking Goals – Charlie Catlett, Ian Foster
- Presentation (10 min talk): MAPE Hacking Goals – Gokcen Kestor, Marc Clascà
- Presentation (10 min talk): HPC User AI Assistants Goals – Aleksi Kallio
- Presentation (30 min talk; 30 min discussion): Implementing the Google Co-Scientist (and how you can do it) – Arvind Ramanathan, Rick Stevens
- Open Mic – 30 min
- 11:00-13:00 Parallel hackathon sessions
- The Life Sciences Co-Scientist: Rick Stevens (Argonne), Arvind Ramanathan (Argonne), Neeraj Kumar (PNNL)
- Data, Evaluation, and Training Pipelines: Charlie Catlett (Argonne), Ian Foster (Argonne)
- Model Architecture and Performance Evaluation: Marc Clascà (BSC), Gokcen Kestor (BSC)
- HPC User AI Assistants: Aleksi Kallio (CSC)
- 13:00-14:00 Lunch
- 14:00-17:00 Parallel hackathons continue
- 18:00-19:30 Welcome reception
Day 2 (Wednesday, 7-May)
- 09:00-12:00 Parallel Hackathon Sessions
- 12:00-13:00 Lunch
- 13:00-16:30 Parallel Hackathon Sessions
Day 3 (Thursday, 8-May)
- 08:00 Convene at Helsinki Rail Station (under the large information display)
- 08:15 Train Departs for Kajaani
- 09:00-09:30 Overview of Combined Hackathon Objectives and Team Assignments
- 09:30-11:30 Small (3-4 person) groups undertake hackathon assignments
- 11:30-13:00 Lunch
- 13:00-14:00 Groups report out
- 14:35-14:45 Arrive Kajaani, transit to Kajaani Data Center
- 14:45-16:30 Kajaani Data Center Tour
- 16:30-17:00 Transit to Hotel
- 18:30 Group Reception and Dinner
The hackathon program will end with dinner on Thursday. Most participants will fly home on Friday.
Working Group Breakout Descriptions
Four breakouts will run simultaneously on days 1-2, and on day 3 these groups will combine for a group hackathon and discussions.
Group A: The Life Sciences Co-Scientist
This group will build on the agentic co-scientist systems created at the TPC Winter Hackathon in Kobe, applying the system to selected life sciences discovery challenges.
This hackathon group will build on prior work to flesh out an advanced, multimodal dataset for each open reading frame (ORF) in the Mycobacterium tuberculosis (Mtb) genome. The objective is to create a resource that combines several layers of biological information (phylogenetics, multiple sequence alignments, pinned genomic diagrams, functional context, transcriptomic profiles such as read pileups, and proteomic interaction data), all cross-referenced to ensure compatibility with major databases.
Sub-teams will tackle different facets of the agentic co-scientist system, such as generating phylogenetic trees to illustrate evolutionary relationships, performing sequence alignments to compare ORFs across various Mtb strains, visualizing ORFs in their genomic context using pinned diagrams, and integrating functional annotations and linking them to biological pathways.
The group will also develop a plan for extending this framework to other pathogens, including viruses.
Group B: Data, Evaluation, and Pipelines
This group will focus on improving and optimizing specific pipelines for tasks including (a) generating multiple choice questions (MCQs) from scientific papers, (b) tasking multiple AI models to answer the MCQs and using AI models to evaluate those answers, and (c) preparing and using the MCQs to fine-tune pre-trained models. The group will investigate converting this workflow pipeline into an agentic system. It will also explore alternate ways to prepare training data, such as extracting facts from papers and then using the target model (the one to be fine-tuned) to select new knowledge, that is, facts the model does not already know.
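The three stages above can be pictured as a small pipeline. What follows is only a minimal sketch under stated assumptions, not the group's actual code: the `MCQ` record, the callable "model" interface, the exact-match judge, and the prompt/completion record format are all hypothetical, and the toy generator and toy models stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class MCQ:
    question: str
    choices: List[str]
    answer_index: int  # index of the correct choice


# Hypothetical interfaces: a generator turns paper text into MCQs,
# a model picks a choice index, a judge marks an answer right or wrong.
Generator = Callable[[str], List[MCQ]]
Answerer = Callable[[MCQ], int]
Judge = Callable[[MCQ, int], bool]


def generate_mcqs(paper_text: str, generator: Generator) -> List[MCQ]:
    """Stage (a): delegate MCQ generation to a model-backed generator."""
    return generator(paper_text)


def exact_match_judge(mcq: MCQ, chosen: int) -> bool:
    """Simplest possible judge; an AI-based judge would replace this."""
    return chosen == mcq.answer_index


def score_models(mcqs: List[MCQ], models: Dict[str, Answerer], judge: Judge) -> Dict[str, float]:
    """Stage (b): every model answers every MCQ; return accuracy per model."""
    scores = {}
    for name, model in models.items():
        correct = sum(1 for q in mcqs if judge(q, model(q)))
        scores[name] = correct / len(mcqs)
    return scores


def to_finetune_records(mcqs: List[MCQ]) -> List[Dict[str, str]]:
    """Stage (c): render MCQs as prompt/completion pairs for fine-tuning."""
    records = []
    for q in mcqs:
        options = "\n".join(f"{chr(ord('A') + i)}. {c}" for i, c in enumerate(q.choices))
        records.append({
            "prompt": f"{q.question}\n{options}\nAnswer:",
            "completion": chr(ord("A") + q.answer_index),
        })
    return records


def toy_generator(paper_text: str) -> List[MCQ]:
    """Stand-in for an LLM call; ignores the text and returns fixed questions."""
    return [
        MCQ("Which unit measures force?", ["joule", "newton", "watt"], 1),
        MCQ("Which particle carries negative charge?", ["proton", "neutron", "electron"], 2),
    ]


def demo():
    mcqs = generate_mcqs("placeholder paper text", toy_generator)
    models = {"always_first": lambda q: 0, "oracle": lambda q: q.answer_index}
    return score_models(mcqs, models, exact_match_judge), to_finetune_records(mcqs)


if __name__ == "__main__":
    print(demo())
```

Keeping each stage behind a plain callable is one way to make the later move to an agentic system easier, since any stage can be swapped for an agent without touching the others.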
Group C: Model Architecture and Performance Evaluation: Hands-On Model Performance Profiling
As the scale and complexity of AI models continue to grow, understanding and optimizing model performance is becoming increasingly critical. This working group will focus on practical tools and methodologies for profiling large AI models, identifying bottlenecks, and analyzing performance at both the system and application level.
On Day 1, we will introduce the Paraver profiling tool and provide a hands-on tutorial on how to instrument AI models using the NVIDIA Tools Extension SDK (NVTX) events. Participants will learn how to generate traces and visualize them using Paraver to gain actionable insights into model behavior. In the afternoon, attendees will have the opportunity to analyze their own collected traces with guidance from the organizers.
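As a rough idea of what NVTX instrumentation looks like in Python, the sketch below wraps the phases of a toy training step in named ranges. It assumes NVIDIA's `nvtx` Python package and falls back to a no-op when that package is not installed; the `forward`, `compute_loss`, and `train_step` functions are hypothetical stand-ins for real model code, and how the resulting ranges are collected into a trace for Paraver is left to the tutorial.

```python
import contextlib

try:
    import nvtx  # NVIDIA's Python NVTX bindings; optional for this sketch

    def annotate(name):
        # Opens a named NVTX range for the duration of the "with" block.
        return nvtx.annotate(message=name)
except ImportError:
    def annotate(name):
        # Fallback so the sketch still runs where NVTX is not installed.
        return contextlib.nullcontext()


def forward(batch):
    """Hypothetical stand-in for a model's forward pass."""
    return [x * 2.0 for x in batch]


def compute_loss(outputs):
    """Hypothetical stand-in for loss computation."""
    return sum(outputs) / len(outputs)


def train_step(batch):
    # Each phase gets its own range so it appears separately in the trace.
    with annotate("forward"):
        outputs = forward(batch)
    with annotate("loss"):
        loss = compute_loss(outputs)
    return loss


def run(num_steps=3):
    losses = []
    for step in range(num_steps):
        # An outer range per step makes step boundaries easy to spot.
        with annotate(f"step_{step}"):
            losses.append(train_step([1.0, 2.0, 3.0]))
    return losses


if __name__ == "__main__":
    print(run())
```

Nesting a per-step range around per-phase ranges is a common choice because it lets a trace viewer show both where a step begins and which phase dominates it.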
On Day 2, we will shift to presentations and discussion. Invited speakers from institutions such as CINECA, RIKEN, CSC, and BSC will share recent work on profiling LLMs and explore how profiling approaches are evolving to support multimodal models. This session will include discussions on early vs. late fusion strategies, training stages, architectural differences, and data-related challenges in multimodal systems. The day will conclude with an open forum for discussing key challenges, research directions, and opportunities for collaboration on performance profiling tools and techniques for the next generation of AI models.
Group D: HPC User AI Assistants
This group will work on concepts and prototypes for AI assistants to help HPC users. The basic form would be a chatbot augmented with public user documentation. A more advanced version would be an AI assistant with access to the user’s own data (home directory etc.) in addition to documentation. The goal is to enable the model to answer questions such as “why did my last job fail” by figuring out where to find the relevant error message and then reasoning about the causes.
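To make the "why did my last job fail" scenario concrete, here is a minimal sketch of the retrieval half of such an assistant, with no model call. Everything in it is an assumption for illustration: the error keywords, the keyword-overlap ranking (a real prototype would more likely use embeddings), the documentation snippets, and the made-up log text.

```python
import re
from typing import Dict, List

# Hypothetical keywords that mark a log line as worth showing to the model.
ERROR_PATTERN = re.compile(r"error|killed|out of memory|time limit|cancelled", re.IGNORECASE)


def extract_error_lines(log_text: str, max_lines: int = 5) -> List[str]:
    """Keep only the last few log lines that look like errors."""
    hits = [ln.strip() for ln in log_text.splitlines() if ERROR_PATTERN.search(ln)]
    return hits[-max_lines:]


def tokenize(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: Dict[str, str], k: int = 1) -> List[str]:
    """Rank documentation snippets by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda name: len(q & tokenize(docs[name])), reverse=True)
    return ranked[:k]


def build_prompt(question: str, log_text: str, docs: Dict[str, str]) -> str:
    """Assemble the context a chat model would receive."""
    errors = extract_error_lines(log_text)
    names = retrieve(" ".join(errors) or question, docs)
    context = "\n".join(f"[{n}] {docs[n]}" for n in names)
    return (
        f"Question: {question}\n"
        f"Relevant log lines:\n" + "\n".join(errors) + "\n"
        f"Documentation:\n{context}\n"
    )


# Made-up documentation snippets and log text, for illustration only.
DOCS = {
    "memory": "If a job runs out of memory, request more memory per node or reduce the batch size.",
    "time": "Jobs exceeding the time limit are cancelled; request a longer time limit.",
}
LOG = "epoch 1 done\nRuntimeError: out of memory while allocating tensor\n"

if __name__ == "__main__":
    print(build_prompt("Why did my last job fail?", LOG, DOCS))
```

Separating log extraction from documentation retrieval mirrors the two access levels described above: the basic chatbot needs only `retrieve`, while the advanced assistant also reads the user's own files.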
On Day 3 a joint discussion and hackathon exercise will be undertaken by all participants, breaking into groups of 3-4 for each of several sub-tasks. This day will be held in a meeting room carriage on a 6-hour train route to the Kajaani Data Center.
Resources for Participants
Multiple resources are being made available for the hackathon, including a 50-GPU NVIDIA cluster, a Cerebras system, and access to LUMI.
All participants should already be using the hackathon-finland-2025 channel in the TPC Slack workspace. This channel is where all technical support will take place before and during the hackathon.
If you were not able to join the NVIDIA primer on using the cluster they are providing, please watch the recording (link is in the Slack channel).
Registration and Logistics
Participants are responsible for all of their travel arrangements and costs.
Registration (Now CLOSED)
Day 1 and 2 registration is €200 ($215) to cover meeting and catering costs.
Day 3 is an additional €30 ($35) for rail fare.
Due to space limitations in the meeting rail car, the Kajaani trip on day 3 can accommodate only the first 35 registrants.
Hackathon Venue and Lodging
May 6-7
The hackathon sessions on May 6 and 7 will be held at the Clarion Hotel Helsinki. Please use the discount code “EVENTSHELSINKI” to get a 15% discount off the regular rate. Other hotel options near the Clarion include Radisson Blu Seaside (8 min walk), Hotel AX (8 min walk), and Home Hotel Jugend (15 min walk).
May 8
Transit: Doors will close on the train to Kajaani at 8:15am. Participants will convene at the Helsinki Päärautatieasema (main rail station) between 7:45 and 8:00 am. The station is a 35-minute walk (14 min by public transit, 12 min by taxi) from the Clarion. Participants will meet in the open area in front of the main departure status board located near track 7.
Lodging: The recommended hotel for the visit to the Kajaani Data Center is Hotel Valjus.
Program Committee
- Charlie Catlett (Argonne National Laboratory, USA)
- Marc Clascà (Barcelona Supercomputing Center, Spain)
- Ian Foster (Argonne National Laboratory, USA)
- Aleksi Kallio (CSC IT Center for Science, Finland)
- Gokcen Kestor (Barcelona Supercomputing Center, Spain)
- Kimmo Koski (CSC IT Center for Science, Finland)
- Per Öster (CSC IT Center for Science, Finland)
- Neeraj Kumar (Pacific Northwest National Laboratory, USA)
- Pekka Manninen (CSC IT Center for Science, Finland)
- Arvind Ramanathan (Argonne National Laboratory, USA)
- Rick Stevens (Argonne National Laboratory, USA)

