Advances in Generative AI for HPC Science and Engineering: Foundations, Challenges, and Opportunities

All accepted papers for this peer-reviewed workshop will be included in the SC26 Proceedings. The deadline for submission of 5-page papers is July 10, 2026.

  • Workshop Date and Times (tbd)
  • Location: SC’26 Workshop (November 15-20, 2026; Chicago, USA)

The landscape of AI for science has shifted dramatically. Beyond solely building ever-larger models from scratch, the scientific community is increasingly leveraging powerful pre-trained models, adapting them through fine-tuning, domain specialization, and integration into complex AI systems that combine reasoning models, knowledge graphs, agentic workflows, and connections to simulators and experimental instruments. These shifts emphasize the importance of a full stack of interdependent challenges for developers, scientists, and HPC centers, ranging from strategies and infrastructure for sharing data and computing, to frameworks that orchestrate multiple models and agents, to strategies, methods, and tools for ensuring safety and alignment and for rigorously evaluating models' scientific reasoning and skills. Progress in any one of these areas both depends on and accelerates the others. This workshop will highlight advances across this interconnected landscape, with an emphasis on openness, international collaboration, and community-driven efforts. The workshop will feature peer-reviewed papers and invited talks spanning shared infrastructure, open model development, agentic AI systems for science, evaluation and safety, and driving challenge applications.

This proposed SC’26 workshop is being organized by leaders of the International Trillion Parameter Consortium, a grass-roots community of nearly 1800 participants from more than 100 organizations worldwide.


Workshop Structure

This workshop aims to give novices and experts alike opportunities to explore collaboration toward building very large-scale AI models and systems for science and engineering. The workshop will consist of four sessions, each including a 20-minute invited talk and multiple 15-minute ‘lightning’ talks. Lightning talks will be selected through peer review from a call for papers soliciting 5-page papers for the SC’26 proceedings.

International, multi-institutional, and multi-disciplinary collaboration is a high priority for this workshop. Thus, the review criteria prioritize high-quality submissions from such teams. An equally important goal of this workshop is to attract early- and mid-career scientists from across the SC community, including academics and educators, HPC and AI provider companies and organizations, computer and computational scientists, disciplinary scientists, and those involved in career development and training programs.

The workshop will be moderated by an organizing team comprising Charlie Catlett (Argonne National Laboratory and The University of Chicago, USA), Javier Aula-Blasco (Barcelona Supercomputing Center, Spain), Kyoungsook Kim (AIST, Japan), Gabrielle Scipione (CINECA, Italy), Valerie Taylor (Argonne National Laboratory and The University of Chicago, USA), and Rio Yokota (Institute for Science Tokyo, Japan).

A review team will include these six organizers as well as at least nine additional TPC leaders.


Call for Papers

The science community has been on a steep learning curve during the past several years as it works to harness generative AI. Discipline-specific foundation models continue to proliferate, ranging from Microsoft’s Aurora for Earth system forecasting to Meta’s Open Molecules (OMol25) dataset and models for atomistic simulation of materials and chemistry, to GridFM, an international consortium of over 100 organizations developing a foundation model for the electric power grid.

Open model efforts have gained significant momentum, highlighted by the NSF and NVIDIA $152 million investment in Ai2’s Open Multimodal AI Infrastructure to Accelerate Science (OMAI) project and by Japan’s Fugaku-LLM, led by Rio Yokota at Institute for Science Tokyo with RIKEN and Fujitsu, which demonstrated that large-scale model training can be accomplished on non-GPU supercomputing architectures. Models from industry sources such as OpenAI, Meta, Google DeepMind, Anthropic, DeepSeek, and xAI have introduced powerful reasoning capabilities, with varying degrees of openness and opportunities for collaboration with the science community. DeepSeek’s R1 model, in particular, demonstrated that efficient training innovations and open-weight release can dramatically lower barriers to advanced reasoning, influencing research worldwide. Rather than solely pursuing ever-larger models trained from scratch, the community has increasingly shifted toward leveraging these powerful pre-trained models, adapting them through fine-tuning and domain specialization and integrating them into complex AI systems that combine reasoning models, knowledge graphs, agentic workflows, and connections to simulators and experimental instruments. New architectures such as Mixture of Experts, the rapid adoption of multi-agent orchestration strategies, and emerging interoperability standards such as Anthropic’s Model Context Protocol (MCP) and Google’s Agent-to-Agent Protocol (A2A) have dramatically improved AI capabilities, while the science community continues to evaluate and pursue improvements in scientific reasoning.

This potential has motivated substantial community-wide efforts, including multiple working groups within the Trillion Parameter Consortium (e.g., scientific data preparation, training and inference performance optimization, scientific skills and reasoning evaluation, safety and alignment, and discipline-specific groups ranging from biology to materials science). Increased partnerships with the AI industry emerged throughout 2025, including the “1000 Scientist AI Jam Session” organized by nine U.S. Department of Energy laboratories with OpenAI and Anthropic, which was replicated and improved upon in Japan with a “Japan Scientist AI Jam Session 2025” organized by RIKEN and the Institute for Science Tokyo with industry partners OpenAI, Anthropic, Google, and AWS. TPC’s own growth, from its founding in 2023 to its inaugural all-hands conference (TPC25) attracting 360 participants in San Jose, reflects the community’s recognition that these challenges must be addressed collectively.

This potential for scientific transformation has similarly stimulated national and international efforts to accelerate AI adoption and advances. In late 2025, with the announcement of the U.S. Genesis Mission’s goal to accelerate the application of AI for transformative scientific discovery, the U.S. Department of Energy (DOE) was charged with national community leadership in cooperatively developing the necessary platform to combine AI capabilities with federal data and experimental infrastructure. Recently, DOE announced the 26 science and engineering challenges that will motivate the development of a unified platform. This effort, in both its shared platform elements and its accompanying scientific challenges, aligns strongly with the goals of the TPC.

High-Performance Computing continues to be at the center of these endeavors, but the demands on HPC centers are rapidly evolving. Beyond the enormous cost of model training, centers now face growing requirements for inference services to provide broad scientific communities with access to the latest reasoning and foundation models. New AI-coupled workflows combining traditional simulation with real-time AI inference, agentic orchestration, and connections to experimental instruments create data flow and scheduling challenges that were not fully contemplated in the design stages of conventional HPC architectures and operational models. At the same time, the shift toward complex AI systems that orchestrate reasoning models, domain foundation models, agentic workflows, and laboratory instruments demands new software infrastructure and frameworks that go well beyond model training alone. These trends offer significant opportunities to leverage TPC events and relationships to collaborate, both to accelerate progress toward optimal tools and methods and to enable groups to strategically reduce duplication of effort, such as in data preparation of new scientific data sources. Moreover, many new challenges are coming to the fore, including new ways of thinking about data sharing and attribution, about licensing artifacts, about embedding safety and alignment directly into training and deployment processes, and about the importance of responsible development of AI models and systems operating at extreme scale and in high-impact scientific settings. An overarching need today in AI is openness. This includes open data, open source code for tools and workflows, open evaluation suites for assessing model skills, knowledge, reasoning, and safety, and careful thought as to how, when, and whether to open the models themselves. Such openness is critical to progress in every area related to AI for science.

The enormity of these challenges, and of the resources needed for data preparation, pre-training new models, building the software infrastructure to support frontier AI systems, and responsibly preparing them for downstream applications, has meant that progress is largely concentrated in industry, where there is limited or, in some cases, no visibility into the artifacts (models, data sets) or the processes used to create them. This underscores the need for collaboration in the open science community, which is central to the motivation behind creating the international Trillion Parameter Consortium (TPC). Equally critical is the workforce required to advance this full stack of challenges: emerging roles span the frontier AI pipeline from data curation and model training to agentic system design and safety engineering, and developing this workforce across all career stages is essential for sustained progress.

This workshop aspires to stimulate new thinking, attract scientists to the emerging challenges associated with frontier AI systems for science, and potentially catalyze the formation of new topical collaborative working groups supported by TPC. Building on this rapidly evolving landscape, the TPC workshop is poised to serve as a vital nexus for cross-disciplinary dialogue and innovation. We anticipate that by fostering transparent discussions and collaborative initiatives, participants will not only refine their understanding of the evolving challenges but also forge actionable partnerships and form new working groups. These outcomes aim to bridge the gap between industry-led advancements and open scientific research, ultimately shaping standardized practices for data sharing, open evaluation, ethical model development, resource-efficient training, and the responsible deployment of increasingly autonomous AI systems for science.


TOPICS OF INTEREST

Topics below are of particular interest, but are not intended to be an exclusive or exhaustive list. Papers from collaborative teams, particularly involving scientists from multiple sectors (industry, academia, national laboratories, etc.), multiple scientific disciplines, and multiple countries will be prioritized for acceptance for this workshop.

Infrastructure to Enable Shared Data and Computing

  • Scalable methods to curate, preprocess, store, and share scientific training data across institutions and disciplines.
  • Shared computing infrastructure and resource allocation strategies for foundation model training and fine-tuning.
  • High-throughput data pipelines for integrating observational, simulation, and experimental data from diverse scientific domains.
  • Data governance, attribution, and licensing frameworks for open scientific AI resources.

Open Frontier Models

  • Distributed training approaches (e.g., pipeline parallelism, tensor parallelism, mixture-of-experts) for building frontier-scale open models across partner institutions.
  • Optimization techniques for large-scale model training, including mixed-precision training, adaptive checkpointing, and gradient compression.
  • Domain-specific foundation models and fine-tuning strategies for fields such as Earth systems, materials science, biology, chemistry, and energy.
  • Openness practices for model weights, training data, code, and documentation to enable transparency, reproducibility, and scientific reuse.

Open Frontier AI Systems

  • Architectures for frontier AI systems that integrate reasoning models, domain foundation models, knowledge graphs, and agentic workflows.
  • Multi-agent orchestration strategies and interoperability standards for coordinating models, tools, simulators, and experimental instruments.
  • AI-augmented scientific discovery including autonomous laboratories, hypothesis generation, and AI-driven experimental design.
  • Approaches for incorporating state-of-the-art closed and open reasoning models into scientific AI system compositions.

Software Infrastructure and Frameworks

  • Middleware and frameworks for training, deploying, and serving frontier-scale AI models and complex multi-model systems.
  • HPC inference services: methods to reduce latency, manage memory, and provide broad scientific communities with access to the latest models.
  • Software integration with experimentation platforms, laboratory instruments, and real-world scientific environments.
  • Tools for managing new AI-coupled HPC workflows including hybrid simulation-inference pipelines and agentic data flows.

Open Suite for Evaluating Model Skills, Knowledge, Reasoning, and Safety

  • Benchmarks and metrics for assessing scientific reasoning, domain knowledge, and agentic capabilities of frontier models and AI systems.
  • Evaluation of AI as a research assistant: literature synthesis, hypothesis generation, cross-disciplinary knowledge connection, and experimental design.
  • Methods for evaluating creativity, extended reasoning with deep data access, and generalization across scientific domains.
  • Scalable red-teaming and stress-testing of AI systems leveraging high-performance computing.

Driving Challenge Applications

  • Community-driven scientific challenge applications that exercise and evaluate shared infrastructure, open models, and frontier AI systems.
  • Real-world case studies demonstrating the impact of generative AI across diverse scientific and engineering domains, including industry applications.
  • Multi-scale and multi-physics applications bridging molecular dynamics to global-scale phenomena using AI-augmented simulation.
  • Approaches for selecting, coordinating, and learning from challenge applications across disciplines without centrally picking winners.

Training- and Deployment-Level Safety and Alignment

  • Methods to embed safety and alignment directly into the training and deployment of frontier-scale models and AI systems.
  • System-level mechanisms for maintaining alignment with scientific objectives and broader societal values in complex multi-model compositions.
  • Bias detection and mitigation, adversarial robustness, and explainability techniques for high-impact scientific AI settings.
  • Uncertainty quantification and physically consistent AI to ensure reliability in safety-critical scientific applications.

Workforce Development

  • Emerging and evolving roles across the frontier AI stack, from data curation and model training to agentic system design and safety engineering.
  • Training programs, curricula, and mentorship models for developing AI-for-science capabilities across all career stages.
  • Experiences and lessons learned from workforce development initiatives at HPC centers, universities, and national laboratories.
  • Strategies for broadening participation and growing leadership among early- and mid-career scientists in the AI/HPC community.

SUBMISSION GUIDELINES

Submissions must follow the SC’26 paper formatting guidelines; however, the main text of a submitted paper is limited to five content pages (no exceptions), including all figures and tables. References and optional technical appendices, with additional results, figures, and graphs, do not count as content pages. There is no page limit for the technical appendices, and reviewers are not required to review them. Authors are encouraged to participate in the SC Reproducibility Initiative. In keeping with the TPC goal of creating both national and international collaborations among institutions and disciplines, the review scoring rubric takes into account the composition of author teams. All submissions will be reviewed by the program committee, following SC’s double-blind review process, according to the schedule below. Each deadline is 11:59 p.m. Anywhere on Earth (AoE) on the specified date.

Deadlines

  • 2 April 2026: Submissions Open
  • 10 July 2026: Submissions Close
  • 24 July 2026: AD Appendix Due (encouraged but optional)
  • 28-30 July 2026: Review/Rebuttal Period
  • 3 August 2026: Notifications Sent
  • 28 August 2026: Final Paper Due