Will AI for Science Have a 'ChatGPT Moment'? Insights for Young Innovators

Explore the potential of AI for Science and its implications for young entrepreneurs in the evolving landscape of AI-driven research.

As AI reshapes the foundational logic of research and industry, AI for Science is no longer just a theoretical concept. On April 28, the Future Light Cone collaborated with Beijing Zhongguancun Academy’s AI Business School to launch the “AI for Science Innovators Dialogue Series.” The first event featured three frontline guests, including Zheng Shuxin, an associate professor and co-director of the AI Business School, who provided solid data and insights to address three pressing questions: Will AI4S experience a ‘ChatGPT moment’? What barriers do entrepreneurs face? How should young people invest their efforts?

The Essence of Large Models: Intelligence Through Compression

What drives the general intelligence of large models? Ilya Sutskever, former chief scientist at OpenAI, put it succinctly: “Intelligence arises from compression.” A model's intelligence comes from its ability to compress vast amounts of human language data into a relatively small parameter space. In the process, the model is compelled to distill common structures and inherent representations from the data, and intelligence emerges.

Take the first version of GPT-3, with 175 billion parameters, which set out to model a very large share of the text humanity has written. If it simply memorized that text, it would be no more than a hard drive, and a hard drive is not intelligent. Because the data must instead be squeezed into a comparatively small parameter space, the model is forced to extract common structures and representations, and intelligence emerges from that compression.

This idea has a rigorous theoretical foundation in Kolmogorov complexity, which measures the complexity of a dataset by the length of the shortest program that outputs it. For example, a dataset consisting entirely of zeros has a very simple internal structure, so a single line of Python code can reproduce it. Kolmogorov complexity itself is uncomputable, but the next-word prediction paradigm of large language models can be seen as a practical approximation of that shortest program: the better a model predicts the next token, the fewer bits are needed to encode the text.
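The link between structure, prediction, and compression can be made concrete with a small sketch. It is an illustration under stated assumptions, not a measurement of true Kolmogorov complexity: `zlib` only gives a computable upper bound, and the helper names `compressed_size` and `adaptive_code_length_bits` are hypothetical, introduced here for the example. The second helper uses a toy next-byte predictor (smoothed counts of the bytes seen so far) and adds up the bits an ideal coder would spend on each prediction.

```python
import math
import os
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    """Size in bytes after zlib compression (an upper-bound proxy for complexity)."""
    return len(zlib.compress(data, 9))


def adaptive_code_length_bits(data: bytes) -> float:
    """Bits needed by an ideal coder driven by a toy next-byte predictor.

    The predictor assigns each possible next byte a probability from
    Laplace-smoothed counts of the bytes seen so far; a symbol predicted
    with probability p costs -log2(p) bits.
    """
    counts = Counter()
    seen = 0
    bits = 0.0
    for byte in data:
        p = (counts[byte] + 1) / (seen + 256)  # predicted probability of this byte
        bits += -math.log2(p)
        counts[byte] += 1
        seen += 1
    return bits


n = 10_000
zeros = bytes(n)       # highly structured: a one-line program can regenerate it
noise = os.urandom(n)  # no structure for a compressor or predictor to exploit

print("zlib bytes     :", compressed_size(zeros), "vs", compressed_size(noise))
print("predictor bytes:",
      round(adaptive_code_length_bits(zeros) / 8),
      "vs",
      round(adaptive_code_length_bits(noise) / 8))
```

In both measurements the all-zero data shrinks to a tiny fraction of its original size while the random bytes barely shrink at all, which is the sense in which better prediction and better compression are the same thing.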

This also sets a ceiling, however: human knowledge. A model that learns only from what humans have written cannot, on this view, ultimately surpass them. AI for Science is charting a completely different path.

Two Core Paths of AI4S

AI4S does not engage with human language; it directly studies physical laws, biological processes, and molecular conformations, compressing the data of nature itself rather than “how humans describe nature.”

A prime example is AlphaFold, work that earned its creators a share of the 2024 Nobel Prize in Chemistry. What does it do? Put simply, it finds correlations within natural data. Once the Protein Data Bank (PDB) had accumulated hundreds of thousands of protein structures, a model could learn the mapping from sequence to three-dimensional structure, effectively “solving” the protein structure prediction problem.

Here lies a core analytical framework, the two legs of AI4S:

  • Scientist: Engages with literature, formulates hypotheses, and designs experiments, essentially combining language intelligence, knowledge integration, and logical reasoning. Its strengths lie in reasoning and knowledge, while its weakness is a lack of direct understanding of the physical world. Representatives include OpenAI, Anthropic, and DeepMind.

  • Simulator: Uses AI to fit the laws of the physical world through data-driven methods. Its strengths lie in modeling the world itself, which cannot be achieved merely by stacking parameters. However, it lacks explicit knowledge chains and reasoning capabilities. Representatives include AlphaFold and various meteorological models.

The ultimate goal of large models is AGI (Artificial General Intelligence), while the vast potential of AI4S lies in breaking the boundaries of human cognition—the universe is unknown, and only the Simulator path theoretically allows AI to explore what humanity has yet to discover.

However, today, the Simulator cannot solve all problems on its own—it lacks logic and reasoning. Relying solely on either path is insufficient. The true endgame of AI4S is the convergence of both paths: the ability to reason and formulate hypotheses like top scientists while directly understanding the physical world itself.

This is why I repeatedly emphasize that AI for Science requires more than just larger models. Even if you scale GPT up by 100 times, it won’t automatically understand how a protein folds or how a cloud evolves.

Currently, no single team possesses both ends, which presents an opportunity.

AI4S Will Not Experience a Unified ‘ChatGPT Moment’

My core judgment is that AI4S will see continuous breakthroughs, but it will not be a single moment of universal celebration; its progress resembles a highly uneven map.

In a given field, the more it meets the criteria of “clear problem structure + sufficient data + short validation loop,” the faster AI4S will advance there.

  • Protein Folding: In this area, both the Scientist and Simulator paths have produced significant results. AlphaFold answers “what proteins look like,” while DiG and BioEmu address “how proteins move”—one captures still images, while the other creates movies. Only by producing the movie can the functional mechanisms of proteins be truly explained.

  • AI Drug Discovery: This field has crossed a critical threshold. More than 200 AI-derived drug programs are in clinical pipelines, with reported Phase I success rates of 80%-90%, roughly double those of traditional methods. The first AI-discovered drug has shown efficacy in a Phase II clinical trial, and a crucial window of data readouts is expected in 2026-2027.

  • AI Meteorology: Chinese players are leading globally. Huawei’s Pangu, Fudan’s Fuxi, and the Fengwu model continue to make breakthroughs; Fengwu produces skillful forecasts out to 11.25 days, making it the first model worldwide to break the 10-day forecasting barrier.

  • Materials Science: This field is evolving from merely screening known compounds to designing unprecedented molecules from scratch. The most critical signal for 2025-2026 is that frontline model developers are beginning to truly believe in the tools at their disposal. Though this field is still early-stage, the potential value is immense once breakthroughs occur.

Barriers for Entrepreneurs Amidst Major Players Entering AI4S

An undeniable fact is that the six major AI giants—OpenAI, Anthropic, Google DeepMind, Microsoft, NVIDIA, and Meta—are all entering the AI4S arena.

Even OpenAI is developing a specialized life sciences model, GPT-Rosalind, and Anthropic is investing heavily in Claude for Life Sciences. Both moves signal a quiet retreat from the narrative that “a universal model can solve everything.”

With these giants entering the field, where can entrepreneurs build defensible barriers? My answer is clear: the moat lies not in prompts and workflows, but in scientific capability, closed-loop data, and depth of industry integration.

It’s essential to clarify which game you are playing:

  • Product-oriented: Competing on rapid iteration and user stickiness, with validation cycles from days to weeks, represented by Manus and Cursor.

  • Resource-oriented: Competing on depth of industry integration and client resources, with validation cycles from quarters to years, represented by traditional SaaS and industry solutions.

  • Scientific Story-oriented: Competing on scientific capability and the data flywheel, with validation cycles from years to decades, represented by Isomorphic Labs.

AI4S companies can be divided into two categories: scientific companies (scientific story-oriented) and scientific service companies (resource-oriented). Both paths are viable, but the greatest risk is mistaking oneself for a “scientific company” while ultimately becoming a “scientific service company.”

If you are confident in your technology and can truly unearth valuable insights, you should naturally tell a scientific story. If you still have some gaps, focus on delivery and client resources, and earnestly deepen your industry engagement.

Now is the Golden Window for AI4S

Why do I say now is the window period? Because funding is already moving. A single AI4S company can secure annual funding of up to $550 million, and a significant portion of global VC funds flowing into AI is increasingly directed towards AI4S. The U.S. Department of Energy has invested $320 million to launch the Genesis program, with China following suit.

Why is funding concentrating on AI4S? Four forces are resonating at once: technological breakthroughs, the inefficiency of traditional R&D, a data infrastructure that is still being built, and national strategic support.

Even if there are bubbles that burst in the process, this is fundamentally different from the industry boom of five or six years ago—this time, the technology has genuinely reached a critical point.

  1. Self-Driving Labs: Achieving a complete closed loop of “hypothesis → experiment → data → model update → new hypothesis,” where the more experiments conducted, the better the model becomes, forming a true flywheel. Key players include Lila Sciences, Recursion, and Atinary.

  2. National-level AI4S Infrastructure: AI4S is moving from “academic research” to “industrial infrastructure,” and it has become a core element of national competitiveness strategy.
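The closed loop described for self-driving labs, “hypothesis → experiment → data → model update → new hypothesis,” can be sketched as a toy program. Everything here is an assumption made for illustration: `run_experiment` is a hypothetical stand-in for a real measurement (a noisy yield curve that peaks at a setting of 63), `NearestNeighborModel` is a deliberately tiny surrogate model, and `propose` picks the next setting by balancing the best predicted outcome against distance from past experiments. It does not describe how any of the companies named above actually operate.

```python
import random


def run_experiment(x: float, rng: random.Random) -> float:
    """Hypothetical wet-lab stand-in: noisy yield that peaks at x = 63."""
    return -((x - 63.0) ** 2) / 100.0 + rng.gauss(0.0, 0.05)


class NearestNeighborModel:
    """Toy surrogate: predicts the outcome of the closest past experiment."""

    def __init__(self) -> None:
        self.data: list[tuple[float, float]] = []

    def update(self, x: float, y: float) -> None:
        self.data.append((x, y))

    def predict(self, x: float) -> tuple[float, float]:
        """Return (predicted outcome, distance to the nearest measured point)."""
        nearest_x, nearest_y = min(self.data, key=lambda point: abs(point[0] - x))
        return nearest_y, abs(nearest_x - x)


def propose(model: NearestNeighborModel, candidates: list[float],
            explore_weight: float) -> float:
    """New hypothesis: best predicted outcome plus a bonus for unexplored settings."""
    def score(x: float) -> float:
        predicted, distance = model.predict(x)
        return predicted + explore_weight * distance
    return max(candidates, key=score)


def closed_loop(rounds: int = 25, seed: int = 0) -> list[tuple[float, float]]:
    rng = random.Random(seed)
    candidates = [float(x) for x in range(0, 101)]
    model = NearestNeighborModel()
    history = []
    x = 0.0  # initial hypothesis
    for _ in range(rounds):
        y = run_experiment(x, rng)                          # hypothesis -> experiment -> data
        model.update(x, y)                                  # data -> model update
        history.append((x, y))
        x = propose(model, candidates, explore_weight=0.5)  # model -> new hypothesis
    return history


history = closed_loop()
best_x, best_y = max(history, key=lambda point: point[1])
print(f"best setting after {len(history)} experiments: x={best_x}, yield={best_y:.2f}")
```

Each pass through the loop adds one data point, and the next proposal is made by a model that has already absorbed it. That feedback is the flywheel: the value sits in the accumulated experimental data as much as in the model.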

Five Hard-Hitting Suggestions for Young Innovators

  1. Choosing a field is more important than selecting a technology. The real moat is domain knowledge, not model architecture; choose a scientific problem you are willing to immerse yourself in for five years.

  2. Learn to speak the language of experiments. People with purely computational backgrounds often lack a feel for how experiments actually work. Spending three months in a lab is more valuable than reading ten papers.

  3. Data capability is a core lever. The performance ceiling of a model ultimately depends on the information limit of the training data. Those who can build a data flywheel are far more valuable than those who can merely tune models; acquiring, cleaning, and labeling scientific data is hard currency.

  4. Clarify which game you are playing. Scientific story-oriented requires long-term patience, resource-oriented needs industry integration, and product-oriented focuses on rapid iteration—don’t mix them.

  5. Now is the window period. The convergence of technology, capital, and national strategy is happening, but the window won’t remain open forever.

Three Core Conclusions

Returning to the three initial questions, the answers are now very clear:

  1. AI4S will have continuous breakthroughs but will not have a unified “ChatGPT moment.”
  2. The core barrier for entrepreneurs lies in “scientific capability + closed-loop data,” not in model size.
  3. Choosing the right direction fundamentally means selecting a scientific problem you are willing to delve into for five years.

In conclusion, the window belongs to those who are willing to do the heavy lifting and who dare to place bets amid uncertainty.
