AI and the New Scientific Method
The scientific method has worked for about four centuries. Observe something, come up with an idea about why, test the idea, see what happens. Rinse and repeat. The whole thing has run on one assumption: that human brains were the only machines capable of generating hypotheses and designing experiments.
That assumption is falling apart.
The old loop
The classic cycle goes like this:
Observation → Hypothesis → Experiment → Data → Theory
Every step required a human being sitting there and thinking. Searching the idea space was slow and expensive - constrained by what one person, or a small group of people, could hold in their heads at once. The scientific literature grew enormous but was basically unsearchable at scale. Publication cycles dragged on for months.
The entire structure was built for a world where the hard part was thinking. And for a long time, it was.
What’s actually changing
Here is what’s different now. AI systems can read and synthesize the entire literature in a domain. Not skim it. Read it. They can generate candidate hypotheses by spotting patterns across datasets, papers, and experimental results. They can design experiments by optimizing over parameter spaces. They can simulate outcomes before you commit to expensive real-world tests.
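To make "design experiments by optimizing over parameter spaces" concrete, here is a minimal sketch of the propose-score-select pattern in Python. Everything in it is invented for illustration - the parameter names, the made-up surrogate_yield function standing in for a learned model, and the choice to send only the top five conditions to the lab - but the shape is the point: explore thousands of conditions in software, spend lab time on a handful.

import random

def surrogate_yield(temperature_c: float, ph: float) -> float:
    """Cheap stand-in for a learned model that predicts experimental yield."""
    # A made-up response surface peaking near 37 C and pH 7.4.
    return -((temperature_c - 37.0) / 10.0) ** 2 - (ph - 7.4) ** 2

def propose_candidates(n: int) -> list:
    """Sample candidate experimental conditions from the search space."""
    return [
        {"temperature_c": random.uniform(20.0, 60.0), "ph": random.uniform(5.0, 9.0)}
        for _ in range(n)
    ]

# Explore 10,000 conditions in silico, then commit lab time to only five of them.
candidates = propose_candidates(10_000)
ranked = sorted(candidates, key=lambda c: surrogate_yield(**c), reverse=True)
for c in ranked[:5]:
    print(f"run wet-lab experiment at {c['temperature_c']:.1f} C, pH {c['ph']:.2f}")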
None of this replaces human science. What it does is change where the bottleneck sits.
The most striking example so far is AlphaFold. DeepMind’s model predicted the three-dimensional structure of virtually every known protein - about 200 million of them - within roughly a year of the public database’s launch [1]. Before that, the standard approach was experimental structure determination, chiefly X-ray crystallography, which took months to years per protein and had been the central bottleneck in structural biology for decades. AlphaFold didn’t just speed things up. It changed which questions are worth asking in the first place.
In materials science, Google DeepMind’s GNoME system predicted 2.2 million new crystal structures in a single study, roughly 380,000 of which are expected to be stable [2]. That is nearly an order of magnitude more stable materials than had been identified in the entire previous history of the field. The limiting factor now is the lab, not the search.
And then there are systems that attempt to automate the entire research cycle. Sakana AI built something called “The AI Scientist” that generates research ideas, writes code, runs experiments, and produces draft papers - complete with self-critiques [3]. The papers aren’t publishable yet. But the trajectory is pretty clear.
The new loop
What’s emerging looks more like this:
Data → Model → Hypothesis generation → Targeted experiment → New data → Refined model
Humans don’t vanish from this loop. But the job changes. Instead of being the person who generates the hypotheses, you become the person who directs the search:
Which problems are actually worth solving? What constraints are non-negotiable? What does a “good” result look like? When is something a real signal and when is it noise?
These are judgment calls. You can’t automate them because they require taste, values, and domain understanding that the models don’t have.
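Put together, the loop above can be sketched as a small program. This is a schematic, not anyone’s production pipeline: every function below is a placeholder for a real component (a trained model, a robotic lab, a domain expert), and the names are mine. What matters is the structure - data feeds a model, the model proposes, a human selects, experiments feed back.

def train_model(data):
    """Fit or refine a model on everything observed so far (placeholder)."""
    return {"n_observations": len(data)}

def generate_hypotheses(model, n=100):
    """Have the model propose candidate hypotheses (placeholder)."""
    return [f"hypothesis-{i}" for i in range(n)]

def human_filter(hypotheses):
    """Human judgment: which candidates deserve real-world investment?
    Here we simply keep the first three; in practice this is the hard part."""
    return hypotheses[:3]

def run_experiment(hypothesis):
    """Targeted wet-lab or physical experiment (placeholder result)."""
    return {"hypothesis": hypothesis, "supported": hash(hypothesis) % 2 == 0}

data = [{"hypothesis": "seed observation", "supported": True}]
for round_number in range(3):            # each round refines the model
    model = train_model(data)
    candidates = generate_hypotheses(model)
    selected = human_filter(candidates)  # the judgment call, not automated
    results = [run_experiment(h) for h in selected]
    data.extend(results)                 # new data feeds the next round
    print(f"round {round_number}: model trained on {model['n_observations']} observations")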
What follows from this
If this new loop works - and the early evidence suggests it does - a few things happen.
Discovery cycles compress. The AlphaFold and GNoME breakthroughs happened in the last five years. Drug discovery timelines, historically a decade or more, are already shortening as AI-generated candidates reach clinical trials. The time cost of the cognitive part of research is dropping fast.
The bottleneck moves downstream. The hard part stops being “come up with an idea” and starts being “actually test it in reality.” Wet labs, physical prototypes, regulatory approval - these are the constraints that don’t compress easily. The cost of exploring an idea space has plummeted. The cost of validating one has not.
Small teams get access to research capabilities that used to require large institutions. You still need the equipment, but the intellectual search cost drops dramatically.
And new failure modes show up. Model-generated hypotheses can be systematically biased by training data. “Hallucinated” research directions are a real thing. If an AI confidently explores the wrong corner of a search space, it can do so very efficiently. Human judgment in filtering becomes more important, not less.
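A back-of-envelope calculation shows why that filtering step carries so much weight. The numbers below are made up - round figures chosen only to illustrate the asymmetry between cheap exploration and expensive validation, not estimates from any real program.

candidates_generated = 100_000     # hypotheses a model can propose cheaply (assumed)
cost_to_generate_each = 0.10       # dollars of compute per candidate (assumed)
cost_to_validate_each = 50_000.0   # dollars per wet-lab validation (assumed)
validation_budget = 2_000_000.0    # dollars available for real experiments (assumed)

generation_cost = candidates_generated * cost_to_generate_each
validations_affordable = int(validation_budget // cost_to_validate_each)
fraction_tested = validations_affordable / candidates_generated

print(f"generating every candidate costs about ${generation_cost:,.0f}")
print(f"the budget covers {validations_affordable} real experiments")
print(f"fraction of candidates that ever touch reality: {fraction_tested:.2%}")
# With numbers in this range, the quality of the filter that picks those few
# dozen experiments matters far more than the raw volume of hypotheses.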
What humans actually do in this world
This is the part I keep coming back to.
In this new loop, humans become something like directors of intelligence. You set the goals, evaluate the outputs, and decide what deserves real-world investment. This is not a lesser role. It is arguably harder than hypothesis generation because it requires deep domain knowledge to evaluate what the AI gives you, interdisciplinary breadth to spot connections, judgment under uncertainty to decide what to pursue, and ethical reasoning to decide what should be pursued.
The scientists who thrive in this world will not be the ones who generate the most hypotheses on their own. They will be the ones who ask the best questions and evaluate the answers most clearly.
What this means
If scientific discovery gets even 2-3x faster, the downstream effects are enormous. Faster cures, better materials, cleaner energy, more effective economic models. But it also means the gap between institutions that adopt AI-augmented research and those that don’t will widen fast. Countries, companies, and universities that figure this out will compound their advantages.
The scientific method isn’t dying. It’s getting an upgrade. And the people who understand the upgrade will direct the next era of discovery.
References
[1] Jumper, J. et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596, 583-589. https://doi.org/10.1038/s41586-021-03819-2
[2] Merchant, A. et al. (2023). Scaling deep learning for materials discovery. Nature, 624, 80-85. https://doi.org/10.1038/s41586-023-06735-9
[3] Lu, C. et al. (2024). The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv:2408.06292. https://arxiv.org/abs/2408.06292