The World's First "AI Scientist" Emerges
The first "AI Scientist" has emerged, generating 10 complete academic papers in one go.
From proposing research ideas, checking novelty, designing experiments, and writing code, to running experiments on GPUs, collecting results, and finally writing up the paper: every step is handled automatically by this "AI Scientist".
The cost of each paper is about $15 (approximately 107.62 yuan).
This is the first comprehensive AI system for automated scientific research and open-ended discovery: The AI Scientist.
It comes from Sakana AI, a startup co-founded by Llion Jones, one of the authors of the Transformer paper.
Moreover, the company not only created an AI scientist, but also developed an AI reviewer.
The AI reviewer can evaluate papers written by AI and provide improvement suggestions.
Both the AI scientist and AI reviewer have been open-sourced by Sakana AI.
AI Independently Completes Ten Machine Learning Papers
For decades, after each major AI breakthrough, researchers have joked: "It's time to research how to get AI to write papers for us."
Now, this idea has finally become a reality.
Specifically, the AI Scientist generated ten papers; one high-scoring paper from each research direction is introduced below.
The first paper on diffusion models: "Dual-Scale Diffusion: Adaptive Feature Balancing for Low-Dimensional Generative Models"
It proposes an adaptive dual-scale denoising method to address the difficulty existing diffusion models have in capturing both global structure and local detail in low-dimensional spaces.
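The general idea of adaptively balancing a coarse (global-structure) branch against a fine (local-detail) branch can be sketched in a few lines. Everything below is an illustrative assumption, not the paper's actual architecture: the branch functions, the scalar sigmoid gate, and the 2D toy input are stand-ins.

```python
import math

def sigmoid(x):
    """Squash a scalar into (0, 1) so it can act as a mixing weight."""
    return 1.0 / (1.0 + math.exp(-x))

def dual_scale_denoise(x, global_branch, local_branch, gate_param):
    """Blend a coarse (global-structure) estimate and a fine (local-detail)
    estimate of the denoised sample with a learnable scalar gate.
    gate_param is a stand-in for a learned parameter (which could also
    depend on the diffusion timestep)."""
    w = sigmoid(gate_param)              # adaptive balance in (0, 1)
    g = global_branch(x)                 # captures global structure
    l = local_branch(x)                  # captures local detail
    return [w * gi + (1.0 - w) * li for gi, li in zip(g, l)]

# Toy branches on a 2D point (a low-dimensional setting, as in the title)
coarse = lambda x: [0.5 * v for v in x]   # smooths toward the origin
fine   = lambda x: [v + 0.01 for v in x]  # small local correction

print(dual_scale_denoise([1.0, -2.0], coarse, fine, 0.0))
```

With the gate at 0 the two branches are weighted equally; training the gate would let the model shift that balance adaptively.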
The second paper on language models: "StyleFusion: Adaptive Multi-Style Generation in Character-Level Language Models"
This paper proposes a new method called Multi-Style Adapter, which enhances style awareness and consistency in character-level language models by introducing learnable style embeddings and style classification heads.
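The two ingredients named in the summary, learnable style embeddings and a style classification head, can be sketched as follows. The shapes, the additive injection, and the dot-product classifier are all assumptions for illustration, not the paper's actual design.

```python
def apply_style_adapter(hidden, style_embeddings, style_id):
    """Illustrative sketch: inject a learnable style embedding into a
    character-level model's hidden state by simple addition."""
    style_vec = style_embeddings[style_id]
    return [h + s for h, s in zip(hidden, style_vec)]

def style_classifier_logits(hidden, style_embeddings):
    """A toy 'style classification head': score each style by dot product
    with the hidden state, giving an auxiliary style-consistency signal."""
    return [sum(h * s for h, s in zip(hidden, emb)) for emb in style_embeddings]

# Two learnable style embeddings (values are arbitrary placeholders)
styles = [[0.1, 0.0, -0.1], [0.0, 0.2, 0.2]]
h = [1.0, 1.0, 1.0]

styled = apply_style_adapter(h, styles, style_id=1)
logits = style_classifier_logits(styled, styles)
print(styled, logits)
```

After injecting style 1, the classifier scores style 1 highest, which is the kind of consistency signal an auxiliary style-classification loss could reinforce during training.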
The third paper combining Transformers and reinforcement learning: "Adaptive Learning Rate for Transformers via Q-Learning"
This study explores applying reinforcement learning to dynamically adjust the learning rate in transformer model training.
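A minimal version of the idea, choosing the learning rate each step with a value-based update, might look like the sketch below. The setup is assumed for illustration (a one-state, bandit-style simplification of Q-learning, three discrete learning rates, reward equal to the loss decrease, and a toy quadratic objective); the paper's exact formulation may differ.

```python
import random

def train_with_q_learned_lr(steps=200, seed=0):
    """Tabular, one-state Q-learning sketch that picks a learning rate
    per step while minimizing the toy objective f(w) = w^2."""
    rng = random.Random(seed)
    lrs = [0.01, 0.1, 0.5]            # discrete learning-rate actions
    q = [0.0, 0.0, 0.0]               # single-state Q-table
    w, alpha, eps = 5.0, 0.3, 0.2     # parameter, Q step size, exploration
    for _ in range(steps):
        if rng.random() < eps:                         # epsilon-greedy
            a = rng.randrange(len(lrs))
        else:
            a = max(range(len(lrs)), key=q.__getitem__)
        loss_before = w * w
        w -= lrs[a] * 2.0 * w         # gradient step: f'(w) = 2w
        reward = loss_before - w * w  # reward = how much the loss dropped
        q[a] += alpha * (reward - q[a])
    return w, q

w, q = train_with_q_learned_lr()
print(round(w, 6), [round(v, 3) for v in q])
```

The controller quickly learns to favor the learning rate that produces the largest loss drops, which is the dynamic-adjustment behavior the paper explores at transformer scale.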
The fourth paper on the "grokking" phenomenon first reported by OpenAI researchers: "Unlocking Grokking: A Comparative Study of Weight Initialization Strategies in Transformer Models"
This paper systematically studies the impact of weight initialization on grokking for the first time, comparing five weight initialization strategies to optimize neural network learning dynamics.
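A quick way to see why initialization matters for learning dynamics is to measure how different schemes affect the spread of a linear layer's outputs. The sketch below compares five common strategies; whether these match the paper's five is an assumption, and the experiment itself is only illustrative.

```python
import math
import random

def layer_output_std(init, fan_in=256, fan_out=256, n_samples=200, seed=0):
    """Estimate the standard deviation of a linear layer's output for
    unit-variance Gaussian inputs, under a given weight-init strategy."""
    rng = random.Random(seed)
    outs = []
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(fan_in)]
        w = [init(rng, fan_in, fan_out) for _ in range(fan_in)]
        outs.append(sum(wi * xi for wi, xi in zip(w, x)))
    mean = sum(outs) / len(outs)
    return math.sqrt(sum((o - mean) ** 2 for o in outs) / len(outs))

# Five illustrative strategies (common choices, not necessarily the paper's)
strategies = {
    "xavier":  lambda r, fi, fo: r.gauss(0, math.sqrt(2.0 / (fi + fo))),
    "he":      lambda r, fi, fo: r.gauss(0, math.sqrt(2.0 / fi)),
    "lecun":   lambda r, fi, fo: r.gauss(0, math.sqrt(1.0 / fi)),
    "small":   lambda r, fi, fo: r.gauss(0, 0.01),
    "uniform": lambda r, fi, fo: r.uniform(-0.05, 0.05),
}

for name, init in strategies.items():
    print(f"{name:8s} output std = {layer_output_std(init):.3f}")
```

Scaled schemes like LeCun init keep the output spread near 1, while a tiny fixed scale shrinks it, and it is exactly these differences in signal propagation that plausibly shape when grokking occurs.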
The code accompanying these papers (also generated by AI) is open-sourced on GitHub, emphasizing reproducibility.
How the First "AI Scientist" Was Created
The overall research idea is a continuation of several of Sakana AI's earlier results:
First, they developed a method to automatically merge knowledge from multiple large models and evolve to produce new models. In recent work, they used large models to discover new objective functions to fine-tune other models.
The team was constantly surprised by the creativity of current cutting-edge models in these projects, leading to a bigger dream: can large models be used to automate the entire research process?
The final result was completed through collaboration between Sakana AI, Oxford University's Foerster Lab, and the University of British Columbia team.
The "AI Scientist" system consists of four parts:
Idea Generation:
Given a starting template, the AI first "brainstorms" a series of novel research directions and searches Semantic Scholar to check whether these ideas have already been explored.
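The novelty check can be sketched without any network access. Semantic Scholar does expose a public paper-search API, but the sketch below stubs the retrieved titles and flags an idea as "done before" when word overlap with any title is too high; the Jaccard similarity measure and the 0.5 threshold are assumptions, not the system's actual criterion.

```python
def token_overlap(a, b):
    """Jaccard similarity between the lowercased word sets of two titles."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def looks_novel(idea_title, found_titles, threshold=0.5):
    """Declare an idea novel if no retrieved title is too similar.
    In the real system the titles would come from a literature search
    (e.g. the Semantic Scholar API); here they are stubbed."""
    return all(token_overlap(idea_title, t) < threshold for t in found_titles)

# Stubbed search results standing in for an API response
retrieved = [
    "Adaptive learning rate schedules for deep networks",
    "Q-learning for traffic signal control",
]
print(looks_novel("Adaptive learning rate via Q-learning for transformers", retrieved))
```

A production system would use embedding- or LLM-based similarity rather than raw word overlap, but the filtering logic has the same shape: generate, retrieve, compare, keep only sufficiently distinct ideas.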
Experiment Iteration:
For the ideas proposed in the first part, [...]