In short: Google researchers have developed MLE-STAR, a machine learning engineering agent that improves the process of building AI models by combining targeted web search, code refinement and adaptive ensembling. MLE-STAR has demonstrated its effectiveness by winning medals on 63% of the Kaggle competitions in the MLE-Bench-Lite benchmark, largely surpassing previous approaches.
Concretely, an MLE agent starts from a task description (for example, "predict sales from tabular data") and the datasets provided, then:
- Analyzes the problem and chooses an appropriate approach;
- Generates code (often in Python, with common or specialized ML libraries);
- Tests, evaluates and refines the solution, sometimes over several iterations.
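To make this loop concrete, here is a minimal sketch of how such an agent could chain generation, execution and refinement. The helpers llm_generate, run_and_score and build_feedback are hypothetical placeholders, not part of any published API.

```python
# Minimal sketch of the generate / evaluate / refine loop described above.
# llm_generate, run_and_score and build_feedback are hypothetical placeholders.

def solve_task(task_description: str, max_iterations: int = 5) -> str:
    """Iteratively generate, run and refine a candidate solution script."""
    solution = llm_generate(f"Write a Python ML script for: {task_description}")
    best_solution, best_score = solution, float("-inf")

    for _ in range(max_iterations):
        score, logs = run_and_score(solution)   # execute the script, get a validation metric
        if score > best_score:
            best_solution, best_score = solution, score
        # Feed the current code, its score and execution logs back to the model
        solution = llm_generate(build_feedback(solution, score, logs))

    return best_solution
```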
These agents rely on two capabilities of large language models:
- Algorithmic reasoning (identifying relevant methods for a given problem);
- The generation of executable code (complete data preparation, training and evaluation scripts).
Their objective is to reduce the human workload by automating tedious steps such as feature engineering, hyperparameter tuning and model selection.
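As an illustration of the kind of script such an agent produces, here is a short end-to-end example with scikit-learn covering data preparation, a small hyperparameter search and evaluation. The file name sales.csv and the target column are assumptions made for the example.

```python
# Illustrative example of a generated end-to-end script: data preparation,
# a small model/hyperparameter search, and evaluation.
# "sales.csv" and the "target" column are assumptions for this example.
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("sales.csv")
X = pd.get_dummies(df.drop(columns=["target"]))   # basic feature engineering
y = df["target"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Small hyperparameter search standing in for the tuning the agent automates
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    cv=3,
    scoring="neg_mean_absolute_error",
)
search.fit(X_train, y_train)

preds = search.best_estimator_.predict(X_test)
print("MAE:", mean_absolute_error(y_test, preds))
```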
MLE-STAR: targeted and iterative optimization
Moreover, the exploration strategy of existing MLE agents is often based on completely rewriting the code at each iteration. This prevents them from focusing their efforts on specific components of the pipeline, for example by systematically testing different feature engineering options before moving on to other stages. To overcome these limitations, MLE-STAR combines:
- Web search to identify suitable models and build a solid initial solution;
- Granular refinement by code blocks, relying on ablation studies to identify the parts with the most impact on performance, then optimizing them iteratively;
- An adaptive ensembling strategy, capable of merging several candidate solutions into an improved version, refined over successive attempts (a simplified sketch follows this list).
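The following sketch shows the idea behind the ensembling step in a deliberately simplified form: blend the predictions of several candidate solutions and keep the weighting that scores best on held-out data. It is not the published MLE-STAR algorithm, only an illustration of merging candidates into a stronger solution.

```python
# Simplified sketch of adaptive ensembling: search for the weighting of
# candidate predictions that performs best on a validation set.
import itertools
import numpy as np
from sklearn.metrics import mean_absolute_error

def ensemble(candidate_preds: list[np.ndarray], y_valid: np.ndarray) -> np.ndarray:
    best_weights, best_score = None, float("inf")
    # Try a coarse grid of weightings over the candidates
    grid = [w for w in itertools.product(np.linspace(0, 1, 5), repeat=len(candidate_preds))
            if sum(w) > 0]
    for w in grid:
        w = np.array(w) / sum(w)                        # normalize weights to sum to 1
        blended = sum(wi * p for wi, p in zip(w, candidate_preds))
        score = mean_absolute_error(y_valid, blended)
        if score < best_score:
            best_score, best_weights = score, w
    return best_weights
```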
This iterative cycle (search, identification of the critical block, optimization, then a new iteration) allows MLE-STAR to concentrate its efforts where they produce the most measurable gains.
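A compact way to picture this cycle, with hypothetical run_ablation, propose_variants and evaluate helpers standing in for the agent's internals:

```python
# Sketch of the targeted refinement loop: run a quick ablation study to find the
# pipeline block with the largest impact, then iterate only on that block.
# run_ablation, propose_variants and evaluate are hypothetical placeholders.

def refine_solution(solution: dict, n_rounds: int = 3) -> dict:
    """solution maps block names (e.g. 'features', 'model', 'training') to code."""
    for _ in range(n_rounds):
        # 1. Ablation study: measure how much each block contributes to the score
        impact = {name: run_ablation(solution, name) for name in solution}
        target = max(impact, key=impact.get)            # 2. pick the most impactful block

        # 3. Ask the model for variants of that block only, keep any that improves the score
        for variant in propose_variants(solution[target]):
            candidate = {**solution, target: variant}
            if evaluate(candidate) > evaluate(solution):
                solution = candidate
    return solution
```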
Control modules to make solutions more reliable
Beyond its iterative approach, MLE-STAR incorporates three modules designed to strengthen the robustness of the generated solutions:
- A debugging agent to analyze execution errors (for example, a Python traceback) and propose automatic corrections;
- A data leakage checker to detect situations where information from the test data is wrongly used during training, a bias that distorts the measured performance (a classic example follows this list);
- A data usage checker to ensure that all provided data sources are used, even when they do not come in standard formats such as CSV.
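The kind of leakage the second module looks for is easy to reproduce; a classic example is fitting a preprocessing step on the full dataset before splitting it:

```python
# Example of the data leakage the checker is meant to catch (with toy data).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 5)), rng.normal(size=100)   # toy data for illustration

# Leaky: the scaler is fitted on the full dataset, so test-set statistics
# influence the training features and inflate the measured performance
X_scaled = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, random_state=0)

# Correct: fit the scaler on the training split only, then apply it to the test split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
```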
These modules address common problems observed in code generated by LLMs.
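To make the debugging module concrete, here is a minimal sketch of an execute-and-repair loop, with a hypothetical llm_fix helper standing in for the correction step:

```python
# Minimal sketch of a debugging loop: run the generated script, and if it fails,
# feed the traceback back to the model and retry with the proposed correction.
# llm_fix is a hypothetical helper, not part of any published API.
import subprocess

def run_with_auto_debug(script_path: str, max_attempts: int = 3) -> bool:
    for _ in range(max_attempts):
        result = subprocess.run(
            ["python", script_path], capture_output=True, text=True
        )
        if result.returncode == 0:
            return True                                 # script ran successfully
        # stderr contains the Python traceback the debugging agent analyzes
        fixed_code = llm_fix(open(script_path).read(), result.stderr)
        with open(script_path, "w") as f:
            f.write(fixed_code)
    return False
```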