LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers
LLM-FE is a framework for automated feature engineering on tabular data that combines the domain knowledge and reasoning of Large Language Models (LLMs) with evolutionary search. It formulates feature engineering as a program search problem in which the LLM iteratively proposes and refines feature transformations, guided by data-driven feedback. LLM-FE consistently outperforms state-of-the-art baselines and improves the performance of tabular prediction models.
Article Points:
1. LLM-FE: LLMs + evolutionary search for automated tabular feature engineering.
2. Formulates FE as program search; LLMs propose, data feedback refines transformations.
3. Leverages LLM domain knowledge and iterative data-driven feedback for feature discovery.
4. Consistently outperforms state-of-the-art baselines in tabular prediction tasks.
5. Enhances performance across diverse models: XGBoost, MLP, TabPFN.
6. Domain knowledge, evolutionary search, and feedback are crucial for LLM-FE's impact.
Problem
Traditional FE: limited search space, no domain knowledge
Existing LLM-based FE: direct prompting, no reuse of insights from prior attempts
Tabular data: challenging, vast combinatorial space
Approach
Combines LLMs' domain knowledge & reasoning
Uses evolutionary search for feature optimization
Formulates FE as a program search problem
Iterative generation & data-driven feedback
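
The propose, evaluate, refine loop outlined above can be illustrated with a minimal sketch. This is not the authors' implementation. The candidate pool, the `propose` stub that stands in for the LLM call, the toy dataset, and the absolute Pearson correlation used as the score are all assumptions made for illustration; a real system would prompt an LLM with past results and score each feature by validation performance of a prediction model.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, float]
Program = Tuple[str, Callable[[Row], float]]

# Hypothetical pool standing in for LLM proposals: (source text, callable).
CANDIDATES: List[Program] = [
    ("weight + height", lambda r: r["weight"] + r["height"]),
    ("weight * height", lambda r: r["weight"] * r["height"]),
    ("weight / height", lambda r: r["weight"] / r["height"]),
    ("weight / height**2", lambda r: r["weight"] / r["height"] ** 2),
]


def make_toy_data(n: int = 50) -> Tuple[List[Row], List[float]]:
    """Deterministic toy table whose target equals weight / height**2."""
    rows = [
        {"weight": 50.0 + (7 * i) % 50, "height": 1.5 + ((3 * i) % 10) * 0.05}
        for i in range(n)
    ]
    targets = [r["weight"] / r["height"] ** 2 for r in rows]
    return rows, targets


def abs_pearson(xs: List[float], ys: List[float]) -> float:
    """Assumed reward: absolute Pearson correlation between feature and target."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def propose(memory: Dict[str, float]) -> Optional[Program]:
    """Stand-in for the LLM call: return the first candidate not yet tried.

    A real proposer would receive the scored programs in `memory` as prompt
    context and generate a new program conditioned on them.
    """
    for source, fn in CANDIDATES:
        if source not in memory:
            return source, fn
    return None


def search(rows: List[Row], targets: List[float], max_iters: int = 10) -> Tuple[str, float]:
    """Iteratively propose a feature program, score it, and record the result."""
    memory: Dict[str, float] = {}
    for _ in range(max_iters):
        proposal = propose(memory)
        if proposal is None:
            break  # nothing new to try
        source, fn = proposal
        feature = [fn(r) for r in rows]
        memory[source] = abs_pearson(feature, targets)  # data-driven feedback
    best = max(memory, key=lambda s: memory[s])
    return best, memory[best]


if __name__ == "__main__":
    rows, targets = make_toy_data()
    print(search(rows, targets))
```

The loop keeps every scored program in `memory` so that later proposals can, in principle, build on earlier ones; the stub only uses it to avoid repeats.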
Key Components
Feature Generation: LLM creates programs
Data-Driven Evaluation: Model performance as reward
Experience Management: Multi-population memory
Structured Input Prompt: Guides LLM
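
As a rough illustration of how experience management and the structured prompt might fit together, the sketch below keeps several independent populations of scored feature programs and turns the best ones from a single population into prompt text. The class names, the top-k sampling rule, and the prompt wording are hypothetical; this summary does not specify how LLM-FE samples from its memory or lays out its prompt.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

ScoredProgram = Tuple[float, str]  # (validation score, program source text)


@dataclass
class Population:
    """One independent population of scored feature programs."""

    programs: List[ScoredProgram] = field(default_factory=list)

    def add(self, score: float, source: str) -> None:
        self.programs.append((score, source))
        # Keep the best-scoring programs first.
        self.programs.sort(key=lambda item: item[0], reverse=True)

    def top(self, k: int) -> List[ScoredProgram]:
        return self.programs[:k]


class ExperienceBuffer:
    """Multi-population memory: each population evolves separately."""

    def __init__(self, n_populations: int) -> None:
        self.populations = [Population() for _ in range(n_populations)]

    def register(self, pop_id: int, score: float, source: str) -> None:
        self.populations[pop_id].add(score, source)

    def examples(self, pop_id: int, k: int = 2) -> List[ScoredProgram]:
        # Assumed sampling rule: take the k best programs of one population.
        return self.populations[pop_id].top(k)


def build_prompt(task: str, columns: List[str], examples: List[ScoredProgram]) -> str:
    """Assemble a structured prompt from the task, the schema, and past programs."""
    lines = [
        f"Task: {task}",
        "Columns: " + ", ".join(columns),
        "Previously evaluated feature programs (best first):",
    ]
    for score, source in examples:
        lines.append(f"  score={score:.3f}: {source}")
    lines.append("Propose an improved feature program.")
    return "\n".join(lines)


if __name__ == "__main__":
    buffer = ExperienceBuffer(n_populations=2)
    buffer.register(0, 0.70, "df['a'] + df['b']")
    buffer.register(0, 0.90, "df['a'] / df['b']")
    print(build_prompt("predict y", ["a", "b"], buffer.examples(0)))
```

Keeping populations separate is one common way to preserve diversity in evolutionary search, since a strong program in one population does not immediately crowd out alternatives in the others.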
Performance
Outperforms SOTA baselines consistently
Enhances XGBoost, MLP, TabPFN models
Effective on classification & regression tasks
Robust to noise, computationally efficient
Impact
Reduces manual effort, improves predictive power
Generates interpretable, contextually relevant features
Generalizable across models & LLM backbones
Future Directions