| Issue | ITM Web Conf., Volume 78, 2025: International Conference on Computer Science and Electronic Information Technology (CSEIT 2025) |
|---|---|
| Article Number | 03023 |
| Number of page(s) | 8 |
| Section | Intelligent Systems and Computing in Industry, Robotics, and Smart Infrastructure |
| DOI | https://doi.org/10.1051/itmconf/20257803023 |
| Published online | 08 September 2025 |
Advances and Challenges in Machine Learning-Based Cardinality Estimation for Database Query Optimization
School of Software Engineering, Sichuan University, Chengdu, China
Accurate cardinality estimation is critical for optimizing database queries, yet traditional methods often fail to provide reliable predictions in the face of complex queries, skewed data distributions, and high-dimensional schemas. As data volumes and query complexity grow, more robust and adaptive estimation techniques have become essential for maintaining efficient query performance. This paper surveys recent advances in machine learning-based cardinality estimation methods, categorizing them into three main types: query-driven, data-driven, and hybrid models. Each approach is analyzed in terms of its model architecture, training strategies, and predictive performance. Notable techniques include PostCENN’s integration into PostgreSQL, FACE’s use of normalizing flows, and UAE’s autoregressive learning across data and query distributions. Comparative evaluations highlight how these models address specific challenges in cardinality estimation. The results show that while machine learning methods significantly reduce estimation errors, they also share limitations such as sensitivity to workload drift, scalability challenges with large schemas, and poor generalization in long-tail query regions. To address these limitations, this paper proposes future directions including Bayesian updating, sparse factor graph modeling, and active query synthesis. These innovations hold promise for building more accurate, scalable, and adaptive estimators that can enhance database system performance under diverse and dynamic workloads.
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

