Issue | ITM Web Conf., Volume 67, 2024: The 19th IMT-GT International Conference on Mathematics, Statistics and Their Applications (ICMSA 2024) | |
---|---|---|
Article Number | 01012 | |
Number of page(s) | 12 | |
Section | Mathematics, Statistics and Their Applications | |
DOI | https://doi.org/10.1051/itmconf/20246701012 | |
Published online | 21 August 2024 |
Multi-Class Imbalance Classification of Diabetes Cases Using Light Gradient Boosting Machine
1 Department of Mathematics, Universitas Gadjah Mada, Sleman, Yogyakarta, Indonesia
2 Department of Statistics, Universitas Muhammadiyah Semarang, Semarang, Central Java, Indonesia
* Corresponding author: dedirosadi@ugm.ac.id
Diabetes is the third leading cause of death in Indonesia. It is often called a silent killer because it progresses slowly and triggers chronic complications in the sufferer. Early detection is therefore important, both to reduce the risk of more serious health problems and to reduce the country's socio-economic losses from diabetes management. Machine learning classification offers an alternative approach to early detection by predicting category labels from observed data. This study classifies diabetes using the Light Gradient Boosting Machine (LGBM) combined with the Synthetic Minority Over-sampling Technique for Nominal and Continuous features (SMOTENC). SMOTENC oversampling handles the class imbalance in the dataset, while LGBM performs the multi-class classification. The results show that applying SMOTENC yields a more balanced class distribution, and that LGBM trained on the balanced data achieves high model performance: based on the confusion matrix, the accuracy is 90%.
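As an illustration of the oversampling idea named in the abstract, the following is a minimal sketch of how a SMOTENC-style synthetic sample could be generated for a minority class. It is not the authors' implementation: the function name `smotenc_sample`, the toy rows, the column layout, and the fixed mismatch penalty of 1.0 are all assumptions made for the example. Continuous features are interpolated between a minority row and one of its nearest minority neighbours, and each nominal feature takes the most frequent value among those neighbours.

```python
import random
from collections import Counter


def smotenc_sample(minority, cat_idx, k=3, rng=None):
    """Create one synthetic row from a list of minority-class rows.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random(0)
    cat_idx = set(cat_idx)
    num_idx = [j for j in range(len(minority[0])) if j not in cat_idx]

    base = rng.choice(minority)

    def dist(a, b):
        # Squared difference on continuous features plus a penalty for each
        # mismatching nominal feature. The original method derives this
        # penalty from the spread of the continuous features; 1.0 is a
        # placeholder assumed here.
        d = sum((a[j] - b[j]) ** 2 for j in num_idx)
        d += sum(1.0 for j in cat_idx if a[j] != b[j])
        return d

    # k nearest minority neighbours of the chosen base row
    others = [r for r in minority if r is not base]
    neighbours = sorted(others, key=lambda r: dist(base, r))[:k]

    mate = rng.choice(neighbours)
    gap = rng.random()  # interpolation weight in [0, 1)

    new = list(base)
    for j in num_idx:
        # Continuous: a point on the segment between base and mate
        new[j] = base[j] + gap * (mate[j] - base[j])
    for j in cat_idx:
        # Nominal: majority value among the nearest neighbours
        new[j] = Counter(r[j] for r in neighbours).most_common(1)[0][0]
    return new


# Toy minority-class rows (made up): two continuous columns, one nominal
rows = [
    (1.0, 2.0, "yes"),
    (2.0, 3.0, "yes"),
    (3.0, 4.0, "no"),
    (1.5, 2.5, "yes"),
]
synthetic = smotenc_sample(rows, cat_idx=[2], k=2)
print(synthetic)
```

In practice one would more likely rely on an existing implementation, such as the `SMOTENC` class in imbalanced-learn, and pass the resampled training data to a gradient boosting classifier such as `LGBMClassifier` from LightGBM; the sketch above only shows the sampling step in isolation.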
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.