Karol Capała, Paulina Tworek, Jose Sousa
In many fields—including healthcare and biomedical sciences—machine learning is increasingly used to support critical decision-making. But how reliable are these models when data is scarce or incomplete?
The authors investigate this issue by examining the stability of predictive features in machine learning models trained on limited datasets. Their study compares conventional ML approaches with a previously introduced method that leverages data abstractions to enhance learning under imperfect conditions.
The results highlight that the abstraction-based approach not only maintains strong classification performance but also ensures greater consistency in feature selection—even as data availability decreases. This work demonstrates that machine learning systems can be designed to remain interpretable and robust, even in the face of data scarcity, bringing us closer to safe and autonomous AI-based decision-making in complex domains.
DOI: 10.1109/TKDE.2025.3580671
Keywords: feature stability, classification, data abstractions, limited data, explainability, machine learning, predictions