Karol Capała, Paulina Tworek, Jose Sousa

As machine learning (ML) becomes an increasingly trusted tool for forecasting critical phenomena—especially in areas like healthcare—it is essential that the systems behind these predictions provide not only accuracy, but also interpretability, stability, and contextual relevance. Traditional ML models often depend on complete, well-structured datasets and assume that training and testing data follow similar distributions. In contrast, human reasoning thrives even when working with incomplete or noisy information by leveraging abstract representations and generalizations.
This study explores how machine learning can emulate that human ability. It compares classical ML approaches with a novel methodology that incorporates data abstraction to enhance feature relevance and robustness. The experimental results highlight that this abstraction-driven, descriptive ML technique preserves classification performance and offers more stable feature selection—even as datasets become increasingly sparse or inconsistent. These findings support the development of machine learning systems that can make autonomous decisions effectively, even in real-world conditions where data is imperfect or limited.
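The summary above does not detail the authors' abstraction mechanism, but the notion of "stable feature selection" it highlights can be made concrete. The sketch below is a hedged, illustrative example only (not the paper's method): it uses a simple correlation-filter selector on synthetic data and measures stability as the Jaccard overlap between the features selected on the full data and on bootstrap resamples. All names and the selection criterion are assumptions introduced for illustration.

```python
import random

def top_k_features(X, y, k):
    """Rank features by absolute Pearson correlation with the label
    (a simple filter-style selector, used here purely for illustration)."""
    def corr(col):
        n = len(col)
        mx = sum(col) / n
        my = sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
        vx = sum((a - mx) ** 2 for a in col) ** 0.5
        vy = sum((b - my) ** 2 for b in y) ** 0.5
        return abs(cov / (vx * vy)) if vx and vy else 0.0
    scores = [corr([row[j] for row in X]) for j in range(len(X[0]))]
    return set(sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k])

def jaccard(a, b):
    return len(a & b) / len(a | b)

random.seed(0)
# Synthetic data: features 0 and 1 drive the label; the rest are noise.
n, d = 200, 10
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [1 if row[0] + row[1] > 0 else 0 for row in X]

# Stability = mean Jaccard overlap between the feature set selected on the
# full data and the sets selected on bootstrap resamples.
base = top_k_features(X, y, k=2)
overlaps = []
for _ in range(20):
    idx = [random.randrange(n) for _ in range(n)]
    Xb = [X[i] for i in idx]
    yb = [y[i] for i in idx]
    overlaps.append(jaccard(base, top_k_features(Xb, yb, k=2)))
stability = sum(overlaps) / len(overlaps)
print("selection stability:", round(stability, 2))
```

A selector whose chosen feature set barely changes across resamples (stability near 1.0) is what the abstract means by "stable feature selection"; the paper's claim is that its abstraction-driven approach retains such stability even as data become sparse or inconsistent.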


DOI: 10.1109/TKDE.2025.3580671
