Choosing the Right Metrics: A Study of Performance Measurement for Binary Classification in Imbalanced and Big Data
DOI: https://doi.org/10.32473/flairs.38.1.139140

Abstract
There is no general consensus on which performance metrics provide more reliable and informative results than others. While studies exist that investigate and compare different metrics, they typically focus on the performance of a classifier and do not provide a clear understanding of the specific relationships between metrics, nor of their reliability in different settings (such as highly imbalanced datasets). This study examines the underlying relationships among 17 commonly used performance metrics and their suitability for datasets of varying sizes and class distribution levels, using factor analysis to uncover latent factors. We analyzed 23 publicly available datasets from diverse domains, ranging in size from 309 to over five million instances and in positive-class distribution from 0.17% to 44.87%, using two gradient boosting algorithms, LightGBM and XGBoost, and one unsupervised anomaly detection algorithm, Isolation Forest. Factor analysis grouped the metrics into distinct latent factors, yielding a framework that helps researchers select appropriate metrics and avoid redundant or misleading ones based on dataset characteristics.
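The redundancy the abstract alludes to can be illustrated on a small scale. The sketch below (not the paper's actual pipeline; the simulated confusion matrices, the choice of five metrics, and the use of pairwise Pearson correlation as a stand-in for factor analysis are all assumptions for illustration) computes several common binary-classification metrics on a highly imbalanced scenario and shows how correlated some of them are:

```python
import math
import random

def metrics(tp, fp, fn, tn):
    """Common binary-classification metrics from one confusion matrix."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "mcc": mcc}

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

random.seed(0)
# Simulate 200 classifiers evaluated on an imbalanced test set:
# 10 positives vs. 990 negatives (1% positive class).
rows = []
for _ in range(200):
    tp = random.randint(0, 10)
    fn = 10 - tp
    fp = random.randint(0, 100)
    tn = 990 - fp
    rows.append(metrics(tp, fp, fn, tn))

names = ["precision", "recall", "f1", "accuracy", "mcc"]
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r = pearson([m[names[i]] for m in rows],
                    [m[names[j]] for m in rows])
        print(f"{names[i]:9s} vs {names[j]:9s}: r = {r:+.2f}")
```

Under imbalance, accuracy tends to track the majority class (here, the false-positive count) rather than minority-class performance, while F1 and MCC move together; clusters of strongly correlated metrics are exactly what factor analysis would collapse into a single latent factor.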
License
Copyright (c) 2025 Mary Anne Walauskis, Taghi M. Khoshgoftaar

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.