Software Fault Prediction Using Cross Project Analysis: A Study on Class Imbalance and Model Generalization
DOI:
https://doi.org/10.46647/rdems0205006
Keywords:
Software Fault Prediction, Cross Project Analysis, Class Imbalance, Machine Learning, Model Generalization, Software Quality, Transfer Learning.
Abstract
Software fault prediction plays a vital role in improving software quality by identifying defective modules early in the development lifecycle. Traditional models often rely on within-project data, limiting their generalization capability across different projects. This study focuses on cross-project fault prediction, addressing two major challenges: class imbalance and model generalization. Class imbalance, where faulty modules are significantly fewer than non-faulty ones, negatively impacts prediction performance. The proposed approach integrates data balancing techniques and machine learning models to enhance prediction accuracy across diverse datasets. Furthermore, transfer learning and normalization strategies are utilized to improve model generalization. The framework demonstrates improved predictive performance compared to traditional approaches. This research contributes to scalable, cost-effective, and reliable fault prediction systems applicable across heterogeneous software environments.
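The abstract does not name the specific balancing or normalization techniques used, so the following is only a minimal sketch of the general idea, assuming z-score normalization computed per project and random oversampling of the faulty (minority) class in the source project. The function names, the two metric columns (lines of code, cyclomatic complexity), and the toy values are all hypothetical illustrations, not the study's data or method.

```python
import random
from statistics import mean, pstdev


def zscore_per_project(rows):
    """Standardize each metric column using this project's own mean and std.

    Normalizing per project is one common way to reduce distribution
    differences between source and target projects (an assumption here).
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are balanced.

    Assumes binary labels (0 = non-faulty, 1 = faulty) and at least one
    example of each class.
    """
    rng = random.Random(seed)
    by_label = {0: [], 1: []}
    for row, label in zip(X, y):
        by_label[label].append(row)
    minority = min(by_label, key=lambda k: len(by_label[k]))
    majority = 1 - minority
    deficit = len(by_label[majority]) - len(by_label[minority])
    extra = [rng.choice(by_label[minority]) for _ in range(deficit)]
    return X + extra, y + [minority] * deficit


# Hypothetical source-project data: [lines of code, cyclomatic complexity]
source_X = [[120, 4], [300, 9], [80, 2], [150, 5], [900, 30], [60, 1]]
source_y = [0, 0, 0, 0, 1, 0]  # one faulty module out of six

X_norm = zscore_per_project(source_X)
X_bal, y_bal = random_oversample(X_norm, source_y)
```

In a cross-project setting, the balanced, normalized source data would be used to train a classifier, and the target project's metrics would be normalized with the same per-project procedure before prediction. Only the training data is oversampled; the target project is left at its natural class distribution so that evaluation reflects real conditions.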