A Comparative Study of Software Testing Techniques and Quality Metrics for Predicting Failure Rates in Scalable Cloud Native Software Systems
Keywords:
Chaos Testing, Cloud-Native Systems, Failure Prediction, Quality Metrics, Testing Techniques
Abstract
Cloud-native systems are central to modern software development, offering enhanced scalability, flexibility, and resilience through cloud computing environments. However, their dynamic and distributed nature makes reliability and performance difficult to assure. Traditional testing methods such as unit and integration testing, while valuable for detecting defects in individual components and their interactions, are insufficient for predicting failure rates in complex cloud-native applications. This study examines the effectiveness of various testing techniques and quality metrics in predicting failure rates within scalable cloud-native systems. A comparative experimental study was conducted using three primary testing techniques: unit testing, integration testing, and chaos testing. The results indicate that chaos testing, when combined with advanced quality metrics such as migration rate and mismigration rate, significantly outperforms traditional methods in predicting failure rates and evaluating system resilience. Chaos testing offers a more comprehensive evaluation by simulating real-world disruptions to exercise system behavior under stress, which is essential in cloud-native environments where high availability and fault tolerance are critical. The findings also underscore the value of integrating predictive quality metrics, which improve the accuracy of failure predictions and enhance system reliability. The study concludes that, for cloud-native systems, combining advanced testing techniques with predictive metrics is essential to ensuring high availability, scalability, and reliability in dynamic environments. Future research should focus on refining predictive testing approaches, developing standardized frameworks, and empirically validating new testing methods to address the growing complexity of cloud-native systems.
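To make the chaos-testing idea concrete, the sketch below is a minimal illustration (not the study's actual harness): faults are injected at random into a simulated service call, and the residual failure rate after retries is measured empirically. The names `flaky_service`, `chaos_trial`, and `observed_failure_rate` are hypothetical, introduced only for this example.

```python
import random

def flaky_service(fail_prob, rng):
    """Simulated service call that fails with probability fail_prob (injected fault)."""
    if rng.random() < fail_prob:
        raise RuntimeError("injected fault")
    return "ok"

def chaos_trial(fail_prob, max_retries, rng):
    """One chaos experiment: call the service, retrying on injected faults.
    Returns True if the request ultimately succeeds."""
    for _ in range(max_retries + 1):
        try:
            flaky_service(fail_prob, rng)
            return True
        except RuntimeError:
            continue
    return False

def observed_failure_rate(fail_prob, max_retries, trials=10_000, seed=42):
    """Fraction of requests that still fail after retries, under injected chaos."""
    rng = random.Random(seed)
    failures = sum(not chaos_trial(fail_prob, max_retries, rng)
                   for _ in range(trials))
    return failures / trials
```

With a 30% injected fault rate and two retries, the residual failure rate should land near 0.3³ ≈ 2.7%, illustrating how chaos experiments yield empirical failure-rate estimates that static unit or integration tests cannot provide.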
License
Copyright (c) 2025 Software Engineering in Computing Systems

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.