Memory Hierarchy Optimization and Cache Aware Signal Processing Pipelines for Next Generation High Throughput Computing Architectures
DOI: https://doi.org/10.66472/casp.v1i1.32
Keywords: Signal Processing, Memory Hierarchies, Throughput Improvement, Pipeline Design, Cache Aware Optimizations
Abstract
This study examines how cache-aware optimizations affect signal processing pipelines in high-throughput computing systems. Data-intensive applications such as artificial intelligence (AI) and multimedia processing place growing demands on memory management, and conventional memory systems often become the bottleneck that limits their performance. The study investigates how memory hierarchy optimizations, in particular cache-line-aware optimization, dependency-aware caching, and adaptive cache replacement algorithms, can relieve these bottlenecks. Using analytical modeling and experimental benchmarking, it evaluates several memory hierarchy configurations, including processing-in-memory (PIM) and three-dimensional integrated circuits (3D ICs), against conventional systems. The results show that cache-aware optimizations reduced memory access latency by up to 30% and improved throughput by up to 40%. Cache hit rates increased by 25%, and energy consumption fell by up to 20%. The work offers guidance for designing and implementing efficient signal processing pipelines. It also identifies open challenges, including the need for dynamic occupancy mechanisms and DAG-aware scheduling algorithms, and suggests directions for future research, such as collaborative caching approaches and further optimization of cache-adaptive algorithms. These findings provide a basis for high-performance computing systems that handle large datasets and complex tasks in real-time applications.
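As an illustration of why cache-line-aware access order affects hit rate, the following is a minimal sketch of a toy cache simulation. It is not the study's analytical model or benchmark: the fully associative LRU policy, the 64-line capacity, the 64-byte line size, the 256 x 256 buffer of 4-byte samples, and the function name `lru_hit_rate` are all assumptions chosen for the example, and the hit rates it produces are unrelated to the figures reported above.

```python
from collections import OrderedDict


def lru_hit_rate(addresses, num_lines=64, line_bytes=64):
    """Return the hit rate of a toy fully associative LRU cache.

    num_lines and line_bytes are illustrative assumptions, not values
    taken from the study.
    """
    cache = OrderedDict()  # cache-line tags, ordered oldest -> most recent
    hits = 0
    for addr in addresses:
        tag = addr // line_bytes  # all bytes in one line share a tag
        if tag in cache:
            hits += 1
            cache.move_to_end(tag)  # mark as most recently used
        else:
            cache[tag] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict least recently used line
    return hits / len(addresses)


# Hypothetical 256 x 256 buffer of 4-byte samples stored row by row.
ROWS, COLS, ELEM_BYTES = 256, 256, 4

# Row-by-row traversal: consecutive accesses fall in the same cache line.
row_order = [(r * COLS + c) * ELEM_BYTES
             for r in range(ROWS) for c in range(COLS)]

# Column-by-column traversal: consecutive accesses are 1024 bytes apart.
col_order = [(r * COLS + c) * ELEM_BYTES
             for c in range(COLS) for r in range(ROWS)]

sequential_rate = lru_hit_rate(row_order)
strided_rate = lru_hit_rate(col_order)

print(f"row-order hit rate:    {sequential_rate:.4f}")
print(f"column-order hit rate: {strided_rate:.4f}")
```

In this toy setup the row-order traversal reuses each 64-byte line for 16 consecutive samples, while the column-order traversal touches 256 distinct lines per column and evicts each one before it is reused. Reordering a pipeline stage so that it consumes data in storage order is one simple instance of the cache-line-aware optimization the abstract refers to.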


