A Hybrid Data Structure and Algorithmic Approach for Efficient Memory Management and Query Processing in High Performance Software Systems

Zulfikar Zulfikar; Febri Adi Prasetya; Marsiska Ariesta Putri

doi:10.66472/paf.v1i1.20

Authors

Zulfikar Zulfikar, Politeknik Kampar
Febri Adi Prasetya, Universitas Sains dan Teknologi Komputer
Marsiska Ariesta Putri, Institut Teknologi dan Bisnis Semarang

Keywords:

Hybrid Data Structure, Memory Efficiency, Query Performance, High-Performance Computing, Data Management

Abstract

In high-performance computing (HPC) environments, system performance depends on balancing memory efficiency against query speed. Traditional data structures such as B-trees and hash tables often favor one of these at the expense of the other, which leads to suboptimal results in memory-constrained systems. This paper proposes a hybrid data structure that combines the strengths of several traditional structures to optimize both memory usage and query processing speed. The hybrid structure integrates cache-conscious algorithms, dynamic memory allocation, and compression of intermediate query results. It is evaluated through extensive benchmarks against standard B-trees and hash tables under various workloads. Results show that the hybrid structure reduces memory overhead by up to 30% while processing queries up to 1.5 times faster than conventional methods. It also performs robustly across query types, including both point and range queries. The findings indicate that the hybrid approach is a promising solution for HPC systems in which both memory efficiency and query speed are essential. Future research can extend the structure to distributed systems and emerging technologies to improve its scalability and adaptability to new computational paradigms.
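To make the general idea of a hybrid structure concrete, the following is a minimal sketch and not the structure proposed in the paper, whose design is not specified in the abstract. It assumes a two-tier layout: a small hash buffer for recent writes in front of compact, contiguous sorted arrays that serve range scans. The class name `HybridIndex`, the `merge_threshold` parameter, and the restriction to 64-bit integer keys and values are all hypothetical choices made for illustration.

```python
from array import array
from bisect import bisect_left, bisect_right


class HybridIndex:
    """Toy two-tier index (illustrative assumption, not the paper's design)."""

    def __init__(self, merge_threshold=4):
        self.merge_threshold = merge_threshold  # hypothetical tuning knob
        self.buffer = {}          # hash tier: constant-time access to recent writes
        self.keys = array("q")    # sorted tier: contiguous 64-bit integer keys
        self.values = array("q")  # values stored in parallel with the keys

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.merge_threshold:
            self._merge()

    def _merge(self):
        # Fold the buffer into the sorted tier; buffered entries win on key clashes.
        merged = dict(zip(self.keys, self.values))
        merged.update(self.buffer)
        ordered = sorted(merged.items())
        self.keys = array("q", (k for k, _ in ordered))
        self.values = array("q", (v for _, v in ordered))
        self.buffer.clear()

    def get(self, key, default=None):
        # Point query: check the hash tier first, then binary-search the sorted tier.
        if key in self.buffer:
            return self.buffer[key]
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return default

    def range(self, lo, hi):
        """Return sorted (key, value) pairs with lo <= key <= hi."""
        start = bisect_left(self.keys, lo)
        stop = bisect_right(self.keys, hi)
        result = dict(zip(self.keys[start:stop], self.values[start:stop]))
        # Overlay matching buffered entries so the newest value is returned.
        result.update({k: v for k, v in self.buffer.items() if lo <= k <= hi})
        return sorted(result.items())


idx = HybridIndex(merge_threshold=3)
for k in (5, 1, 9, 3):
    idx.put(k, k * 10)
print(idx.get(9), idx.range(2, 9))
```

In this sketch the sorted tier trades slower bulk merges for a dense memory layout and cheap range scans, while the hash buffer keeps recent point lookups fast. Whether the paper makes the same trade-off cannot be determined from the abstract alone.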
