Abstract: "The explosion of applications in data-parallel systems and the ever-growing demand for efficient task processing and data analysis have put parallel systems under enormous memory pressure when dealing with large datasets. Out-of-memory errors and excessive garbage collection can seriously degrade system performance. For data-flow tasks with intensive in-memory computing requirements, an efficient memory caching algorithm is the primary means of trading off performance against memory overhead. Building on existing research on DAG-based task scheduling, we design a new caching algorithm for in-memory computing that exploits the critical-path information of the DAG, called Non-critical path Least reference Count (NLC). The strategy is distinct from existing ones in that it applies the global information of the critical path to cache replacement rather than to task scheduling, as most existing works do. Through empirical studies, we demonstrate that NLC not only effectively enhances parallel execution efficiency, but also reduces the number of evictions and improves both the hit ratio and memory utilization. Our comprehensive evaluations on selected benchmark graphs indicate that our strategy can not only fulfil the parallel system requirements but also reduce costs by as much as 19% compared with the state-of-the-art LRC algorithm."
Authors: JingYa Lv (Shenzhen Institutes of Advanced Technology, China); Yang Wang (Shenzhen Institutes of Advanced Technology, China)
Email: firstname.lastname@example.org, email@example.com
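To make the idea concrete, here is a minimal sketch of an NLC-style eviction policy. This is a hypothetical reconstruction based only on the abstract, not the authors' code: the `Block`, `NLCCache`, and `on_critical_path` names are assumptions. Each cached block tracks how many downstream tasks still reference it (as in LRC) plus whether any of those tasks lie on the DAG's critical path; eviction prefers non-critical blocks with the least reference count.

```python
from dataclasses import dataclass

@dataclass
class Block:
    name: str
    ref_count: int          # pending downstream references (LRC-style)
    on_critical_path: bool  # is the block still needed by a critical-path task?

class NLCCache:
    """Sketch of a Non-critical path Least reference Count cache (assumed design)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks: dict[str, Block] = {}

    def _victim(self) -> str:
        # Non-critical blocks (False < True) come first, then least ref count.
        return min(self.blocks.values(),
                   key=lambda b: (b.on_critical_path, b.ref_count)).name

    def put(self, block: Block) -> list[str]:
        """Insert a block, evicting victims while over capacity."""
        evicted = []
        while len(self.blocks) >= self.capacity:
            v = self._victim()
            evicted.append(v)
            del self.blocks[v]
        self.blocks[block.name] = block
        return evicted

    def reference(self, name: str) -> None:
        """A task consumed the block: decrement its pending reference count."""
        if name in self.blocks and self.blocks[name].ref_count > 0:
            self.blocks[name].ref_count -= 1

cache = NLCCache(capacity=2)
cache.put(Block("a", ref_count=3, on_critical_path=True))
cache.put(Block("b", ref_count=1, on_critical_path=False))
victims = cache.put(Block("c", ref_count=2, on_critical_path=False))
# "b" is evicted first: it is off the critical path and least-referenced.
```

Under this (assumed) policy, a plain LRC cache would also evict "b" here, but the two diverge when a critical-path block happens to have the lowest reference count: NLC would protect it and evict a non-critical block instead.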