Abstract:
Data-driven applications in high-performance computing (HPC) systems involve a large number of metadata operations, irregular access patterns, and small I/O requests. Traditional parallel file systems cannot handle such workloads efficiently, so data-driven applications suffer from high I/O latency, lower throughput, and longer wait times. User-level file systems improve the overall performance of HPC clusters, and their deployment overhead is low compared with the application runtime. However, using a user-level file system as part of a job or workflow is not without cost, because a newly deployed file system starts out as an empty environment. The input data must therefore be copied from the parallel file system to the user-level file system before the computing job can start, and the output data must be copied back from the user-level file system to the parallel file system once the job has finished. These operations are referred to as stage-in and stage-out. In addition, both HPC platforms and application workloads introduce a set of underlying technological difficulties. This paper presents the most important limitations that user-level file systems may face when deployed in HPC clusters.
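The stage-in and stage-out steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the function names `stage` and `run_job` and the `compute` callback are hypothetical, and a temporary directory stands in for the freshly deployed user-level file system, which in practice would be a mount point provided by the file system itself.

```python
import shutil
import tempfile
from pathlib import Path


def stage(src: Path, dst: Path) -> None:
    """Copy the directory tree under src into dst, preserving its layout."""
    shutil.copytree(src, dst, dirs_exist_ok=True)


def run_job(pfs_in: Path, pfs_out: Path, compute) -> None:
    """Stage in, run compute(input_dir, output_dir), then stage out."""
    # Assumption: a temporary directory plays the role of the empty,
    # newly deployed user-level file system.
    with tempfile.TemporaryDirectory() as mount:
        ulfs_in = Path(mount) / "input"
        ulfs_out = Path(mount) / "output"
        ulfs_out.mkdir()
        stage(pfs_in, ulfs_in)    # stage-in: parallel FS -> user-level FS
        compute(ulfs_in, ulfs_out)
        stage(ulfs_out, pfs_out)  # stage-out: user-level FS -> parallel FS
```

Both copies happen outside the compute step, so their cost is added to the job's total runtime, which is the overhead the abstract refers to.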
Page(s):
487-491
DOI:
DOI not available
Published:
Journal: Science International, Volume: 34, Issue: 6, Year: 2022