Abstract

Predicting file access times is an important part of modeling supercomputer storage systems. Such models can be used to develop analysis tools that help users implement efficient I/O behavior. In this paper, we analyze and predict the access times of a Lustre file system from the client perspective. To this end, we measured file access times in various test series and developed different models for predicting them. The evaluation shows that models using artificial neural networks achieve an average prediction error about 30% smaller than that of linear models. A phenomenon in the distribution of file access times is of particular interest: file accesses with identical parameters show several typical access times. These typical access times usually differ by orders of magnitude and can be explained by different processing of the file accesses in the storage system, that is, by an alternative I/O path. We investigate a method to automatically determine the alternative I/O path and quantify the significance of knowledge about the internal processing. This approach significantly reduces the prediction error.

Highlights

  • Users of HPC facilities need tools that help them implement efficient input/output (I/O) in their programs

  • We evaluate predictors of I/O performance using machine learning with artificial neural networks (ANNs)

  • Using a machine learning approach with artificial neural networks, we developed different models for file access time prediction

Summary

Introduction

Users of HPC facilities need tools that help them implement efficient input/output (I/O) in their programs. We use artificial neural networks (ANNs) with different input information to predict access times. Our results show that linear models do not sufficiently represent the relation between file access parameters and access time. Our analysis suggests that the I/O path used by the storage system considerably influences the file access time, so knowledge of the path becomes key to a good access-time model. The final section summarizes the paper and suggests future work.
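To make the role of the I/O path concrete, the following is a minimal sketch of why accesses with identical parameters but several typical access times are hard to predict with a single value. It is not the method used in the paper. The access times are made-up illustrative numbers, and the helper `split_by_log_gap` with its one-decade threshold is a hypothetical heuristic introduced only for this example.

```python
import math
from statistics import mean


def split_by_log_gap(times, min_gap_decades=1.0):
    """Group access times into clusters that are separated by at least
    `min_gap_decades` orders of magnitude (hypothetical heuristic)."""
    ordered = sorted(times)
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        # A large jump in log10 space starts a new "typical access time".
        if math.log10(cur) - math.log10(prev) >= min_gap_decades:
            clusters.append([])
        clusters[-1].append(cur)
    return clusters


# Illustrative (not measured) access times in seconds for one fixed
# combination of access parameters, e.g. same size and offset.
times = [0.0001, 0.00012, 0.00009, 0.02, 0.025, 0.018]

clusters = split_by_log_gap(times)

# Predictor without path knowledge: one value for all accesses.
global_prediction = mean(times)
err_global = mean(abs(t - global_prediction) for t in times)

# Predictor with path knowledge: one value per cluster.
err_path = mean(abs(t - mean(c)) for c in clusters for t in c)

print(len(clusters), err_global, err_path)
```

In this toy setting, the single-value predictor lands between the two groups and fits neither, while the per-cluster predictor stays close to each typical access time. A real model would still have to decide which path an upcoming access takes, which this sketch does not attempt.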

Related work
White-box modeling versus black-box modeling
Characteristics of the Data
Models
Evaluation
Test system
Benchmarking
Quartile
Analysis of error classes
Prediction of file accesses
Conclusion and future work