In this paper, we investigate the problem of sequencing data requests for linear devices, such as magnetic tapes, to minimize response time in cold-storage solutions for archival data. Tapes are the technology of choice for long-term storage due to their reliability, low costs, security, and significant energy savings. However, physical limitations on tape pose challenges to policy implementation, which must be scalable on low-power hardware. We provide a theoretical and numerical analysis of existing policies and introduce new ones, identifying cases where each is applicable and evaluating their theoretical performance in terms of the number of requested files. In particular, we show that the standard first-in, first-out (FIFO) policy can be arbitrarily inefficient and investigate novel constant-ratio approximations and polynomial-time procedures. If data on the frequency with which each file is accessed is available, we consider a dynamic programming procedure to minimize a stochastic variant of the problem, providing a constant-time approximation for arbitrary file requests. We also investigate a quality criterion based on makespan and explore online variants of the problem. Our numerical analysis, conducted on both synthetic and real-world data from an industry partner, offers insights into when each policy is most appropriate, identifying cases where the proposed algorithms significantly outperform traditional policies, like FIFO, in terms of average reading times. This study has managerial implications, as current data retrieval practices in data centers are often limited to traditional policies with unknown theoretical performance. Our methodological and numerical analysis provides evidence of the value in appropriately sequencing requests based on tape structure, guiding algorithm choice and highlighting underlying trade-offs in response times.