Abstract

Background: The growth of clinical NGS testing, together with new CAP checklist items stating that read-level (raw) NGS data must be retained for a minimum of 2 years, has prompted laboratories to reevaluate their NGS data storage solutions. In this context, cloud-based storage has several advantages, such as low per-GB prices, scalability, and minimal fixed costs; nonetheless, several perceived disadvantages, including increased complexity and security/regulatory considerations, have hindered adoption. Furthermore, despite the ostensibly simple usage-based pricing plans, practical cost analysis of cloud storage for NGS data is not straightforward.

Methods: We developed an easy-to-use tool designed specifically for cost and usage estimation for laboratories performing clinical NGS testing (https://ngscosts.info). Our tool enables quick exploration of dozens of storage options across three major cloud providers and provides complex cost and usage forecasts over 10- to 20-year timeframes. Parameters include current test volumes (and customizable file sizes), expected test volume growth rate, data compression techniques, data retention times across storage tiers, and case reaccess rates. Outputs include an easy-to-visualize chart of total data stored, yearly and lifetime costs, and a “cost per test” estimate.

Results: The per-GB storage prices among major cloud providers have dropped ~6 to 7× over the past 10 years; in addition, prices currently differ by up to 20× between storage tiers. Lower prices are typically offset by either increased access/transfer costs or slower retrieval rates. Two factors were found to markedly affect the average cost per test: (1) total file size, including the use of NGS-specific file compression techniques, and (2) the choice of storage tiers, including rapid transfer to “cold” or archival storage. In contrast, other factors were not found to affect total costs, including the costs of reaccess from archival storage tiers and discounts associated with increased volumes. When data from NGS testing were stored in “hot” storage for 1 year, then archived for 9 years, typical costs were ~$3/test for exomes and $40/test to $66/test for genomes, depending on the compression algorithm used. Overall, reaccess of data from cold storage added 2% or less to the total cost per test (with an estimated reaccess rate of 10% of cases per year).

Conclusions: Steady declines in cloud storage pricing, as well as new options for storage and retrieval, make storing clinical NGS data on the cloud economical and friendly to laboratory workflows. Taken together, these results suggest that laboratories need not be concerned about reaccess costs but should instead focus on data compression techniques and archival storage tiers for long-term data. Our web-based tool makes it possible to explore and compare cloud storage solutions and provides forecasts specifically for clinical NGS laboratories.
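The "hot for 1 year, then archived for 9 years" scenario can be illustrated with a minimal Python sketch of a per-test cost estimate. This is not the algorithm behind the ngscosts.info tool. The class and function names are hypothetical, and every price and file size below is an assumed placeholder rather than a quote from any provider. The sketch also ignores test volume growth, future price changes, request fees, and minimum storage durations.

```python
from dataclasses import dataclass


@dataclass
class StoragePlan:
    """Inputs for one test. All prices are hypothetical placeholders."""
    file_size_gb: float            # stored size per test, after compression
    hot_price_gb_month: float      # assumed "hot" tier price, USD per GB-month
    cold_price_gb_month: float     # assumed archival tier price, USD per GB-month
    retrieval_price_gb: float      # assumed retrieval + transfer price, USD per GB
    hot_months: int = 12           # 1 year in hot storage
    cold_months: int = 108         # then 9 years in archival storage
    reaccess_rate_per_year: float = 0.10  # fraction of cases reaccessed per year


def cost_per_test(plan: StoragePlan) -> tuple[float, float]:
    """Return (storage_cost, reaccess_cost) in USD over the test's lifetime."""
    # Storage cost: size times price times months, summed over both tiers.
    storage = plan.file_size_gb * (
        plan.hot_months * plan.hot_price_gb_month
        + plan.cold_months * plan.cold_price_gb_month
    )
    # Expected number of retrievals from archival storage over the cold period.
    expected_retrievals = plan.reaccess_rate_per_year * plan.cold_months / 12
    reaccess = expected_retrievals * plan.file_size_gb * plan.retrieval_price_gb
    return storage, reaccess


# Illustrative inputs only: a 10 GB file and made-up tier prices.
example = StoragePlan(
    file_size_gb=10.0,
    hot_price_gb_month=0.02,
    cold_price_gb_month=0.001,
    retrieval_price_gb=0.02,
)
storage_cost, reaccess_cost = cost_per_test(example)
print(f"storage: ${storage_cost:.2f}, reaccess: ${reaccess_cost:.2f}")
```

Because file size multiplies every term, halving it through compression halves the whole estimate, while the reaccess term scales only with the retrieval price and the reaccess rate. That is consistent with the emphasis on compression and archival tiers over reaccess costs.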