Abstract

Although the phrase "Big Data Analytics" is relatively new, the practice of leveraging large data sources is not. Outcomes researchers are familiar with secondary data analyses of diverse data sources that include administrative claims, electronic health records, and clinical registries.1 Among the most prominent examples is the National Inpatient Sample (NIS). The NIS is constructed by sampling nationwide discharge records from acute care hospitals. It is part of the Healthcare Cost and Utilization Project and is the largest publicly available all-payer inpatient healthcare database in the United States, including annual use and cost data on >7 million hospital stays (see https://www.hcup-us.ahrq.gov/nisoverview.jsp). The NIS is available from the Agency for Healthcare Research and Quality at minimal cost and is supported by web-based tutorials, software, and user groups. Hundreds of articles using the NIS are published each year. I have used it in the past and suspect many of you have as well. Yet the NIS, like most other secondary data sources, must be used with caution. In this issue, we are publishing an important Perspective focused on examples of potential misuse of the NIS. In this provocative piece, Khera and Krumholz2 highlight 4 types of errors in published NIS studies: (1) not accounting for its complex sampling methodology; (2) ignoring the fact that it is limited to hospitalization data; (3) incorrectly drawing state-level and provider-level inferences from its results; and (4) using its diagnosis and procedure codes without sufficient validation. This article emphasizes the critical role of journals for …

