Reproducibility is a cornerstone of scientific integrity, yet it remains a significant challenge across disciplines in computational science. This reproducibility crisis is now being met by the Open Science movement, which has risen to prominence within the scientific community and especially within academic libraries. To address the need for reproducible computational research and to promote Open Science within the community, members of the Open Science and Data Collaborations Program at Carnegie Mellon University Libraries organized a single-day hackathon centered on reproducibility. Held in partnership with a faculty researcher in English and Digital Humanities, the event gave several students an opportunity to interact with real research outputs, test the reproducibility of data analyses with code, and offer feedback for improvements. Using Python code and data shared by the researcher in an open repository, we found that students could successfully reproduce most of the data visualizations, but only after completing some manual setup and modifying the code to address deprecated libraries. During the event, we also investigated using ChatGPT to debug and troubleshoot rerunning this code. By interacting with the ChatGPT API from within the code, we identified and addressed the same roadblocks and reproduced the same figures as the participating students. As a second option, we collaborated with the researcher to publish a compute capsule in Code Ocean. This option offered an alternative to manual setup and modifications, an accessible option for more limited devices such as tablets, and a simple way for outside researchers to modify or build on existing research code.