Abstract

Continuous developments in data science have brought an exponential increase in the complexity of machine learning models. In addition, data scientists have become ubiquitous in the private sector and in academia. Both trends are rising steadily and are associated with an increase in power consumption and the accompanying carbon footprint. The growing carbon footprint of large-scale advanced data science has already received attention, but the latter trend, the spread of everyday data science work, has not. This work aims to estimate the contribution of this increasingly popular “common” data science to the global carbon footprint. To this end, the power consumption of several typical common data science tasks is measured and compared to that of large-scale “advanced” data science, common computer-related tasks, and everyday non-computer-related tasks. An automated data science project is also run on various hardware architectures. To assess sustainability in terms of carbon emissions, the measurements are converted to gCO2eq and to an equivalent number of “km driven by car”. Our main findings are: “common” data science consumes 2.57 times more power than regular computer usage, but less than some everyday power-consuming tasks such as lighting or heating; advanced data science consumes substantially more power than common data science and can either be on par with or vastly surpass everyday power-consuming tasks, depending on the scale of the project. Beyond reporting these results, this work also aims to inspire researchers to include power usage and estimated carbon emissions as a secondary result in their work.
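The conversion from measured power to gCO2eq and then to “km driven by car” can be illustrated with a minimal Python sketch. The two conversion factors and all function names below are hypothetical placeholders chosen for illustration; they are not the values or the code used in this work, and real factors depend on the local electricity mix and the reference vehicle.

```python
# Illustrative placeholder factors (assumptions, NOT values from the paper).
GRID_INTENSITY_G_PER_KWH = 400.0  # assumed gCO2eq emitted per kWh of electricity
CAR_EMISSIONS_G_PER_KM = 120.0    # assumed gCO2eq emitted per km driven by car


def energy_kwh(mean_power_watts: float, duration_hours: float) -> float:
    """Energy used by a task drawing a constant mean power for a given time."""
    return mean_power_watts * duration_hours / 1000.0


def to_gco2eq(kwh: float, grid_intensity: float = GRID_INTENSITY_G_PER_KWH) -> float:
    """Convert electrical energy (kWh) to grams of CO2 equivalent."""
    return kwh * grid_intensity


def to_car_km(gco2eq: float, car_emissions: float = CAR_EMISSIONS_G_PER_KM) -> float:
    """Express an emission in the equivalent distance driven by car."""
    return gco2eq / car_emissions


if __name__ == "__main__":
    # Hypothetical task: 150 W mean draw for an 8-hour working day.
    kwh = energy_kwh(150.0, 8.0)
    grams = to_gco2eq(kwh)
    km = to_car_km(grams)
    print(f"{kwh:.2f} kWh, {grams:.0f} gCO2eq, {km:.1f} km by car")
```

Keeping the two factors as explicit parameters makes it easy to redo the conversion for a different electricity mix or reference vehicle, which is the main source of uncertainty in estimates of this kind.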
