Abstract

Is massively collaborative machine learning possible? Can we share and organize our collective knowledge of machine learning to solve ever more challenging problems? In a way, yes: as a community, we are already very successful at developing high-quality open-source machine learning libraries, thanks to frictionless collaboration platforms for software development. However, code is only one aspect. The answer is much less clear when we also consider the data that goes into these algorithms and the exact models that are produced. A tremendous amount of work and experience goes into the collection, cleaning, and preprocessing of data and the design, evaluation, and fine-tuning of models, yet very little of this is shared and organized in a way that lets others easily build on it. Suppose one had a global platform for sharing machine learning datasets, models, and reproducible experiments in a frictionless way, so that anybody could chip in at any time to share a good model, add or improve data, or suggest an idea. OpenML is an open-source initiative to create such a platform. It allows anyone to share datasets, machine learning pipelines, and full experiments, organizes all of it online with rich metadata, and enables anyone to reuse and build on them in novel and unexpected ways. All data is open and accessible through APIs, and OpenML is readily integrated into popular machine learning tools to allow easy sharing of models and experiments. This openness also enables a budding ecosystem of automated processes that scale up machine learning further, such as discovering similar datasets, creating systematic benchmarks, or learning from all collected results how to build the best machine learning models, and even doing so automatically for any new dataset. We welcome all of you to become a part of it.
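To make the kind of frictionless sharing described above concrete, the sketch below uses the openml-python client to fetch a dataset, evaluate an ordinary scikit-learn pipeline on an OpenML task, and publish the resulting run back to the platform. It is a minimal illustration, not part of the paper: the dataset and task IDs are placeholders chosen for demonstration, publishing requires a configured OpenML API key, and the pipeline should be adapted to the feature types of whichever task is used.

```python
# Minimal sketch of the OpenML sharing workflow (assumptions: openml-python
# and scikit-learn installed; an OpenML API key configured for publishing;
# the IDs below are illustrative placeholders, not prescribed by the paper).
import openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline

# Datasets can be fetched directly, together with their rich metadata.
dataset = openml.datasets.get_dataset(61)  # 61 = iris, used here as an example
X, y, _, attribute_names = dataset.get_data(
    target=dataset.default_target_attribute
)

# A task bundles a dataset with an evaluation procedure (e.g. CV splits).
task = openml.tasks.get_task(59)  # placeholder: any supervised classification task ID

# Build an ordinary scikit-learn pipeline; adapt it to the task's feature types.
clf = make_pipeline(SimpleImputer(), RandomForestClassifier(n_estimators=100))

# Evaluate the pipeline on the task's predefined splits, producing a full,
# reproducible experiment record.
run = openml.runs.run_model_on_task(clf, task)

# Share the experiment back to the platform so others can build on it.
run.publish()
print(run)
```

Because every such run is stored with its dataset, pipeline description, and results, the collected experiments can be reused by the automated processes mentioned above, for example to benchmark methods systematically across many datasets.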
