Scaling Data Intensive Physics Applications to 10k Cores on Non-dedicated Clusters with Lobster

Anna Woodard, Michael Hildreth, Douglas Thain, Nil Valls, Patrick Donnelly, Ben Tovar, Kenyi Hurtado Anampa, Kevin Lannon, Peter Ivie, Matthias Wolf, Paul Brenner, Charles Mueller

doi:10.1109/cluster.2015.53

Abstract

The high energy physics (HEP) community relies upon a global network of computing and data centers to analyze data produced by multiple experiments at the Large Hadron Collider (LHC). However, this global network does not satisfy all research needs. Ambitious researchers often wish to harness computing resources that are not integrated into the global network, including private clusters, commercial clouds, and other production grids. To enable these use cases, we have constructed Lobster, a system for deploying data-intensive, high-throughput applications on non-dedicated clusters. This requires solving multiple problems related to non-dedicated resources, including work decomposition, software delivery, concurrency management, data access, data merging, and performance troubleshooting. With these techniques, we demonstrate Lobster running effectively on 10k cores, producing throughput at a level comparable with some of the largest dedicated clusters in the LHC infrastructure.

Full Text