Here, we introduce wepy (https://github.com/ADicksonLab/wepy), an open-source software framework for running and analyzing weighted ensemble (WE) simulations. The wepy toolkit is written in pure Python and as such is highly portable and extensible, making it an excellent platform for developing and using new WE resampling algorithms such as WExplore, REVO, and others while leveraging the entire Python ecosystem. In addition, wepy simplifies WE-specific analyses by defining out-of-core, tree-like data structures stored in the cross-platform HDF5 file format. In this paper, we discuss the motivations for and challenges of simulating rare events in biomolecular systems. As has previously been shown, high-dimensional WE resampling algorithms such as WExplore and REVO have been successful at simulating such events, especially rare events that are difficult to describe with one or two collective variables. We explain in detail how wepy facilitates the implementation of these algorithms and aids in analyzing the unique structure of WE simulation results. To explain how wepy and WE work in general, we describe the mathematical formalism of WE, give an overview of the architecture of wepy, and provide code examples showing how to construct, run, and analyze simulations of a protein–ligand system (T4 Lysozyme in an implicit solvent). This paper is written with a variety of readers in mind, including (1) those curious about how to leverage WE rare-event simulations in their domain, (2) current WE users who want to begin using new high-dimensional resamplers such as WExplore and REVO, and (3) expert users who would like to prototype or implement their own algorithms in a form that others can easily adopt.