Single-molecule experiments offer a unique means to probe the properties of individual molecules, yet they rest upon the successful control of background noise and irrelevant signals. Single-molecule transport studies often generate large amounts of data that probe a wide range of physical and chemical behaviors. However, owing to the stochasticity of these experiments, a substantial fraction of the data may consist of blank traces in which no molecular signal is evident. One-class (OC) classification is a machine learning technique for identifying a specific class in a data set that potentially consists of a wide variety of classes. Here, we examine the utility of two different types of OC classification models on four diverse data sets from three different laboratories. Two of these data sets were measured at cryogenic temperatures and two at room temperature. By training the models solely on traces from a blank experiment, we demonstrate that OC classification is a powerful and reliable method for filtering out blank traces from a molecular experiment in all four data sets. On a labeled 4,4'-bipyridine data set measured at 4.2 K, we achieve an accuracy of 96.9 ± 0.3% and an area under the receiver operating characteristic curve of 99.5 ± 0.3%, validated by fivefold cross-validation. Given the wide range of physical and chemical properties that can be probed in single-molecule experiments, the successful application of OC classification to filter out blank traces is a major step forward in our ability to understand and manipulate molecular properties.
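
The idea of training only on blank traces and then flagging anything that does not resemble them can be illustrated with a minimal sketch. This is not one of the two model types evaluated in the work, which the abstract does not name. It is a toy centroid-distance one-class model, and the histogram features, the 95% threshold quantile, the plateau position, and the synthetic traces are all assumptions made purely for illustration.

```python
import math
import random


def histogram_features(trace, lo=-6.0, hi=0.0, bins=12):
    """Normalized histogram of log10-conductance values in [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for value in trace:
        if lo <= value < hi:
            # Guard against floating-point rounding at the upper edge.
            index = min(bins - 1, int((value - lo) / width))
            counts[index] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


class CentroidOneClass:
    """Toy one-class model: distance to the mean of the training class."""

    def __init__(self, quantile=0.95):
        # Assumed fraction of training traces that must fall inside the boundary.
        self.quantile = quantile

    def fit(self, features):
        n = len(features)
        dim = len(features[0])
        self.center = [sum(f[i] for f in features) / n for i in range(dim)]
        distances = sorted(self._distance(f) for f in features)
        self.threshold = distances[min(n - 1, int(self.quantile * n))]
        return self

    def _distance(self, f):
        return math.dist(f, self.center)

    def is_blank(self, f):
        """True if the trace looks like the blank training class."""
        return self._distance(f) <= self.threshold


def synthetic_trace(rng, plateau=False, points=200):
    """Hypothetical trace: linear decay in log10(G), optional plateau."""
    trace = [-6.0 * i / points + rng.gauss(0.0, 0.1) for i in range(points)]
    if plateau:
        # Made-up molecular plateau near log10(G) = -3.3.
        trace += [-3.3 + rng.gauss(0.0, 0.1) for _ in range(points)]
    return trace


rng = random.Random(0)

# Train on blank traces only; no molecular traces are seen during fitting.
train_blank = [histogram_features(synthetic_trace(rng)) for _ in range(200)]
model = CentroidOneClass().fit(train_blank)

# Evaluate on held-out blank traces and on traces with a plateau.
test_blank = [histogram_features(synthetic_trace(rng)) for _ in range(50)]
test_molecular = [
    histogram_features(synthetic_trace(rng, plateau=True)) for _ in range(50)
]

blank_kept = sum(model.is_blank(f) for f in test_blank)
molecular_flagged = sum(not model.is_blank(f) for f in test_molecular)
print(f"blank traces recognized: {blank_kept}/50")
print(f"molecular traces flagged: {molecular_flagged}/50")
```

The point of the sketch is that the decision boundary is set entirely from the blank class, so no labeled molecular traces are needed to filter a molecular experiment. Real traces would likely call for richer features and a more flexible model than a single centroid.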