Abstract

The selection of event candidates by machine learning algorithms has become an important analysis tool. Data mining, however, goes beyond the simple training and application of a learning algorithm. It also involves finding a good representation of the data in fewer dimensions without losing relevant information, as well as thoroughly validating the results throughout the entire analysis. A data mining-based event selection chain has been developed for the measurement of the atmospheric νμ spectrum with IceCube in the 59-string configuration. It yielded a high-statistics, high-purity sample of νμ (purity 99.59 ± 0.37%), while allowing only 1.0 × 10⁻⁴ % of the incoming background muons to pass. In this paper, the setup of the analysis chain is presented and the results are discussed in the context of atmospheric νμ analyses.



Summary

Introduction

In the context of machine learning and data mining, the techniques used by the IceCube neutrino telescope [1] are interesting for several reasons. In total, 120,000 simulated neutrino events and 4.96 × 10⁶ simulated background events were available for verifying the analysis chain. This corresponds to a detector lifetime of ≈ 15 days. The analysis chain used in atmospheric νμ analyses for the IceCube detector in the 59-string configuration consists of three consecutive steps. The first is the application of simple straight cuts, intended to remove obvious background events and to reduce the required CPU resources. These cuts were applied to the zenith angle (Zenith > 88°) and to the estimated velocity of the lepton (v_Lepton > 0.19 c). In a second step, the input variables for the learning algorithm are selected in a partially automated procedure, which is carried out on a limited number of simulated events. All machine learning algorithms were used within the data mining environment RapidMiner [3].
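The straight cuts described above can be illustrated with a minimal Python sketch. Only the two thresholds (Zenith > 88° and v_Lepton > 0.19 c) come from the text. The Event record, its field names, and the example values are hypothetical and are not taken from the IceCube software or from RapidMiner.

```python
from dataclasses import dataclass

# Cut values quoted in the text.
ZENITH_MIN_DEG = 88.0    # zenith angle threshold in degrees
V_LEPTON_MIN_C = 0.19    # lepton velocity threshold in units of c


@dataclass
class Event:
    """Hypothetical event record; the field names are illustrative only."""
    zenith_deg: float    # zenith angle in degrees
    v_lepton_c: float    # estimated lepton velocity in units of c


def passes_straight_cuts(event: Event) -> bool:
    """Return True if the event survives both straight cuts."""
    return (event.zenith_deg > ZENITH_MIN_DEG
            and event.v_lepton_c > V_LEPTON_MIN_C)


def apply_straight_cuts(events):
    """Keep only the events that pass both cuts."""
    return [e for e in events if passes_straight_cuts(e)]


# Made-up example events, not simulation output.
events = [
    Event(zenith_deg=120.0, v_lepton_c=0.30),  # passes both cuts
    Event(zenith_deg=45.0, v_lepton_c=0.30),   # fails the zenith cut
    Event(zenith_deg=100.0, v_lepton_c=0.10),  # fails the velocity cut
]
selected = apply_straight_cuts(events)
print(len(selected))
```

Both comparisons are strict, mirroring the ">" signs in the text, so an event lying exactly on a threshold is rejected in this sketch.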

Variable selection
Training and testing of a Random Forest
Overall performance of the event selection chain