Abstract

Caenorhabditis elegans is a model organism whose genome has been fully sequenced. It has been used for a range of analyses, including genetic functional analysis, individual behavioral analysis, and group behavioral analysis. Recently, it has also been studied as an important bioindicator of water pollution. Previous studies used traditional machine learning methods, such as the Hidden Markov Model (HMM), to detect water pollution and identify pollutants from differences in the swimming behavior of C. elegans before and after exposure to chemicals. However, these traditional models have low accuracy and a relatively high false-negative rate. This study proposes a method for detecting water pollution and identifying the type of pollutant using the Long Short-Term Memory (LSTM) model, a deep learning model suited to time-series data analysis. The swimming activity of C. elegans in each image frame is characterized by its Branch Length Similarity (BLS) entropy profile. These BLS entropy profiles are converted into input vectors through additional preprocessing with two clustering methods. We conduct experiments using formaldehyde and benzene, each at 0.1 mg/L, with observation time intervals ranging from 30 to 180 s. The performance of the proposed method is compared with that of the previously proposed HMM approach and with variants of the LSTM model, such as the Gated Recurrent Unit (GRU) and Bidirectional LSTM (BiLSTM).
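
The abstract does not define the BLS entropy profile, so the following is only a minimal sketch under an assumed, commonly cited formulation: each point on the worm's outline is treated as a node, the straight-line distances to all other outline points are its branches, and the entropy is the Shannon entropy of the branch-length shares normalized by log N. The function names `bls_entropy` and `bls_profile` and the toy outline are hypothetical and are not taken from the paper.

```python
import math


def bls_entropy(branch_lengths):
    """Normalized Shannon entropy of branch-length shares (assumed BLS form)."""
    n = len(branch_lengths)
    if n < 2:
        raise ValueError("need at least two branches")
    total = sum(branch_lengths)
    if total <= 0:
        raise ValueError("branch lengths must have a positive sum")
    shares = [length / total for length in branch_lengths]
    # Zero-length branches contribute nothing (limit of p*log(p) as p -> 0).
    entropy = -sum(p * math.log(p) for p in shares if p > 0)
    # Dividing by log(n) scales the result to the range [0, 1].
    return entropy / math.log(n)


def bls_profile(points):
    """Entropy at each outline point, using distances to all other points."""
    profile = []
    for i, (xi, yi) in enumerate(points):
        lengths = [
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        ]
        profile.append(bls_entropy(lengths))
    return profile


# Toy outline (hypothetical): the four corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_profile = bls_profile(square)
```

Under this assumption, a sequence of such per-frame profiles would be the raw material that the clustering-based preprocessing turns into input vectors for the LSTM; how the paper actually performs that conversion is not stated in the abstract.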
