
Multimodal deep learning for identifying onshore natural gas facilities from GaoFen-2 imagery

Abstract

With increasing demand for natural gas, the construction of natural gas extraction-related facilities has increased significantly. Accurate identification of these facilities is crucial for guiding spatial planning and evaluating environmental impacts. Existing research has primarily concentrated on offshore facilities, with limited attention to onshore facilities. This scarcity stems from identification challenges due to their dispersed distribution and complex environments. To address this gap, this study proposes a method combining a multimodal convolutional neural network (CNN) with object-based segmentation for onshore facility extraction. Experiments were conducted in northern Sichuan, China, using high-resolution imagery from the Chinese GaoFen-2 (GF-2) satellite. Performance was compared with that of machine learning and CNN approaches using sequentially cropped imagery. The proposed method achieved a precision of 59.97%, a recall of 94.87%, and an F1-score of 73.49%. The high recall indicates that most facilities were successfully detected, and the F1-score reflects the overall performance. These results suggest that the proposed method can effectively extract onshore facilities. Compared with machine learning and CNN using sequentially cropped imagery, the F1-score of the proposed method increased by 20.16% and 51.49%, respectively. The experimental results reveal that the proposed method can accurately identify onshore facilities, offering a scientific basis for assessing the environmental impact of greenhouse gases.
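The reported F1-score can be checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, converted from percent to fractions.
precision = 0.5997
recall = 0.9487

f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 73.49%"
```

The result agrees with the 73.49% stated in the abstract. It also shows how the low precision pulls the F1-score well below the recall.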

Similar Papers
  • Conference Article
  • Cite Count Icon 2
  • 10.4043/8411-ms
Introduction to the Troll Project
  • May 5, 1997
  • Peter J Wheeler

This paper sets the scene for six other papers to be presented at the joint Shell-Statoil session at the 1997 OTC Conference. The subject for this session is the Troll Phase I Gas Export Project, for which A/S Norske Shell was the Development Operator, and Statoil is the Production Operator. The Troll gas development consists of an offshore platform standing in 303 metres of water connected by two multi-phase pipelines to an onshore gas processing facility, which can export up to 100 MM Sm3 and 3500 m3 condensate per day. The paper briefly traces the history of the Troll field development from the discovery of the field in 1979, through to the establishment of the major design solutions. The first crucial development decision was to utilise on-shore gas processing, and the background to this decision is examined. In particular, the concept of utilising carbon steel multi-phase pipelines to transport the wet, untreated gas is described. Anticipated benefits, relating to safety and environmental considerations, as compared to an integrated offshore facility, and the realisation or otherwise of these, are discussed. The offshore and onshore facilities are described, and the major design decisions that they represent are highlighted. In particular: the satisfying of the offshore power requirements from the national electrical grid and the use of variable speed electric motors for driving the gas export compressors (subjects covered more comprehensively in another paper); the provision of the necessary fibre optic link from the platform to the onshore central control room to enable the offshore facilities to be operated from onshore; the use of air to satisfy process cooling requirements and hot oil for process heating needs; the use of a High Integrity Pressure Protection System (HIPPS); the use of a semi-automated, packaged drilling rig; and the extensive use of composite materials.
These features represent design decisions which are, at least to some degree, counter-intuitive, and are reviewed specifically with respect to their contribution to overall environmental and safety objectives. Introduction. The Field. The Troll field is located about 85 km north west of Bergen on the west coast of Norway (fig. 1), in the Norwegian Trench, at a water depth of 303.4m MSL. The field was discovered in 1979 by A/S Norske Shell, operator for Production Licence 054 covering Block 31/2. In 1983 this Block was declared commercial. Also in 1983, the adjoining blocks 31/3, 31/5 and 31/6 were awarded under Production Licence 085. Subsequent drilling ascertained that Troll extended into all four blocks (fig. 2). This initiated a process of unitisation of the two licences. By the end of 1986 the individual shares of the various companies with interests in the four blocks were recalculated for Troll as one single field. The shares of the Troll Partners were established as follows: (available in full paper)

  • Front Matter
  • Cite Count Icon 63
  • 10.1002/aps3.11371
Plants meet machines: Prospects in machine learning for plant biology
  • Jun 1, 2020
  • Applications in Plant Sciences
  • Pamela S Soltis + 3 more


  • Conference Article
  • Cite Count Icon 9
  • 10.1109/ucc-companion.2018.00048
Scalable Detection of Rural Schools in Africa Using Convolutional Neural Networks and Satellite Imagery
  • Dec 1, 2018
  • Mehrdad Yazdani + 7 more

Many countries typically lack sufficient civic data to assess where and what challenges communities face. High resolution satellite images can provide honest assessments of neighborhoods and communities to guide aid workers, policy makers, private sector, and philanthropists. Although humans are very good at detecting patterns, manually inspecting high resolution satellite imagery at scale can be costly and time consuming. Machine learning has the potential to scale this process significantly and automate the detection of regions of interest. Here we tackle the problem of identifying schools in northeastern rural Liberia as a case study for evaluating the value of high resolution satellite imagery and machine learning. In our case study we utilize unsupervised learning with pre-trained convolutional neural networks. Our results suggest that using machine learning with high resolution satellite images can reduce the search space, help find schools with high recall and aid appropriate and relevant resource allocations.

  • Research Article
  • Cite Count Icon 6
  • 10.2112/jcr-si114-085.1
Comparison of Machine and Deep Learning Methods for Mapping Sea Farms Using High-Resolution Satellite Image
  • Oct 6, 2021
  • Journal of Coastal Research
  • Yun-Jae Choung + 1 more

Choung, Y.-J. and Jung, D. 2021. Comparison of machine and deep learning methods for mapping sea farms using high-resolution satellite image. In: Lee, J.L.; Suh, K.-S.; Lee, B.; Shin, S., and Lee, J. (eds.), Crisis and Integrated Management for Coastal and Marine Safety. Journal of Coastal Research, Special Issue No. 114, pp. 420–423. Coconut Creek (Florida), ISSN 0749–0208. Previous research had shown that the supervised machine learning approach performed better than unsupervised machine learning for mapping sea farms using a high-resolution satellite image. The present work compares a support vector machine (SVM), which represents the supervised machine learning approach, and a deep neural network (DNN), which represents the deep learning approach, for mapping sea farms using KOMPSAT-3 satellite images acquired in the South Sea of South Korea. First, coastal maps were generated from the image source given by SVM and DNN. Next, the above-water and underwater farms were detected separately from both the maps based on the minimum and maximum thresholds. Finally, the detection accuracy of both the above-water and underwater farms from both coastal maps was assessed. Statistical results showed that deep learning (DNN) provided better performance than machine learning (SVM) for detecting above-water farms from the given high-resolution satellite image, while both DNN and SVM yielded the same performance for underwater farms. However, a few errors occurred in the detection because of the limitations of the pixel-based classification approaches. In future research, the deep learning algorithm combined with object-based classification, such as the convolutional neural network, can be used to detect sea farms from the given high-resolution image more accurately.
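The detection step based on minimum and maximum thresholds can be illustrated with a minimal sketch. The abstract does not say which values the thresholds are applied to, so this example assumes a per-pixel test on a single-band map; the function name `detect_in_range`, the toy values, and the thresholds are all hypothetical.

```python
def detect_in_range(image, lower, upper):
    """Return a binary mask: 1 where lower <= pixel <= upper, otherwise 0."""
    return [[1 if lower <= px <= upper else 0 for px in row] for row in image]


# Toy 3x4 single-band map; values and thresholds are made up for illustration.
toy_image = [
    [12, 40, 55, 90],
    [35, 60, 61, 20],
    [80, 45, 10, 58],
]

mask = detect_in_range(toy_image, lower=40, upper=60)
for row in mask:
    print(row)
```

Because each pixel is tested in isolation, isolated noisy pixels pass or fail independently of their neighbours. This is in line with the abstract's remark that pixel-based classification has limitations.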

  • Conference Article
  • Cite Count Icon 1
  • 10.2523/iptc-14016-ms
Issues & Design Trends In Onshore Gas Reception Facilities
  • Dec 7, 2009
  • Jannes Jan Zomerman + 3 more

In pursuit of greater safety, lower environmental impact and lower capital expenditure, it is desired to reduce (or preferably eliminate) upstream offshore processing. Specifically for offshore platforms, the biggest capital expenditure savings can be achieved by either designing the offshore production facilities to be unmanned platforms or having subsea completions without a platform. In both cases all produced wellhead fluids will need to be transported untreated to shore in trunklines. A major contributor to HSE incidents in the upstream oil and gas industry is the logistics of bringing people to and from the platforms by boat or helicopter transport. Being able to avoid the necessity of this people transport will greatly improve the overall safety performance of the gas production. Furthermore, zero-effluent discharge offshore can be achieved by sending all of the produced fluids to the onshore plant for processing and disposal, as an alternative to the standard offshore processing and reinjection. A further trend in recent developments is that of poorer-quality feedstock gas. Among other things, gas is becoming more sour, containing up to 40% H2S and up to 70% CO2. This, coupled with the elimination of offshore processing facilities, requires the injection of chemicals to combat trunkline corrosion and formation of hydrates. The higher quantities of acid gases, free water (condensed and formation water), injected chemicals and increased corrosion products all have to be handled by the onshore gas-reception facilities that have to produce acceptable quality gas for the downstream facilities. The onshore facilities also have to be robust for transient conditions like receiving significant slugs of production water during pigging and start up. As the gas and condensate specifications are getting more and more stringent, stable operation of the units downstream of the slugcatcher is essential to ensure on-specification products.
To realize this, the downstream condensate stabiliser train needs a continuous feed of condensate even when water slugs enter the slugcatcher. Furthermore, potential water breakthrough to the condensate stabiliser train brings challenges to the design of the stabiliser column. These trends require the onshore gas-reception facilities to do more, and in more difficult circumstances. This in turn is leading to more complex gas-receiving facilities and an increase in the unit's cost. As the experience with onshore gas-receiving facilities without an offshore platform is still building up, it is essential that sufficient time and effort is spent in the design phase to identify the possible risks and challenges.

  • Research Article
  • Cite Count Icon 24
  • 10.1117/1.jrs.12.042804
Segmentation model based on convolutional neural networks for extracting vegetation from Gaofen-2 images
  • Aug 2, 2018
  • Journal of Applied Remote Sensing
  • Chengming Zhang + 2 more

Convolutional neural network (CNN) models achieve state-of-the-art performance for natural image semantic segmentation. An approach for extracting vegetation from Gaofen-2 (GF-2) remote sensing imagery based on the CNN model is presented. We constructed a convolutional encoder neural networks (CENN) consisting of two layers. The first layer has two sets of convolutional kernels for extracting the features of farmland and woodland, respectively. The second layer consists of two encoders that use nonlinear functions to encode the learned features and map the encoding results to the corresponding category number. In the training stage, samples of farmland, woodland, and other lands are categorically used to train the CENN. After training is accomplished, the CENN would acquire enough ability to accurately extract farmland and woodland from GF-2 imagery. The CENN was trained on 36 GF-2 images and tested on three other GF-2 images. We compared the proposed method to a deep belief network, a fully convolutional network, and a DeepLab model using the same images. The experiments demonstrate that the proposed approach improves upon the accuracy of existing approaches. The average precision, recall, and kappa coefficient of the proposed approach were 0.91, 0.87, and 0.86, respectively. Thus, the proposed approach is proven to effectively extract vegetation from GF-2 imagery.
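The kappa coefficient reported alongside precision and recall measures agreement beyond chance. The sketch below assumes the standard Cohen's kappa computed from a confusion matrix; the three-class matrix (standing in for farmland, woodland, and other land) is invented for illustration and is not data from the paper.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row marginal * column marginal).
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for three classes; not taken from the paper.
toy_confusion = [
    [40, 5, 5],
    [4, 42, 4],
    [6, 3, 41],
]

kappa = cohens_kappa(toy_confusion)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.73"
```

In this toy case the observed agreement is 0.82 and the chance agreement is 1/3, so kappa is (0.82 - 1/3) / (1 - 1/3) = 0.73.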

  • Research Article
  • Cite Count Icon 1
  • 10.1155/2021/3801675
Classification of Electrocardiogram of Congenital Heart Disease Patients by Neural Network Algorithms
  • Aug 31, 2021
  • Scientific Programming
  • Yongjie Yuan + 3 more

The study intended to explore the effect of different neural network algorithms in the electrocardiogram (ECG) classification of patients with congenital heart disease (CHD). Based on the single convolutional neural network (CNN) ECG algorithm and the recurrent neural network (RNN) ECG algorithm, a multimodal neural network (MNN) ECG algorithm was constructed utilizing the MIT-BIH database as training set and test set. Furthermore, the MNN ECG algorithm was optimized to establish an improved MNN (IMNN) algorithm, which was applied to the diagnosis of CHD patients. The CHD patients admitted between August 2016 and August 2019 were selected for analysis to compare the classification effect and accuracy rate of IMNN, MNN, CNN ECG, and RNN ECG algorithms. It was found that the RNN ECG algorithm had higher classification sensitivity and true positive rate in terms of normal or bundle (NB) branch block beat, supraventricular abnormal (SA) rhythm, abnormal ventricular (AV) beat, and fusion beat (FB) than the CNN ECG algorithm ( P < 0.05 ), and the classification sensitivity and true positive rate of IMNN algorithm in the four aspects were significantly higher than those of MNN algorithm ( P < 0.05 ). The classification accuracy of CNN ECG algorithm and RNN ECG algorithm was above 98%, while that of MNN algorithm and IMNN algorithm was better than that of CNN ECG algorithm and RNN ECG algorithm, and the accuracy rate can reach 98.5% or more. Moreover, the accuracy rate of the IMNN algorithm can reach more than 98%. In conclusion, IMNN not only has a good classification ability in the simulated environment but also performs well in the actual environment, which is worthy of clinical promotion.

  • Conference Article
  • 10.2118/161194-ms
100MBD Interface and Shutdown Management
  • Nov 11, 2012
  • Hussain Ahmed Binthabet + 2 more

ADMA-OPCO's approach: 100MBD Interface and Shutdown Management. In 2006, ADMA-OPCO embarked on a programme to increase the oil production from one of its offshore fields by 100 MBD. This involved a portfolio of 8 projects on both offshore and onshore facilities and a drilling programme covering 40 wells. In addition to oil, ADMA-OPCO also manages the offshore gas network, and all offshore gas, irrespective of source, flows through its facilities. Therefore, any shutdown of gas facilities could impact gas and oil production from other offshore companies operating in the waters of the UAE, which has the potential to impact a significant portion of the offshore oil production in the UAE. This paper will outline how ADMA-OPCO managed the interfaces between projects, existing facilities and sister companies both to maximise the availability of oil and gas production facilities and to minimise hydrocarbon lost opportunities during shutdowns and interventions whilst delivering the 100MBD Project portfolio. The main areas of discussion will be: Organisational structure: ADMA-OPCO set up a new organisational division, which included new focused teams dedicated to management of projects, minimising losses, and programme and interface management. Case study: the paper will detail how a specific event was managed involving several projects interfacing with offshore and onshore facilities with the potential to impact oil production, and how total shutdowns were eliminated and oil losses were reduced by over 75%. The success of this initiative resulted from focusing on: identifying and quantifying potential shutdown reduction opportunities; identifying opportunities to minimise plant preparation and return-to-service times; optimising plant configuration to maximise production opportunities during intervention; and identifying any works or modifications required to accommodate identified production opportunities. Application of methodology: This methodology is applicable to any Brownfield intervention covering one or more work fronts or projects, and has a demonstrated successful outcome.

  • Research Article
  • Cite Count Icon 3
  • 10.17485/ijst/v17i45.2728
Leveraging Machine and Deep Learning Models for Load Balancing Strategies in Cloud Computing
  • Dec 14, 2024
  • Indian Journal Of Science And Technology
  • C Thilagavathy

Objectives: To evaluate the efficiency of task prediction and resource allocation for load balancing (LB) in the cloud environment using a combined approach: Random Forest (RF) for task prediction, and Particle Swarm Optimization with Convolutional Neural Networks (PSO-CNN) for optimization and for resource prediction and allocation. Methods: The ensemble approach in the present study uses Random Forest (RF), a machine learning (ML) model, for task prediction, and PSO-CNN, which combines a bio-inspired algorithm (PSO) with a Deep Learning (DL) model (CNN), for optimization and resource allocation. The study employs PSO techniques to optimize the CNN in order to address the investigation of algorithmic optimization in DL. The results show that the suggested model outperforms other models such as CNN-LSTM (Long Short-Term Memory), CNN-GRU (Gated Recurrent Unit), and PSO-SVM (Support Vector Machine) in improving the performance and efficacy of cloud systems. The experiment is implemented in Python and assessed using the publicly accessible Google Cluster dataset. Findings: ML and DL techniques are found to be more efficient in cloud infrastructure than conventional methods. The study examines the performance of the RF, PSO, and CNN models and of the hybrid RF-PSO-CNN model. Accuracy, precision, and F1-score metrics were used to assess the performance of the classification models. The recommended RF-PSO-CNN model, with an accuracy of 90%, outperforms the contrasted methods CNN-LSTM, CNN-GRU, and PSO-SVM. As a result, both the classification assessment metrics and resource consumption show that the proposed model performs effectively. Novelty: The novel ensemble approach suggests the combined RF-PSO-CNN for LB in cloud computing. The task predicted by RF is assigned to the resource chosen by PSO and CNN, thereby improving the efficiency of task prediction and resource allocation.
Most existing research uses two ML or DL methods either to predict the tasks to be scheduled or to decide which resource to allocate. This study uses a combination of an ML method (RF), a bio-inspired algorithm (PSO), and a DL model (CNN) for both task and resource prediction concurrently, and it examines the effectiveness of LB in the cloud context. Keywords: Load Balancing (LB), Task scheduling, Resource allocation, Random Forest (RF), Convolutional Neural Networks (CNN), Particle Swarm Optimization (PSO)

  • Conference Article
  • Cite Count Icon 1
  • 10.12783/asc36/35816
DEEP LEARNING FRAMEWORK FOR WOVEN COMPOSITE ANALYSIS
  • Sep 20, 2021
  • Haotian Feng + 2 more

In this paper, we focus on exploring the relationship between weave patterns and their mechanical properties in woven fiber composites through Machine Learning. Specifically, we explore the interactions between woven architectures and in-plane stiffness properties through Deep Convolutional Neural Network (DCNN) and Generative Adversarial Network (GAN). Our research is important for exploring how a woven composite’s pattern is related to its mechanical properties and for accelerating woven composite design and optimization. We focus on two tasks: (1) Stiffness prediction: Predicting in-plane stiffness properties for given weave patterns. Our DCNN extracts high-level features through several convolutional and fully connected layers to determine the final predictions. (2) Weave pattern prediction: Predicting weave patterns for target stiffness properties, which can be treated as the reverse task of the first one. Due to many-to-one mapping between weave patterns and the composite properties, we utilize a Decoder Neural Network as our baseline model and compare its performance with GAN and Genetic Algorithm. We represent the weave patterns as 2D checkerboard models and use finite element analysis (FEA) to determine in-plane stiffness properties, which serve as input data for our ML framework. We show that: (1) for stiffness prediction, DCNN can predict stiffness values for a given weave pattern with relatively high accuracy (above 93%); (2) for weave pattern prediction, the GAN model gives the best prediction accuracy (above 92%) while Decoder Neural Network has the best time efficiency.

  • Research Article
  • Cite Count Icon 13
  • 10.1016/j.neucom.2020.04.003
A convolutional fuzzy min-max neural network
  • May 17, 2020
  • Neurocomputing
  • Trupti R Chavan + 1 more


  • Research Article
  • 10.1002/prep.70058
Convolutional and Graph Neural Network Framework for Predicting Critical Impact Velocity in Heterogeneous PBX‐9501
  • Oct 25, 2025
  • Propellants, Explosives, Pyrotechnics
  • Roberto Perera + 4 more

Heterogeneous energetic materials (HEM) can involve structural defects such as randomly distributed pores of varying size and shape. The unique arrangements of these defects cause initiation metrics such as pressure, temperature, and particle velocity to vary on a sample‐to‐sample basis. Current methods for predicting initiation rely on experiments and computational models. However, accounting for each possible pore configuration requires an extensive number of experiments and computational simulations, making them unfeasible for this problem. Machine learning (ML) offers an attractive approach to overcome these challenges. Towards this goal, this work introduces an ML framework involving a convolutional neural network (CNN) and a graph neural network (GNN) for predicting critical velocities in PBX‐9501 samples with multiple pores of varying quantity, size, and spatial distribution. The performance of both models was evaluated across two types of pore arrangements: Cartesian grids and rotated configurations. The comparative evaluation showed that the GNN outperformed the CNN in Cartesian grid pore configurations, achieving a lower average error of compared to the CNN of . Conversely, for rotated pore arrangements, the CNN achieved better accuracy of than the GNN of . Despite these differences, both models consistently achieved average prediction errors below , demonstrating strong overall performance across different pore configurations. Ultimately, this work advances the development of ML‐driven models capable of rapidly and accurately predicting how complex pore structures influence shock sensitivity in HEM(s).

  • Conference Article
  • Cite Count Icon 152
  • 10.1190/segam2018-2997085.1
Seismic facies classification using different deep convolutional neural networks
  • Aug 27, 2018
  • Tao Zhao

Convolutional neural networks (CNNs) are a type of supervised learning technique that can be directly applied to amplitude data for seismic data classification. The high flexibility in CNN architecture enables researchers to design different models for specific problems. In this study, I introduce an encoder-decoder CNN model for seismic facies classification, which classifies all samples in a seismic line simultaneously and provides superior seismic facies quality compared to the traditional patch-based CNN methods. I compare the encoder-decoder model with a traditional patch-based model to draw conclusions about the usability of both CNN architectures. Presentation Date: Wednesday, October 17, 2018 Start Time: 8:30:00 AM Location: 204B (Anaheim Convention Center) Presentation Type: Oral

  • Preprint Article
  • 10.5194/egusphere-egu25-8547
Machine Learning techniques for the detection of geomorphological features in nearshore environments
  • Mar 18, 2025
  • Angelo Sozio + 6 more

Marine geophysical surveys provide crucial data and information for monitoring purposes and engineering application support on coastal and marine environments. Habitats associated with these specific natural contexts represent highly sensitive ecosystems that have been constantly threatened by human activities over the past few decades. Indeed, as stated by the European Commission, 79% of the European coastal seabed is disturbed due to bottom trawling. Moreover, due to the ever-increasing demand for food and resources from the sea, issues such as pollution, biodiversity loss, seabed damage, the spread of non-indigenous species, and similar phenomena are ever more serious. For this reason, the Marine Strategy Framework Directive (MSFD) was defined in 2008 by the European Commission to protect and keep safe its coasts, seas, and the ocean, ensuring their sustainable use. To this aim, marine geophysical techniques provide valuable tools for the assessment of biocenosis health status and distribution on a large scale. On the other hand, engineering and industrial applications, such as offshore renewable energy production, onshore facilities, pipe installations or harbour maintenance, also require high-resolution bathymetrical and sea-floor data for safe and sustainable operations, only obtainable with geophysical surveys. Concerning the nearshore environment investigation, standard marine survey techniques used so far consist of methodologies exploiting the propagation of acoustic waves in the water column, i.e., Side Scan Sonar (SSS), Single and Multi-beam Echo Sounder (SBES/MBES) and Sub-bottom Profiler (SBP). Moreover, camera acquisitions and sub-marine stereo-photogrammetry are increasingly used for the analysis of seafloor morphology, although limited to optimal water conditions.
Recently, thanks to improvements in AI techniques, Machine Learning (ML) techniques, coupled with GIS software, represent valuable tools for interpreting and mapping submerged morphological features on geophysical data using a multidisciplinary approach. In this context, our research proposes a Computer Vision implementation using Convolutional Neural Networks (CNNs) for the detection and classification of marine morphological features in nearshore sectors of the Italian coastal environment. Two different CNN algorithms were used for the automatic segmentation and classification, considering most of the marine morphological features of the study area recognizable on SSS orthomosaics. The latter were acquired in two coastal sites of the Apulia Region (Southern Italy): Torre Guaceto Beach (Brindisi), on the Adriatic coast, and Leporano beach (Taranto) on the Ionian seaside. The first CNN algorithm is U-Net while the second one is a Mask-RCNN-based algorithm, already used in previous works to detect Beach Litter items on the emerged section of a beach. The training datasets were suitably processed to make them available for both algorithms, which process data in a slightly different way. Moreover, the training dataset based on the nearshore environment of the Apulian coastal sector will make it possible to map seabeds with similar morphological characteristics. This multidisciplinary approach represents an early stage of a first and promising integration tool to the classical manual image screening of marine seafloor morphology on a large homogeneous seabed, characterizing most of the Mediterranean coasts. Further development will concern additional geophysical surveys that will increase the dataset for a higher detection accuracy.

  • Research Article
  • 10.3897/aca.8.e151406
The Use of Pretrained Convolutional Neural Networks in Recognizing Phytoplankton Species. Cases from a marine, a brakishwater and a freshwater site.
  • May 28, 2025
  • ARPHA Conference Abstracts
  • Mauro Bastianini + 11 more

Introduction Phytoplankton are microscopic organisms that form the foundation of aquatic food webs. Accurate identification and classification of phytoplankton species are crucial for monitoring all aquatic ecosystems, from marine to freshwater, understanding ecological dynamics, and assessing environmental changes. Traditional methods of phytoplankton identification, which rely on manual microscopy, are time-consuming and require expert knowledge. Recent advancements in machine learning, particularly Convolutional Neural Networks (CNNs), offer promising solutions for automating this process. This abstract explores the application of pre-trained CNNs in recognizing phytoplankton species, highlighting their advantages, methodologies, and potential impacts. Methodology We present three approaches from a marine site, the Gulf of Venice site of the LTER-Italy network (DEIMS.ID https://deims.org/758087d7-231f-4f07-bd7e-6922e0c283fd), which includes the 'Acqua Alta' Oceanographic Tower (AAOT) (Fig. 1), the brackishwater site Utö Atmospheric and Marine Research Station (ResNet-18, located at 59°46.84’ N, 21°22.13’ E) https://en.ilmatieteenlaitos.fi/uto, and the freshwater site the IGB-LakeLab in Lake Stechlin NE Germany (DEIMS.ID https://deims.org/2223bc9c-12b2-49fe-af73-4299f553e054). Three different architectures of CNN were used: VGG16 for the Gulf of Venice, ResNet-18 for the Finnish station and a YOLOv11-cls for the German Lake Stechlin LakeLab station. These CNN models were pre-trained on the ImageNet dataset and subsequently fine-tuned with specific datasets for the respective geographic areas. These CNNs were chosen for their ability to autonomously extract features from images without external assistance, making them efficient, fast tools for analyzing large amounts of data and due to their specificity regarding the characteristics of the observational site. 
The process involves several steps:
  • Data Collection and Preprocessing: several public datasets are available (Ciranni et al. 2024), in which each image is annotated according to its class. Each model requires input images in a specific format, so the images must be preprocessed according to the chosen model. The Imaging FlowCytobot (IFCB), an in-situ automated submersible imaging flow cytometer that images particles in flow taken from the aquatic environment, produces images of good quality (Fig. 2), so the main modification applied is resizing the images to fit the model requirements.
  • Transfer Learning: transfer learning allows the weights of a pre-trained neural network to be retained and, only if specified, updated for a specific task. Using pre-trained models has been shown to give strong results while reducing both training time and the amount of data required compared with a model trained from scratch (Maracani et al. 2023).
  • Training and Validation: the modified CNN is trained on the annotated phytoplankton images. Techniques such as data augmentation (to increase the number of images), dropout, and batch normalization are employed to enhance model performance and prevent overfitting. The model's accuracy is validated using a separate dataset.
  • Evaluation Metrics: performance metrics, including accuracy, precision, recall, and F1-score, are used to evaluate the model. Confusion matrices and receiver operating characteristic (ROC) curves provide additional insights into the model's classification capabilities.

Results
Studies have demonstrated that pre-trained CNNs can achieve high accuracy in phytoplankton classification. In our case, models such as ResNet and VGG achieved classification accuracies exceeding 80% on diverse phytoplankton datasets (Fig. 3, Kraft et al. 2022). These models effectively distinguish between species with subtle morphological differences, which are often challenging for human experts.

Discussion
The use of pre-trained CNNs in phytoplankton recognition offers several advantages:
  • Efficiency: automated classification significantly reduces the time and effort required for phytoplankton identification compared to manual methods.
  • Scalability: CNNs can handle large volumes of image data, making them suitable for Long Term Ecological Research.
  • Consistency: machine learning models provide consistent and objective classifications, minimizing human error and variability.
However, challenges remain. The level of automatic taxonomic identification is still not as detailed as that achieved by human experts. The quality and diversity of training data are critical for model performance, and inadequate or biased datasets can lead to poor generalization. Additionally, the interpretability of CNNs is limited, making it difficult to understand the decision-making process fully.

Conclusion
Pre-trained CNNs represent a powerful tool and pipeline for phytoplankton species recognition, offering significant improvements in efficiency, scalability, and consistency over traditional methods. Continued advancements in machine learning and the availability of high-quality datasets will further enhance the capabilities of these models. Future research should focus on addressing current limitations, such as data quality and model interpretability, to fully realize the potential of CNNs in marine science. In this work, we will present these results to demonstrate possible workflows for applying CNNs in marine science and potentially to contribute to the Standard Observations (SOs), addressing current limitations.
We will also present a workflow proposal for the harmonization, interoperability, quality control, and sharing of the data obtained through CNN recognition, following the directives proposed by Torstensson (2025).
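
The evaluation metrics named in the methodology (accuracy, precision, recall, F1-score, and the confusion matrix) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the class names and the true/predicted labels below are hypothetical, chosen only to show how the quantities relate to each other, and per-class scores are computed one-vs-rest from the confusion matrix.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall, f1)}, treating each class one-vs-rest."""
    n = len(labels)
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(n)) - tp  # predicted as class i, but wrong
        fn = sum(matrix[i]) - tp                       # class i samples that were missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Hypothetical classes and labels, for illustration only.
labels = ["diatom", "dinoflagellate", "cyanobacteria"]
y_true = ["diatom", "diatom", "dinoflagellate", "cyanobacteria", "cyanobacteria", "diatom"]
y_pred = ["diatom", "dinoflagellate", "dinoflagellate", "cyanobacteria", "diatom", "diatom"]

matrix = confusion_matrix(y_true, y_pred, labels)
scores = per_class_metrics(matrix, labels)
print("accuracy:", round(accuracy(matrix), 3))
for label in labels:
    precision, recall, f1 = scores[label]
    print(label, round(precision, 3), round(recall, 3), round(f1, 3))
```

Reporting per-class scores alongside overall accuracy matters when classes are imbalanced, since a rare taxon can be classified poorly without lowering accuracy much.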
