Improving Slow-Moving Object Detection in Complex Environments Using a Feature Pooling Enhanced Encoder-Decoder Model

Abstract

The ability to detect moving objects is of great importance in a wide range of visual surveillance systems, playing a vital role in maintaining security and ensuring effective monitoring. The primary aim of such systems is to detect objects in motion and to handle real-world challenges effectively. Despite the existence of numerous methods, there remains room for improvement, particularly for slowly moving video sequences and unfamiliar video environments. When slow-moving objects are confined to a small area of the frame, many traditional methods fail to detect the entire object; a spatial-temporal framework is an effective solution. Additionally, the choice of temporal, spatial, and fusion algorithms is crucial for effectively detecting slow-moving objects. This article addresses the detection of slowly moving objects in challenging videos by leveraging an encoder-decoder architecture that incorporates a modified VGG-16 model with a feature pooling framework. Several novel aspects characterize the proposed algorithm. It utilizes a pre-trained modified VGG-16 network as the encoder, employing transfer learning to enhance model efficacy. The encoder is designed with a reduced number of layers and incorporates skip connections to extract the fine- and coarse-scale features crucial for local change detection. The feature pooling framework (FPF) combines different layers, including max pooling, convolutional, and several atrous convolutional layers with varying sampling rates. This integration preserves features of various dimensions at different scales, ensuring their representation across a wide range of scales. The decoder network comprises stacked convolutional layers that effectively map features back to image space.
The performance of the developed technique is assessed against various existing methods, including CMRM, the Hybrid algorithm, Fast valley, EPMCB, and MODCVS, and its effectiveness is demonstrated through both subjective and objective analyses. It achieves superior performance, with an average F-measure (AF) of 98.86% and a lower average misclassification error (AMCE) of 0.85. Furthermore, the algorithm's effectiveness is validated on Imperceptible Video Configuration setups, where it again exhibits superior performance.
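As a rough illustration of the atrous (dilated) sampling idea behind the feature pooling framework, the 1-D sketch below shows how the same small kernel covers a wider context as the sampling rate grows. This is not the authors' implementation; the function, kernel, and signal are purely illustrative:

```python
def dilated_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution: kernel taps are spaced `rate`
    samples apart, so larger rates see a wider context with the same
    kernel size (and the same number of weights)."""
    span = (len(kernel) - 1) * rate          # receptive field minus one
    out = []
    for i in range(len(signal) - span):
        out.append(sum(k * signal[i + j * rate] for j, k in enumerate(kernel)))
    return out

signal = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
edge = [1, -1]                               # simple difference kernel
# rate 1 reacts to adjacent-sample changes; rate 3 compares samples 3 apart
print(dilated_conv1d(signal, edge, 1))
print(dilated_conv1d(signal, edge, 3))
```

Stacking several such branches with different rates, as the FPF does with 2-D feature maps, yields responses at multiple effective scales without extra parameters.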

Similar Papers
  • Research Article
  • Cited by: 3
  • 10.1007/s13201-025-02388-3
Tailoring innovative adsorbents from discarded weathered basalt waste by calcination and activated carbon impregnation for efficient Fe (III) and Zn (II) remediation
  • Mar 13, 2025
  • Applied Water Science
  • Ahmed M Zayed + 9 more

This study explores the potential of utilizing weathered basalt waste, discarded from basalt stone quarrying, as a resource for producing efficient adsorbents to remove Fe (III) and Zn (II) from aqueous solutions and real wastewater. Raw weathered basalt (RWB) and its calcined derivatives at 750 °C for 3 h (CWB-750) and at 950 °C for 1 h with activated carbon impregnation (CWB/AC-950) were prepared and characterized. Characterization using XRD, FTIR, SEM, and a surface area analyzer revealed that calcination improved porosity and surface area, with some advantage for CWB/AC-950. CWB/AC-950 showed remarkable removal efficiency for Fe (III) at pH 5, achieving 98.30%, closely matching that of RWB (98.00%) and outperforming CWB-750 (96.20%). In contrast, RWB exhibited the highest removal capacity for Zn (II) at pH 6, with an efficiency of 55%, surpassing both CWB-750 and CWB/AC-950, which achieved approximately 36%. For both contaminants, the pseudo-second-order equation (R2 > 0.98) provided a superior fit, indicating a favorable sorption process for all the studied materials. The Fe (III) sorption data for all the investigated materials were better described by the Freundlich (FL) model than by the Langmuir (LM) model. Similarly, the Zn (II) sorption data for the calcined derivatives (CWB-750 and CWB/AC-950) were well explained by the FL model. These findings are supported by the very high determination coefficients (R2 > 0.96) and significantly lower average relative error (ARE) values (8.66 and 13.69) compared to those obtained from the LM model (55.99 and 189.25, respectively). In contrast, for RWB, despite the very high R2 values (> 0.98) for both models, neither adequately captured the Zn (II) sorption behavior, as evidenced by the exceptionally high ARE values (52.67 and 161.19 for LM and FL, respectively).
The remediation mechanism of both Fe (III) and Zn (II) by all adsorbents was not exclusively governed by inter-particle diffusion. Eventually, these findings highlight the sustainable potential of repurposing RWB waste and its calcined derivatives for water remediation applications.

  • Research Article
  • Cited by: 6
  • 10.1155/2016/2041467
Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography
  • Jan 1, 2016
  • Computational and Mathematical Methods in Medicine
  • İlhan Umut + 1 more

The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many connected cables; it also increases the risk of problems during the recording process and enlarges the storage volume. This study aims to detect periodic leg movement (PLM) in sleep using channels other than leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with a PLM disorder diagnosis were examined retrospectively. Novel software was developed for the analysis of the PSG records, utilizing machine learning algorithms, statistical methods, and DSP methods. To classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of the classification results showed that the K-nearest neighbour algorithm had the highest average classification rate (91.87%) and the lowest average classification error (RMSE = 0.2850), while the multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error (RMSE = 0.3705). The results show that PLM can be classified with high accuracy (91.87%) without a leg EMG record being present.
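For readers unfamiliar with the classifiers compared here, a toy K-nearest-neighbour sketch follows. The 2-D features and labels are invented for illustration; the study's actual features come from the non-EMG PSG channels:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote of the k nearest training points
    under Euclidean distance. `train` is a list of (feature_vector, label)."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# toy 2-D features (hypothetical, not the paper's PSG-derived features)
train = [((0.1, 0.2), "PLM"), ((0.2, 0.1), "PLM"),
         ((0.9, 0.8), "no-PLM"), ((0.8, 0.9), "no-PLM")]
print(knn_predict(train, (0.15, 0.15)))
```

The same majority-vote scheme scales directly to the higher-dimensional feature vectors extracted from PSG signals.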

  • Research Article
  • Cited by: 9
  • 10.1007/s004490050652
Modeling of phenol degradation system using artificial neural networks
  • Jan 1, 1999
  • Bioprocess Engineering
  • S M Balan + 4 more

Pseudomonas pictorum (NICM-2077), an effective strain used in the biodegradation of phenol, was grown on various nutrient compounds that protect the microbes when confronting shock loads of concentrated toxic pollutants during wastewater treatment. In the present study, the effect of glucose, yeast extract, (NH4)2SO4, and NaCl on phenol degradation was investigated, and an Artificial Neural Network (ANN) model was developed to predict degradation. The learning, recall, and generalization characteristics of neural networks were also studied using the phenol degradation data. The network model was then compared with a Multiple Regression Analysis (MRA) model derived from the same training data. Further, these two models were used to predict the percentage degradation of phenol for blind test data. Though both models perform well, the ANN is found to be better than MRA owing to its slightly higher coefficient of correlation, lower RMS error, and lower average absolute error during prediction.

  • Research Article
  • Cited by: 3
  • 10.47839/ijc.22.1.2878
Simulated Annealing – 2 Opt Algorithm for Solving Traveling Salesman Problem
  • Mar 29, 2023
  • International Journal of Computing
  • P H Gunawan + 1 more

The purpose of this article is to evaluate the performance of a hybrid of the Simulated Annealing (SA) and 2-opt algorithms for solving the traveling salesman problem (TSP). The SA algorithm used in this article is based on the outer- and inner-loop SA algorithm. The hybrid algorithm shows promising results on small- and medium-scale symmetric TSP benchmark tests taken from the TSPLIB reference. The optimal solutions and standard deviations indicate that the hybrid algorithm is reliable and stable in finding the optimal solution for the TSP benchmark cases. The average error and standard deviation over all medium-scale simulations are 0.0267 and 644.12, respectively. Moreover, in some cases, namely KroB100, Pr107, and Pr144, the hybrid algorithm finds a better solution than the best-known solution mentioned in the reference. Further, the hybrid algorithm is 1.207 to 5.692 times faster than the pure outer- and inner-loop-based SA algorithm. Additionally, the results show that the hybrid algorithm outperforms other hybrid algorithms such as SA with nearest neighbor (NN) and NN with 2-opt.
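The 2-opt half of the hybrid can be sketched compactly. The simulated-annealing outer loop is omitted here, and the four-city instance is illustrative, not a TSPLIB benchmark:

```python
def tour_length(tour, dist):
    """Total length of a closed tour over the distance matrix `dist`."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt(tour, dist):
    """2-opt local search: reverse the segment between two positions whenever
    doing so shortens the tour; stop when no improving reversal remains."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 2, len(tour) + 1):
                cand = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(cand, dist) < tour_length(tour, dist) - 1e-12:
                    tour, improved = cand, True
    return tour

# four cities on the corners of a unit square; the optimal tour has length 4
pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
dist = [[((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for bx, by in pts]
        for ax, ay in pts]
best = two_opt([0, 2, 1, 3], dist)        # start from a self-crossing tour
print(best, tour_length(best, dist))
```

In the hybrid scheme, SA proposes tours and escapes local optima, while 2-opt moves like the one above polish each accepted tour.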

  • Research Article
  • 10.1007/s42452-025-07433-z
An enhanced deep learning-based feature extraction framework for moving object detection
  • Jul 14, 2025
  • Discover Applied Sciences
  • Upasana Panigrahi + 4 more

Detecting changes is a crucial step in computer vision-based monitoring systems, whose primary objective is to accurately identify moving objects, ensuring applicability in diverse real-world scenarios. Researchers worldwide have developed various change detection methods; however, most current methods need improvement on challenging datasets. This article introduces an innovative Moving Object Detection Algorithm (MODA) for the benchmark CD-Net 2014, WallFlower, Star, STERE, DUTS, NLPR, NJU2K, and SIP datasets. The designed approach utilizes an encoder-decoder model, where the encoder incorporates a modified ResNet-50 model with a transfer learning strategy that retains subtle details effectively. The designed Multi-Scale Feature Pooling Framework (MSFP) guarantees the preservation of multi-scale and multi-dimensional features across different scales. The decoder architecture consists of stacked transposed convolutional layers tasked with translating features back into image space. To evaluate the efficacy of the designed scheme, analyses were carried out comparing it with forty-two existing methods. The results are validated through both subjective and objective assessments, and the developed model outperforms the forty-two existing techniques on the considered measures. On the slow-moving object dataset, it achieved an average F-measure of 98.59% and an average misclassification error of 0.83. On the CD-Net 2014 dataset, the model achieved an average precision of 0.8886, an average recall of 0.8583, an average F-measure of 0.8500, and an average percentage of wrong classifications of 0.8200. Further, the P-test and average intersection over union are also reported. Similarity metrics are computed for the Star dataset, while the WallFlower dataset is evaluated using the average F-measure. The developed approach also provides better accuracy for unseen video setups.
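The pixel-level metrics quoted in these change-detection abstracts (precision, recall, F-measure, percentage of wrong classifications) all derive from a confusion matrix. A minimal sketch with made-up counts:

```python
def change_detection_scores(tp, fp, fn, tn):
    """Standard change-detection metrics from pixel-level confusion counts:
    tp/fp = foreground pixels correctly/incorrectly detected,
    fn/tn = background pixels incorrectly/correctly detected."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    pwc = 100 * (fp + fn) / (tp + fp + fn + tn)  # % of wrong classifications
    return precision, recall, f_measure, pwc

# illustrative counts, not taken from any of the papers above
p, r, f, pwc = change_detection_scores(tp=950, fp=30, fn=20, tn=9000)
print(f"precision={p:.4f} recall={r:.4f} F={f:.4f} PWC={pwc:.2f}%")
```

Averaging these per-frame or per-sequence scores yields the AF and PWC-style figures the abstracts report.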

  • Research Article
  • 10.19028/jtep.21.3.295-306
Validation of Measured Biodiesel-Diesel Blend Specifications Using a Simple Calculation Method (Validasi Spesifikasi Campuran Biodiesel-Solar Hasil Pengukuran dengan Metoda Perhitungan Sederhana)
  • Sep 1, 2007
  • Jurnal Keteknikan Pertanian
  • Soni S Wirawan + 3 more

Biodiesel is a fuel derived from vegetable oils or animal fats that can be used as an additive to, or an entire replacement for, conventional petroleum diesel fuel. In most cases, biodiesel is mixed with conventional diesel because of the higher cost of biodiesel, product availability, and engine compatibility issues. In Indonesia, decree No. 3675K/24/DJM/2006 regarding the quality and specification of diesel oil types Solar 48 and Solar 51 has been issued; this decree regulates the use of FAME (fatty acid methyl ester) up to a maximum of 10 percent of the volume of the automotive diesel fuel with which it is blended. Measuring fuel properties is expensive and time consuming, so it is important to develop a simple method to predict blend properties. This paper presents the development of a simple calculation method for validating the palm biodiesel-mineral diesel blend specifications (density, viscosity, cetane number, and lubricity) measured in the authors' previous study. The results show that lubricity and viscosity have higher average errors (the difference between the calculated and measured values) of 1.66% and 1.35%, whereas density and cetane number show lower average errors of 0.06 and 0.6%. An average error of less than 2% is still acceptable. Keywords: biodiesel, blend, fuel properties, density, cetane, viscosity, lubricity. Received: 20 August 2007; Accepted: 31 August 2007.
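Simple blending rules of the kind this paper validates can be sketched as follows. The mixing rules and property values below are generic textbook choices (volume-linear for density, Arrhenius-logarithmic for viscosity), not necessarily the authors' exact equations:

```python
import math

def blend_density(x_bio, rho_bio, rho_diesel):
    """Volume-weighted linear mixing rule, a common first approximation."""
    return x_bio * rho_bio + (1 - x_bio) * rho_diesel

def blend_viscosity(x_bio, nu_bio, nu_diesel):
    """Arrhenius-type logarithmic mixing rule, often used for viscosity."""
    return math.exp(x_bio * math.log(nu_bio) + (1 - x_bio) * math.log(nu_diesel))

# B10 blend (10% biodiesel by volume) with illustrative property values
print(blend_density(0.10, 880.0, 840.0))          # kg/m^3
print(round(blend_viscosity(0.10, 4.5, 2.8), 3))  # mm^2/s
```

Validation then amounts to comparing such calculated values against laboratory measurements and checking that the average error stays within an acceptable band.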

  • Research Article
  • Cited by: 22
  • 10.1109/tim.2022.3181898
MAFusion: Multiscale Attention Network for Infrared and Visible Image Fusion
  • Jan 1, 2022
  • IEEE Transactions on Instrumentation and Measurement
  • Xiaoling Li + 3 more

Infrared and visible image fusion aims to generate one image with rich information by integrating thermal regions from the infrared image and texture details from the visible image, which facilitates video surveillance and object detection in complex environments. Although image fusion algorithms have made great progress, artifacts and inconsistencies remain challenging. To alleviate these problems, a multi-scale attention network for infrared and visible image fusion (MAFusion) is proposed. The network consists of an encoder, a fusion strategy, and a decoder. Specifically, the encoder extracts multi-scale features from the source images. An attention-based model is then designed as the fusion strategy to integrate the different features of the infrared and visible images. The attention-based model can highlight the thermal targets in the infrared image and maintain the details in the visible image, avoiding the generation of artifacts. The decoder is based on multi-scale skip connections that incorporate low-level details with high-level semantics at different scales. The vital features of the infrared and visible images are fully preserved by the multi-scale skip connection network, restricting the introduction of inconsistencies. Furthermore, a feature-preserving loss function is developed to train the proposed network. Experimental results demonstrate that the proposed network delivers advantages in effectiveness compared with state-of-the-art fusion methods in qualitative and quantitative assessments. In addition, the fused images generated by MAFusion are applied to crowd counting, effectively improving crowd counting performance in low-illumination conditions.

  • Research Article
  • 10.3389/fpls.2025.1711545
MDE-DETR: multi-domain enhanced feature fusion algorithm for bayberry detection and counting in complex orchards
  • Nov 27, 2025
  • Frontiers in Plant Science
  • Cheng Zhou + 4 more

Introduction: Bayberry detection plays a crucial role in yield prediction. However, bayberry targets in complex orchard environments present significant detection challenges, including small volume, severe occlusion, and dense distribution, making traditional methods inadequate for practical applications.
Methods: This study proposes a Multi-Domain Enhanced DETR (MDE-DETR) detection algorithm based on multi-domain enhanced feature fusion. First, an Enhanced Feature Extraction Network (EFENet) backbone is constructed, which incorporates a Multi-Path Feature Enhancement Module (MFEM) and reparameterized convolution techniques to enhance feature perception capabilities while reducing model parameters. Second, a Multi-Domain Feature Fusion Network (MDFFN) architecture is designed, integrating SPDConv spatial pixel rearrangement, a Cross-Stage Multi-Kernel Block (CMKBlock), and dual-domain attention mechanisms to achieve multi-scale feature fusion and improve small-target detection performance. Third, an Adaptive Deformable Sampling (ADSample) downsampling module is constructed, which dynamically adjusts sampling positions through learnable spatial offset prediction to enhance model robustness for occluded and dense targets.
Results and discussion: Experimental results demonstrate that on a self-constructed bayberry dataset, MDE-DETR achieves improvements of 3.8% and 5.1% in mAP50 and mAP50:95 respectively compared to the RT-DETR baseline model, reaching detection accuracies of 92.9% and 67.9%, while reducing parameters and memory usage by 25.76% and 25.14% respectively. Generalization experiments on the VisDrone2019 (small-target) and TomatoPlantfactoryDataset (dense-occlusion) datasets further validate the algorithm's effectiveness, providing an efficient and lightweight solution for small-target bayberry detection in complex environments.

  • Research Article
  • Cited by: 25
  • 10.1021/acsomega.2c00536
Application of Artificial Intelligence Techniques for the Determination of Groundwater Level Using Spatio-Temporal Parameters.
  • Mar 21, 2022
  • ACS Omega
  • Amirhossein Najafabadipour + 2 more

Increasing the depth of mining places the mine pit below the groundwater level. The entry of groundwater into the mining pit increases costs and reduces efficiency and work safety. Prediction of the groundwater level is therefore a useful tool for managing groundwater resources in a mining area. In this study, multilayer perceptron, cascade forward, radial basis function, and generalized regression neural network models were developed to predict the groundwater level. Moreover, four optimization algorithms, including Bayesian regularization, Levenberg-Marquardt, resilient backpropagation, and scaled conjugate gradient, were used to improve the performance and prediction ability of the multilayer perceptron and cascade forward neural networks. More than 1377 data points were used, comprising 12 spatial parameters divided into two categories, sediments and bedrock (longitude, latitude, hydraulic conductivity of sediments and bedrock, effective porosity of sediments and bedrock, electrical resistivity of sediments and bedrock, depth of sediments, surface level, bedrock level, and fault), together with 6 temporal parameters (day, month, year, drainage, evaporation, and rainfall). In addition, 165 extra validation data points were used to determine the best models and combine them. After identifying the three candidate models with the lowest average absolute relative error (AARE) values, the committee machine intelligence system (CMIS) model was developed. The proposed CMIS model predicts groundwater level data with high accuracy, with an AARE of less than 0.11%. Sensitivity analysis indicates that the electrical resistivity of the sediments had the greatest effect on the groundwater level. Outlier estimation using the Leverage approach suggested that only 2% of the data points could be doubtful.
Eventually, the low-error modeling and estimation of groundwater level fluctuations indicate the high accuracy of machine learning methods, which can be a good alternative to numerical modeling methods such as MODFLOW.

  • Research Article
  • Cited by: 2
  • 10.3390/a7030444
A Novel Contrast Enhancement Technique on Palm Bone Images
  • Sep 5, 2014
  • Algorithms
  • Yung-Tsang Chang + 2 more

Contrast enhancement plays a fundamental role in image processing. Many histogram-based techniques are widely used for contrast enhancement because of their simplicity and effectiveness. However, conventional histogram equalization (HE) methods cause excessive contrast enhancement, which produces unnatural-looking and unsatisfactory results for a variety of low-contrast images. To solve such problems, a novel multi-histogram equalization technique is proposed in this paper to enhance the contrast of palm bone X-ray radiographs. A mean-variance analysis method is employed to partition the histogram of the original grey-scale image into multiple sub-histograms, which are then equalized independently. Experimental results show that the multi-histogram equalization technique achieves a lower average absolute mean brightness error (AMBE) while simultaneously preserving the mean brightness and enhancing the local contrast of the original image.
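A minimal sketch of the partition-then-equalize idea, using a mean-based split into two sub-histograms that are equalized independently. This is a simplification of the paper's mean-variance partition, and the pixel values are invented:

```python
def equalize(values, lo, hi):
    """Histogram-equalize `values` into the output range [lo, hi]."""
    hist = {}
    for v in values:
        hist[v] = hist.get(v, 0) + 1
    cdf, total, mapping = 0, len(values), {}
    for level in sorted(hist):
        cdf += hist[level]
        mapping[level] = lo + round((hi - lo) * cdf / total)
    return [mapping[v] for v in values]

def mean_split_equalize(pixels, max_level=255):
    """Split the histogram at the image mean and equalize each sub-histogram
    independently, which tends to preserve the mean brightness."""
    mean = sum(pixels) / len(pixels)
    low = [p for p in pixels if p <= mean]
    high = [p for p in pixels if p > mean]
    low_eq = equalize(low, 0, int(mean))
    high_eq = equalize(high, int(mean) + 1, max_level)
    lo_it, hi_it = iter(low_eq), iter(high_eq)
    return [next(lo_it) if p <= mean else next(hi_it) for p in pixels]

pixels = [10, 12, 12, 14, 200, 202, 202, 204]
print(mean_split_equalize(pixels))
```

Because each sub-histogram is stretched only within its own side of the mean, dark and bright regions gain contrast without the global brightness shift that plain HE can introduce.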

  • Research Article
  • Cited by: 13
  • 10.3390/su13147879
Unemployment Rates Forecasting with Grey-Based Models in the Post-COVID-19 Period: A Case Study from Vietnam
  • Jul 14, 2021
  • Sustainability
  • Phi-Hung Nguyen + 3 more

The Coronavirus (COVID-19) pandemic has had a significant impact on the social and economic outlook of most countries worldwide. Unemployment has become a vital challenge for policymakers as a result of COVID-19's negative impact. Because of the nonstationary and nonlinear nature of the data, researchers have applied various time series models to forecast the unemployment rate. This study aims to provide a better approach for forecasting unemployment rates across Vietnam under the uncertainty of insufficient knowledge and small data sets. The study proposes the Grey-theory-based GM (1,1) model, the Grey Verhulst Model (GVM), and the Autoregressive Integrated Moving Average (ARIMA) model to predict unemployment rates more precisely. The models are applied to the Vietnamese unemployment rate in six different rural and urban areas with data from 2014-2019. The results indicate that the lower Mean Average Percentage Error (MAPE) values obtained with the GM (1,1) model in all regions, for both rural and urban areas (excluding the Highlands Region in the urban area), are extremely encouraging compared with other traditional methods. The accuracy of the ARIMA and GVM models follows that of the GM (1,1) model. The findings show that the modeling results can assist policymakers in shaping future labor and economic policies. Furthermore, this study contributes to the unemployment literature and provides future research directions on unemployment problems.
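The GM (1,1) grey model mentioned above fits in a few lines. This is the generic textbook formulation applied to a made-up series, not the authors' implementation or Vietnamese data:

```python
import math

def gm11_forecast(x, steps=1):
    """Grey GM(1,1): fit the grey differential model x(k) + a*z(k) = b to a
    short positive series and extrapolate `steps` values ahead."""
    n = len(x)
    ago = [sum(x[:i + 1]) for i in range(n)]                  # accumulated (AGO) series
    z = [0.5 * (ago[i] + ago[i + 1]) for i in range(n - 1)]   # background values
    # solve the 2x2 least-squares normal equations for a and b
    m = n - 1
    sz, szz = sum(z), sum(v * v for v in z)
    sy = sum(x[1:])
    szy = sum(v * y for v, y in zip(z, x[1:]))
    det = szz * m - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    x1_hat = lambda k: (x[0] - b / a) * math.exp(-a * k) + b / a  # fitted AGO value
    return [x1_hat(n + i) - x1_hat(n + i - 1) for i in range(steps)]

# a hypothetical, gently rising unemployment-rate series (percent)
out = gm11_forecast([2.1, 2.2, 2.3, 2.4], steps=2)
print([round(v, 3) for v in out])
```

The accumulated-generation step is what lets the model work with the very small samples the study targets, at the cost of assuming near-exponential trend behaviour.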

  • Research Article
  • Cited by: 8
  • 10.1016/j.dmpk.2021.100408
Utility of Göttingen minipigs for the prediction of human pharmacokinetic profiles after intravenous drug administration
  • Jun 9, 2021
  • Drug Metabolism and Pharmacokinetics
  • Ning Ding + 5 more


  • Research Article
  • Cited by: 36
  • 10.1016/j.infsof.2022.106847
Predicting the precise number of software defects: Are we there yet?
  • Jun 1, 2022
  • Information and Software Technology
  • Xiao Yu + 5 more


  • Research Article
  • Cited by: 63
  • 10.1016/j.msea.2015.09.055
Constitutive analysis of hot deformation behavior of a Ti6Al4V alloy using physical based model
  • Sep 18, 2015
  • Materials Science and Engineering: A
  • Paul M Souza + 4 more


  • Research Article
  • Cited by: 5
  • 10.1088/2053-1591/ad425d
High-temperature mechanical properties of additively manufactured 420 stainless steel
  • May 1, 2024
  • Materials Research Express
  • Harveen Bongao + 6 more

Martensitic stainless steels are indispensable alloys in various high-stress, high-temperature applications such as plastic injection molds and components in steam generators. The subtractive manufacturing methods used to fabricate these parts, however, limit their functionality and performance because of design constraints on cooling channels. This limitation can be resolved by means of additive manufacturing, provided that acceptable high-temperature properties can be achieved. In this work, the mechanical behavior of additively manufactured 420 stainless steel (AM420SS) is explored through material constitutive modeling to determine the mathematical model that best describes its flow stress in extreme conditions. Samples were subjected to hot compression at strain rates of 0.1-1.0 s−1 and temperatures between 973 and 1423 K (700 °C-1150 °C) via Gleeble thermomechanical testing. The experimental data were used to generate predictive flow stress curves for constitutive models including the Johnson-Cook, Zerilli-Armstrong, Zener-Hollomon, and Hensel-Spittel equations. Results show that the Zener-Hollomon and Hensel-Spittel models are the most accurate constitutive equations, with relatively high R values of 0.986 and 0.976 and low average absolute relative errors of 6.96% and 7.69%, respectively. The material constants derived from these models can be applied in finite element analysis simulations to assess the performance of AM420SS parts at high temperatures and strains.
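For reference, the Zener-Hollomon model named above is built on the temperature-compensated strain rate in its standard textbook form (the constants below are generic symbols, not the values fitted in this work):

```latex
% Zener-Hollomon parameter: temperature-compensated strain rate
Z = \dot{\varepsilon}\,\exp\!\left(\frac{Q}{RT}\right)
% typically combined with the hyperbolic-sine flow-stress law
\dot{\varepsilon} = A\,\left[\sinh(\alpha\sigma)\right]^{n}\exp\!\left(-\frac{Q}{RT}\right)
```

Here $\dot{\varepsilon}$ is the strain rate, $Q$ the deformation activation energy, $R$ the gas constant, $T$ the absolute temperature, $\sigma$ the flow stress, and $A$, $\alpha$, $n$ material constants fitted from hot-compression data.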
