DeFFusion: CNN-based Continuous Authentication Using Deep Feature Fusion
Smartphones have become indispensable in our daily lives, but their security and privacy issues remain major concerns for users. In this article, we present DeFFusion, a CNN-based continuous authentication system using Deep Feature Fusion for smartphone users, which leverages the accelerometer and gyroscope ubiquitously built into smartphones. With the collected data, DeFFusion first converts the time-domain data into frequency-domain data using the fast Fourier transform and then feeds both into a purpose-designed CNN. With the CNN-extracted features, DeFFusion conducts feature selection using factor analysis and exploits balanced feature concatenation to fuse these deep features. Based on a one-class SVM classifier, DeFFusion authenticates the current user as a legitimate user or an impostor. We evaluate the authentication performance of DeFFusion in terms of the impact of training data size and time window size, accuracy of different features over different classifiers and of different classifiers on the same CNN-extracted features, accuracy on unseen users, time efficiency, and comparison with representative authentication methods. The experimental results demonstrate that DeFFusion achieves the best accuracy, with a mean equal error rate of 1.00% in a 5-second time window.
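As a rough illustration of the pipeline just described, the sketch below converts sensor windows to the frequency domain with the FFT, fuses time- and frequency-domain features by a balanced (unit-norm) concatenation, and enrolls a one-class SVM. The CNN branches are stubbed out, and the window shape, the unit-norm reading of "balanced", and all hyperparameters are assumptions, not the paper's.

```python
# Hypothetical sketch of a DeFFusion-style pipeline: FFT conversion,
# balanced fusion of time/frequency deep features, one-class SVM enrollment.
import numpy as np
from sklearn.svm import OneClassSVM

def to_frequency_domain(window: np.ndarray) -> np.ndarray:
    """Convert a (samples, channels) sensor window to magnitude spectra."""
    return np.abs(np.fft.rfft(window, axis=0))

def balanced_concat(time_feats: np.ndarray, freq_feats: np.ndarray) -> np.ndarray:
    """Scale both vectors to unit norm before concatenation so neither
    domain dominates the fused representation (an assumed reading of
    'balanced feature concatenation')."""
    t = time_feats / (np.linalg.norm(time_feats) + 1e-12)
    f = freq_feats / (np.linalg.norm(freq_feats) + 1e-12)
    return np.concatenate([t, f])

# Placeholder feature extractors standing in for the two trained CNN branches.
cnn_time = lambda w: w.mean(axis=0)
cnn_freq = lambda s: s.mean(axis=0)

rng = np.random.default_rng(0)
windows = rng.normal(size=(200, 500, 6))     # 200 five-second windows, 6 axes
fused = np.stack([
    balanced_concat(cnn_time(w), cnn_freq(to_frequency_domain(w)))
    for w in windows
])

clf = OneClassSVM(kernel="rbf", nu=0.1).fit(fused)   # enrollment phase
print(clf.predict(fused[:5]))                        # +1 legitimate, -1 impostor
```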
- Research Article
33
- 10.1109/tmc.2022.3186614
- Oct 1, 2023
- IEEE Transactions on Mobile Computing
Mobile devices are becoming increasingly popular and are playing significant roles in our daily lives. Insufficient security and weak protection mechanisms, however, cause serious privacy leakage from unattended devices. To fully protect mobile device privacy, we propose ADFFDA, a novel mobile continuous authentication system that uses an Adaptive Deep Feature Fusion scheme for effective feature representation and a transformer-based GAN for Data Augmentation, leveraging the smartphone's built-in accelerometer, gyroscope, and magnetometer. Given the normalized sensor data, ADFFDA utilizes the transformer-based GAN, consisting of a transformer-based generator and a CNN-based discriminator, to augment the training data for CNN training. With the augmented data and a specially designed CNN based on the ghost module and ghost bottleneck, ADFFDA extracts deep features from the three sensors using the trained CNN and exploits an adaptive-weighted concatenation method to adaptively fuse the CNN-extracted features. Based on the fused features, ADFFDA authenticates users with the one-class SVM (OC-SVM) classifier. We evaluate the authentication performance of ADFFDA in terms of the efficiency of the transformer-based GAN, GAN-based data augmentation, the CNN architecture, adaptive-weighted feature fusion, and the OC-SVM classifier. The experimental results show that ADFFDA obtains the best authentication performance relative to representative approaches, achieving a mean equal error rate of 0.01%.
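A minimal PyTorch sketch of adaptive-weighted concatenation as described above: one learnable weight per sensor, normalized by softmax, scales that sensor's CNN features before concatenation. The feature dimensions and the softmax parameterization are illustrative assumptions, not the paper's exact design.

```python
# Hedged sketch of adaptive-weighted feature concatenation across sensors.
import torch
import torch.nn as nn

class AdaptiveWeightedConcat(nn.Module):
    def __init__(self, num_sensors: int = 3):
        super().__init__()
        # One scalar weight per sensor, learned jointly with the network.
        self.logits = nn.Parameter(torch.zeros(num_sensors))

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        w = torch.softmax(self.logits, dim=0)        # weights sum to 1
        return torch.cat([w[i] * f for i, f in enumerate(feats)], dim=-1)

fuse = AdaptiveWeightedConcat()
acc, gyr, mag = (torch.randn(8, 128) for _ in range(3))  # per-sensor features
fused = fuse([acc, gyr, mag])
print(fused.shape)                                        # (8, 384)
```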
- Research Article
19
- 10.1007/s00500-024-09866-x
- Jul 23, 2024
- Soft Computing
This paper proposes a novel approach for detecting lung sound disorders using deep-learning feature fusion. The lung sound dataset is oversampled and converted into spectrogram images. Deep features are then extracted from CNN architectures pre-trained on large-scale image datasets. These deep features capture rich representations of the spectrogram images derived from the input signals, allowing for a comprehensive analysis of lung disorders. Next, a fusion technique is employed to combine the extracted features from multiple CNN architectures, totaling 8064 features. This fusion process enhances the discriminative power of the features, facilitating more accurate and robust detection of lung disorders. To further improve detection performance, an improved CNN architecture is employed. To evaluate the effectiveness of the proposed approach, experiments are conducted on a large dataset of lung disorder signals. The results demonstrate that deep feature fusion from different CNN architectures, combined with different CNN layers, achieves superior performance in lung disorder detection. Compared to individual CNN architectures, the proposed approach achieves higher accuracy, sensitivity, and specificity, effectively reducing false negatives and false positives. The proposed model achieves 96.03% accuracy, 96.53% sensitivity, 99.424% specificity, 96.52% precision, and a 96.50% F1 score when predicting lung diseases from sound files. This approach has the potential to assist healthcare professionals in the early detection and diagnosis of lung disorders, ultimately leading to improved patient outcomes and enhanced healthcare practices.
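A small sketch (not the authors' code) of the first step above: turning a lung sound recording into a spectrogram image suitable for a pretrained CNN. The sampling rate, window length, and 3-channel replication are illustrative assumptions.

```python
# Convert a 1-D lung sound signal into a normalized spectrogram "image".
import numpy as np
from scipy.signal import spectrogram

fs = 8000                                   # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
audio = rng.normal(size=fs * 5)             # stand-in for a 5-second recording

f, t, Sxx = spectrogram(audio, fs=fs, nperseg=256, noverlap=128)
log_spec = 10 * np.log10(Sxx + 1e-10)       # log scale, as is common

# Normalize to [0, 1] and replicate to 3 channels for an ImageNet CNN.
img = (log_spec - log_spec.min()) / (log_spec.max() - log_spec.min())
img3 = np.repeat(img[None, :, :], 3, axis=0)
print(img3.shape)                           # (3, freq_bins, time_frames)
```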
- Research Article
25
- 10.3390/s22072801
- Apr 6, 2022
- Sensors (Basel, Switzerland)
Cancer is among the deadliest of all diseases and a leading cause of human mortality. Several types of cancer afflict the human body and affect its organs. Among all types of cancer, stomach cancer is a particularly dangerous disease that spreads rapidly and needs to be diagnosed at an early stage. Early diagnosis of stomach cancer is essential to reduce the mortality rate. The manual diagnosis process is time-consuming and requires many tests and the availability of an expert doctor. Therefore, automated techniques are required to diagnose stomach infections from endoscopic images. Many computerized techniques have been introduced in the literature, but due to a few challenges (i.e., high similarity between healthy and infected regions, irrelevant feature extraction, and so on), there is much room to improve accuracy and reduce computational time. In this paper, a deep-learning-based stomach disease classification method employing deep feature extraction, fusion, and optimization using WCE images is proposed. The proposed method comprises several phases: data augmentation to increase the number of dataset images, deep transfer learning for deep feature extraction, feature fusion of the deep extracted features, optimization of the fused feature matrix with a modified dragonfly optimization method, and final classification of the stomach disease. The feature extraction phase employed two pre-trained deep CNN models (Inception v3 and DenseNet-201), performing activation on feature-derivation layers. Later, parallel concatenation was performed on the deep-derived features, which were optimized using the meta-heuristic dragonfly algorithm. The optimized feature matrix was classified by machine-learning algorithms and achieved an accuracy of 99.8% on the combined stomach disease dataset. A comparison conducted with state-of-the-art techniques shows improved accuracy.
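A hedged torchvision sketch of the parallel deep-feature concatenation step: pooled activations from the two pretrained CNNs named above are concatenated per image. The layer choice (replacing the classifier heads with identities) and the input sizing are assumptions; the dragonfly optimization step is omitted.

```python
# Extract and concatenate pooled features from Inception v3 and DenseNet-201.
import torch
import torch.nn as nn
from torchvision import models

inception = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
densenet = models.densenet201(weights=models.DenseNet201_Weights.DEFAULT)
inception.fc = nn.Identity()          # expose 2048-d pooled features
densenet.classifier = nn.Identity()   # expose 1920-d pooled features
inception.eval(); densenet.eval()

with torch.no_grad():
    batch = torch.randn(4, 3, 299, 299)              # WCE images, resized
    feats = torch.cat([inception(batch), densenet(batch)], dim=1)
print(feats.shape)                                    # (4, 3968) fused matrix
```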
- Research Article
31
- 10.1109/jiot.2021.3108822
- Apr 1, 2022
- IEEE Internet of Things Journal
With the widespread usage of mobile devices, authentication mechanisms are urgently needed to identify users for information leakage prevention. In this article, we present CAGANet, a convolutional neural network (CNN)-based continuous authentication system for smartphones that uses a conditional Wasserstein generative adversarial network (CWGAN) for data augmentation and utilizes the smartphone's accelerometer, gyroscope, and magnetometer to sense phone movements incurred by user operation behaviors. Specifically, based on the preprocessed real data, CAGANet employs the CWGAN to generate additional sensor data for augmentation, which is used to train the designed CNN. With the augmented data, CAGANet utilizes the trained CNN to extract deep features and then performs principal component analysis (PCA) to select appropriate representative features for different classifiers. With the CNN-extracted features, CAGANet trains four one-class classifiers, namely the one-class SVM (OC-SVM), local outlier factor (LOF), isolation forest (IF), and elliptic envelope (EE), in the enrollment phase and authenticates the current user as a legitimate user or an impostor based on the trained classifiers in the authentication phase. To evaluate the performance of CAGANet, we conduct extensive experiments in terms of the efficiency of the CWGAN, the effectiveness of CWGAN augmentation and the designed CNN, the accuracy on unseen users, and comparisons with traditional augmentation approaches and with representative authentication methods. The experimental results show that CAGANet with the IF classifier can achieve the lowest equal error rate (EER) of 3.64% on 2-s sampling data.
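A minimal sklearn sketch of the enrollment step described above: PCA on CNN-extracted features, then the four one-class models trained on legitimate-user data only. The feature matrix and all hyperparameters are illustrative assumptions.

```python
# PCA feature selection followed by four one-class classifiers.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM
from sklearn.neighbors import LocalOutlierFactor
from sklearn.ensemble import IsolationForest
from sklearn.covariance import EllipticEnvelope

rng = np.random.default_rng(0)
deep_feats = rng.normal(size=(300, 256))        # stand-in CNN features

X = PCA(n_components=25).fit_transform(deep_feats)
classifiers = {
    "OC-SVM": OneClassSVM(nu=0.1),
    "LOF": LocalOutlierFactor(novelty=True),    # novelty=True enables predict()
    "IF": IsolationForest(random_state=0),
    "EE": EllipticEnvelope(random_state=0),
}
for name, clf in classifiers.items():
    clf.fit(X)                                  # enrollment on legitimate data
    print(name, clf.predict(X[:3]))             # +1 legitimate, -1 impostor
```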
- Research Article
77
- 10.1080/03772063.2022.2028584
- Feb 5, 2022
- IETE Journal of Research
Breast cancer is one of the deadliest cancer types, causing high mortality among women globally. Meanwhile, Deep Learning (DL) has emerged as the most frequently utilized and most rapidly developing branch of classical machine learning. This study examines a modern Computer-Aided Diagnosis (CAD) framework that uses DL to extract and classify features, aiding radiologists in breast cancer diagnosis. This is accomplished through four distinct experiments aimed at identifying the most effective classification method. The first uses pre-trained deep CNNs, such as AlexNet, GoogleNet, ResNet50, and DenseNet121. The second uses deep CNNs to extract features and applies them to a Support Vector Machine algorithm with three different kernels. The next involves the fusion of different deep features to demonstrate the classification improvement achieved by fusing these deep features. The final experiment applies Principal Component Analysis (PCA) to reduce the computational cost and shrink the larger feature vectors created during fusion. These experiments are carried out on two different mammogram datasets, namely MIAS and INbreast. The classification accuracy attained on both datasets through the fusion of deep features (97.93% for MIAS and 96.646% for INbreast) is the highest compared with state-of-the-art frameworks. In contrast, the classification performance did not improve when applying PCA to the combined deep features, but the decrease in execution time provides a reduced computational cost. Abbreviations: CAD: Computer Aided Diagnosis; CNN: Convolution Neural Network; CSI: Classification Success Index; DCNN: Deep Convolution Neural Network; DICOM: Digital Imaging and Communications in Medicine; DL: Deep Learning; FC layer: Fully Connected layer; FFDM: Full-Field Digital Mammograms; FN: False Negative; FP: False Positive; ICSI: Individual Classification Success Index; MIAS: Mammographic Image Analysis Society; ML: Machine Learning; MLO: Medio-Lateral Oblique; PCA: Principal Component Analysis; PGM: Portable Gray Map; PPV: Positive Predictive Value; RBF: Radial Basis Function; SGDM: Stochastic Gradient Descent with Momentum; SVM: Support Vector Machine; TN: True Negative; TP: True Positive; TPR: True Positive Rate; UK: United Kingdom
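An illustrative sketch (not the authors' code) of the second experiment: deep CNN features classified by an SVM under three different kernels. The feature matrix and labels are placeholders.

```python
# Compare SVM kernels on deep features via cross-validation.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 512))             # stand-in deep CNN features
labels = rng.integers(0, 2, size=200)           # benign / malignant

for kernel in ("linear", "rbf", "poly"):
    acc = cross_val_score(SVC(kernel=kernel), feats, labels, cv=5).mean()
    print(f"{kernel}: {acc:.3f}")
```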
- Research Article
2
- 10.1007/s12194-025-00932-z
- Jul 3, 2025
- Radiological physics and technology
This paper introduces a Content-Based Medical Image Retrieval (CBMIR) system for detecting and retrieving lung disease cases to assist doctors and radiologists in clinical decision-making. The system combines texture-based features using Local Binary Patterns (LBP) with deep features extracted from pretrained CNN models, including VGG-16, DenseNet121, and InceptionV3. The objective is to identify the optimal fusion of texture and deep features to enhance image retrieval performance. Various similarity measures, including Euclidean, Manhattan, and cosine similarity, were evaluated, with cosine similarity demonstrating the best performance, achieving an average precision of 65.5%. For COVID-19 cases, VGG-16 achieved a precision of 52.5%, while LBP performed best for the normal class with 85% precision. The fusion of LBP, VGG-16, and DenseNet121 excelled in pneumonia cases, with a precision of 93.5%. Overall, VGG-16 delivered the highest average precision of 74.0% across all classes, followed by LBP at 72.0%. The fusion of texture (LBP) and deep features from all CNN models achieved 86% accuracy for the retrieval of the top 10 images, supporting healthcare professionals in making more informed clinical decisions.
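A small sketch of the retrieval step: fused LBP-plus-deep feature vectors compared by cosine similarity, returning the top 10 most similar images. Feature extraction itself is stubbed out; the database size and dimensionality are assumptions.

```python
# Top-10 image retrieval by cosine similarity over fused feature vectors.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
database = rng.normal(size=(500, 1024))          # fused features, 500 images
query = rng.normal(size=(1, 1024))               # fused features, query image

sims = cosine_similarity(query, database)[0]
top10 = np.argsort(sims)[::-1][:10]              # indices of best matches
print(top10, sims[top10])
```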
- Research Article
14
- 10.14778/3565816.3565829
- Oct 1, 2022
- Proceedings of the VLDB Endowment
Frequency domain analysis is widely conducted on time series. Since transforming from the time domain to the frequency domain online, e.g., by the fast Fourier transform (FFT), is costly, there is strong demand to store frequency domain data for reuse. However, encoding frequency domain data for efficient storage is surprisingly untouched. We notice that (1) the precision of the data values is unnecessarily high after transforming to the frequency domain and (2) the data values have a skewed distribution, leading to a very large bit width for encoding. To avoid such space waste from both excess precision and skewness, we devise a descending bit-packing encoding for frequency domain data. Specifically, we quantize the data values to the proper precision with reference to the signal-to-noise ratio (SNR) used in frequency domain analysis. Moreover, we sort the data values in descending order so that the bit width can be dynamically reduced during encoding. The method has been deployed in Apache IoTDB, an open-source time-series database, not only for directly encoding frequency domain data but also as a lossy compression of time domain data. Extensive experiments on the system demonstrate the superiority of our encoding for both frequency domain and time domain data.
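A toy sketch of the descending bit-packing idea just described: quantize, sort by magnitude so later values need fewer bits, and let the bit width shrink with the running maximum. The real Apache IoTDB encoding must also store the permutation, signs, and width metadata for decoding; this sketch only counts payload bits under an assumed quantization step.

```python
# Compare descending bit-packing payload against a fixed-width baseline.
import numpy as np

def descending_bitpack_cost(values: np.ndarray, quant_step: float) -> int:
    """Total payload bits for quantized |values| packed in descending order."""
    q = np.round(np.abs(values) / quant_step).astype(np.int64)
    q = np.sort(q)[::-1]                     # descending: widths never grow
    bits = 0
    for v in q:
        width = max(int(v).bit_length(), 1)  # width tracks the running max
        bits += width
    return bits

rng = np.random.default_rng(0)
signal = rng.normal(size=1024)
spectrum = np.abs(np.fft.rfft(signal))       # skewed frequency-domain values
print("descending:", descending_bitpack_cost(spectrum, quant_step=0.01))

# Fixed-width baseline: every value uses the width of the global maximum.
qmax = int(np.round(spectrum.max() / 0.01))
print("fixed-width:", qmax.bit_length() * len(spectrum))
```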
- Research Article
86
- 10.1145/3397179
- Jul 21, 2020
- ACM Transactions on Sensor Networks
Continuous authentication monitors the security of a system throughout the login session on mobile devices. In this article, we present SCANet, a two-stream convolutional neural network (CNN)-based continuous authentication system that leverages the accelerometer and gyroscope on smartphones to monitor users' behavioral patterns. We are among the first to use two streams of data, frequency domain data and temporal difference domain data, from the two sensors as the inputs of the CNN. SCANet utilizes the two-stream CNN to learn and extract representative features and then performs principal component analysis to select the top 25 features with high discriminability. With the CNN-extracted features, SCANet exploits the one-class support vector machine to train the classifier in the enrollment phase. Based on the trained CNN and classifier, SCANet identifies the current user as a legitimate user or an impostor in the continuous authentication phase. We evaluate the effectiveness of the two-stream CNN and the performance of SCANet on our dataset and the BrainRun dataset; the experimental results demonstrate that the CNN achieves 90.04% accuracy and that SCANet reaches an average equal error rate of 5.14% across the two datasets while taking approximately 3 s for user authentication.
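A hedged sketch of preparing SCANet's two input streams from one sensor window: magnitude spectra for the frequency domain stream and first-order differences for the temporal difference stream. The window shape is an assumption.

```python
# Build the two CNN input streams from a (samples, axes) sensor window.
import numpy as np

window = np.random.default_rng(0).normal(size=(256, 6))  # acc + gyro axes

freq_stream = np.abs(np.fft.rfft(window, axis=0))        # frequency domain
diff_stream = np.diff(window, axis=0)                    # temporal difference

print(freq_stream.shape, diff_stream.shape)              # (129, 6), (255, 6)
```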
- Research Article
- 10.1049/2024/5683547
- Jan 1, 2024
- IET Biometrics
Contactless palmprint recognition offers a friendly user experience because it can operate without the user touching the recognition device under rigidly constrained conditions. Recent palmprint recognition methods have shown promising accuracy; however, some issues still need further study, such as the limited discrimination of any single feature and how to effectively fuse deep and shallow features. In this paper, deep features and shallow features are integrated into a unified framework using feature-level and score-level fusion methods. Specifically, the deep feature is extracted by a residual neural network (ResNet), and the shallow features are extracted by principal component analysis (PCA), linear discriminant analysis (LDA), and competitive coding (CompCode). In the feature-level fusion stage, the ResNet and PCA features are dimensionally reduced and fused by the canonical correlation analysis technique to obtain the fused feature for the next stage. In the score-level fusion stage, score information is embedded in the fused feature, the LDA feature, and the CompCode feature to obtain more reliable and robust recognition performance. The proposed method achieves competitive performance on the Tongji dataset and demonstrates satisfying generalization capabilities on the IITD and CASIA datasets. Comprehensive validation across three palmprint datasets confirms the effectiveness of the proposed deep and shallow feature fusion approach.
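A minimal sketch of the feature-level fusion stage: deep (ResNet) and shallow (PCA) features projected into a shared space by canonical correlation analysis, then fused. Concatenating the projected pairs is one fusion choice among several; the dimensions are illustrative.

```python
# CCA-based fusion of deep and shallow palmprint features.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
resnet_feats = rng.normal(size=(100, 512))   # deep features per palm image
pca_feats = rng.normal(size=(100, 64))       # shallow (PCA) features

cca = CCA(n_components=32).fit(resnet_feats, pca_feats)
deep_c, shallow_c = cca.transform(resnet_feats, pca_feats)
fused = np.concatenate([deep_c, shallow_c], axis=1)   # one fusion choice
print(fused.shape)                                     # (100, 64)
```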
- Research Article
- 10.1016/j.compbiomed.2025.110993
- Oct 1, 2025
- Computers in biology and medicine
Congenital heart disease (CHD) is the most common type of birth defect, impacting about 1% of live births worldwide. Echocardiography, the gold-standard diagnostic method, is costly and inaccessible in low-resource settings. Diagnosis is delayed by the limited number of skilled experts, whose ability to interpret pathological patterns varies significantly, causing inter- and intra-clinician variability. Therefore, we present a new method for a more accessible diagnostic modality, the digital stethoscope, to detect CHDs. Our method is based on deep feature fusion, integrating deep and handcrafted features for the automated early detection of CHDs. For this work, phonocardiography (PCG) recordings were obtained from 751 pediatric subjects (ages 1 month to 16 years), ranging from infants to adolescents, in Bangladesh at four auscultation locations: the mitral valve (MV), aortic valve (AV), pulmonary valve (PV), and tricuspid valve (TV). These recordings were labeled, based on diagnoses confirmed by cardiologists, as either CHD or non-CHD cases. The results demonstrate that our proposed model achieved an accuracy of 92%, a sensitivity of 91%, and a specificity of 91%, based on a patient-wise split of 70% training, 20% validation, and 10% testing, along with an area under the receiver operating characteristic curve (AUROC) of 96% and an F1-score of 92%. This model promises efficient real-time remote detection of CHDs as a cost-effective screening tool for low-resource settings.
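A sketch of a patient-wise 70/20/10 split like the one reported above, ensuring no subject's recordings appear in more than one partition. The feature matrix, subject IDs, and recordings-per-subject count are placeholders.

```python
# Patient-wise train/validation/test split via grouped shuffling.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(751 * 4, 128))           # 4 auscultation sites/subject
groups = np.repeat(np.arange(751), 4)         # subject ID per recording

# First carve off 10% of subjects for testing.
gss = GroupShuffleSplit(n_splits=1, test_size=0.10, random_state=0)
trainval_idx, test_idx = next(gss.split(X, groups=groups))

# Then split the remainder 70/20 (2/9 of the remaining 90% is validation).
gss2 = GroupShuffleSplit(n_splits=1, test_size=2 / 9, random_state=0)
train_rel, val_rel = next(gss2.split(X[trainval_idx],
                                     groups=groups[trainval_idx]))
train_idx, val_idx = trainval_idx[train_rel], trainval_idx[val_rel]
print(len(train_idx), len(val_idx), len(test_idx))   # ~70% / ~20% / ~10%
```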
- Research Article
1
- 10.1149/1945-7111/ae0983
- Sep 1, 2025
- Journal of The Electrochemical Society
With the rapid growth in the use of lithium-ion batteries (LIBs), particularly in electric vehicles, accurate state of health (SOH) estimation has become essential for ensuring battery safety and reliability. This study proposes a hybrid framework for SOH prediction that integrates advanced signal decomposition with deep temporal feature fusion. The approach combines improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN), grey wolf optimization (GWO)-based variational mode decomposition (VMD), and a Transformer-BiLSTM network enhanced by a cross-attention mechanism. The methodology begins by applying ICEEMDAN to decompose the original battery signals into intrinsic mode functions (IMFs). The IMFs with the highest multi-scale permutation entropy (MPE) are selected for further analysis, reflecting their rich dynamic information. These selected components are then decomposed again using VMD, with parameters tuned via GWO to adaptively capture key signal patterns. The resulting features are fed into a Transformer-BiLSTM network to learn both global and local temporal dependencies. A cross-attention layer is employed to enhance feature integration by weighting informative temporal dimensions. Experimental evaluations on two public datasets (NASA and Oxford) demonstrate that the proposed method significantly outperforms baseline models in SOH prediction accuracy. The results confirm the effectiveness of combining signal decomposition and deep feature fusion for robust battery health estimation.
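A PyTorch sketch of the cross-attention fusion step: BiLSTM outputs (local dependencies) attend to Transformer-encoded features (global dependencies) of the same sequence. Layer sizes are assumptions, and the ICEEMDAN/VMD decomposition front end is omitted.

```python
# Transformer-BiLSTM with cross-attention for sequence-level SOH regression.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, d_model: int = 64):
        super().__init__()
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.bilstm = nn.LSTM(d_model, d_model // 2, batch_first=True,
                              bidirectional=True)       # output dim = d_model
        self.cross = nn.MultiheadAttention(d_model, num_heads=4,
                                           batch_first=True)
        self.head = nn.Linear(d_model, 1)               # SOH regression head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        global_feats = self.encoder(x)                  # global dependencies
        local_feats, _ = self.bilstm(x)                 # local dependencies
        # Local features query the global ones (cross-attention fusion).
        fused, _ = self.cross(local_feats, global_feats, global_feats)
        return self.head(fused.mean(dim=1)).squeeze(-1)

model = CrossAttentionFusion()
cycles = torch.randn(8, 20, 64)        # batch of 20-step feature sequences
print(model(cycles).shape)             # (8,) predicted SOH per sequence
```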
- Book Chapter
- 10.1007/978-3-030-29911-8_16
- Jan 1, 2019
State-of-the-art (STOA) person re-identification (re-ID) methods measure features extracted by deep CNNs for final evaluation. In this work, we aim to improve re-ID performance by making better use of these deep features. First, a Dynamic Re-ranking (DRR) method is proposed, which matches features based on neighborhood structure to exploit contextual information; unlike common re-ranking methods, it finds additional matches by incorporating that contextual information. Second, to exploit the diverse information embedded in the deep features, we introduce Deep Feature Fusion (DFF), which splits and combines deep features through a diffusion and fusion process. Extensive comparative evaluations on three large re-ID benchmarks and six well-known features show that DRR and DFF are effective and insensitive to parameter settings. With a proper integration strategy, DRR and DFF achieve STOA re-ID performance.
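A generic neighborhood-overlap re-ranking sketch in the spirit of the contextual matching described above (explicitly not the authors' DRR): two images are pulled closer when their k-nearest-neighbor sets overlap. The distance blend, k, and alpha are all assumptions.

```python
# Blend raw feature distances with a Jaccard distance over k-NN sets.
import numpy as np

def jaccard_rerank(dist: np.ndarray, k: int = 10, alpha: float = 0.5):
    """Re-rank by mixing original distances with 1 - Jaccard(k-NN sets)."""
    n = dist.shape[0]
    knn = np.argsort(dist, axis=1)[:, :k]        # k nearest per image
    sets = [set(row) for row in knn]
    jac = np.zeros_like(dist)
    for i in range(n):
        for j in range(n):
            inter = len(sets[i] & sets[j])
            jac[i, j] = 1.0 - inter / len(sets[i] | sets[j])
    return alpha * dist + (1 - alpha) * jac

rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 128))               # stand-in deep features
dist = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)
print(jaccard_rerank(dist)[:2, :5])
```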
- Book Chapter
- 10.1007/978-3-031-11349-9_19
- Jan 1, 2022
We propose a robust and efficient blind video quality assessment model using a fusion of novel structural features and deep semantic features. As the human visual system (HVS) is highly sensitive to the structural content of a visual scene, we design a novel structural feature extractor that uses a two-level encoding scheme. In addition, we employ the pre-trained convolutional neural network (CNN) model Inception-v3 to extract semantic features from sampled video frames. The structural and deep semantic features are then concatenated and fed to a support vector regression (SVR) model that predicts the final visual quality scores of the videos. The performance of the proposed method is validated on three popular and widely used authentic-distortion datasets: LIVE-VQC, KoNViD-1k, and LIVE Qualcomm. Results show excellent performance of the proposed model compared with other state-of-the-art methods, with a significantly reduced computational burden.
Keywords: Structural features; Support vector regression; Convolutional Neural Network
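A sketch of the final regression stage: concatenated structural and deep semantic features mapped to quality scores with support vector regression. Both feature extractors are stubbed out, and the dimensions and SVR hyperparameters are assumptions.

```python
# SVR over concatenated structural + semantic features for quality scores.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
structural = rng.normal(size=(120, 40))       # two-level encoding features
semantic = rng.normal(size=(120, 2048))       # Inception-v3 frame features
X = np.concatenate([structural, semantic], axis=1)
mos = rng.uniform(1, 5, size=120)             # stand-in mean opinion scores

svr = SVR(kernel="rbf", C=10.0).fit(X, mos)
print(svr.predict(X[:3]))                     # predicted quality scores
```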
- Research Article
60
- 10.1016/j.inffus.2020.05.005
- May 14, 2020
- Information Fusion
Deep feature fusion through adaptive discriminative metric learning for scene recognition
- Research Article
51
- 10.1016/j.knosys.2021.107473
- Sep 10, 2021
- Knowledge-Based Systems
Exploring deep features and ECG attributes to detect cardiac rhythm classes