Accelerating on-device visual task adaptation by exploiting hybrid sparsity in DNN training


Similar Papers
  • Conference Article
  • Citations: 5
  • 10.1109/iccad51958.2021.9643522
A Convergence Monitoring Method for DNN Training of On-Device Task Adaptation
  • Nov 1, 2021
  • Seungkyu Choi + 2 more

DNN training has become a major workload in on-device settings for executing various vision tasks with high performance. Accordingly, training architectures incorporating approximate computing have been steadily studied for efficient acceleration. However, most of these works examine their schemes on from-scratch training, where inaccurate computing is not tolerable. Moreover, previous solutions are mostly provided as extended versions of inference works, e.g., sparsity/pruning, quantization, dataflow, etc. Therefore, unresolved issues in practical workloads that hinder the total speed of the DNN training process still remain. In this work, targeting the transfer learning-based task adaptation of practical on-device training workloads, we propose a convergence monitoring method to resolve the redundancy in massive training iterations. By utilizing the network's output values, we detect the training intensity of incoming tasks and monitor the prediction convergence under the given intensity to provide early exits from the scheduled training iterations. As a result, an accurate approximation over various tasks is performed with minimal overhead. Unlike sparsity-driven approximation, our method enables runtime optimization and is easily applicable to off-the-shelf accelerators, achieving significant speedup. Evaluation results on various datasets show a geomean of 2.2× speedup over the baseline and 1.8× speedup over the latest convergence-related training method.
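The abstract implies a training loop that watches the network's own output confidence and exits the scheduled iterations once predictions stop changing. The sketch below is a hedged reading of that idea in PyTorch, not the paper's implementation: every name (train_with_early_exit, window, eps) and the sliding-window convergence test are illustrative assumptions.

```python
import torch.nn.functional as F

def train_with_early_exit(model, loader, optimizer, scheduled_iters,
                          window=50, eps=1e-3):
    """Hypothetical output-driven convergence monitor: stop the scheduled
    training iterations once mean top-1 confidence plateaus across two
    consecutive windows."""
    history, it = [], 0
    while it < scheduled_iters:
        for x, y in loader:
            logits = model(x)
            loss = F.cross_entropy(logits, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            # The network's own output confidence is the convergence signal.
            conf = F.softmax(logits.detach(), dim=1).max(dim=1).values.mean()
            history.append(conf.item())
            it += 1
            if len(history) >= 2 * window:
                recent = sum(history[-window:]) / window
                prev = sum(history[-2 * window:-window]) / window
                if abs(recent - prev) < eps:
                    return it            # early exit: predictions converged
            if it >= scheduled_iters:
                break
    return it
```

Because the monitor only reads values the forward pass already produces, it adds essentially no compute of its own, which is consistent with the abstract's claim of minimal overhead on off-the-shelf accelerators.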

  • Conference Article
  • Citations: 14
  • 10.1109/smartcomp50058.2020.00053
A Novel Posit-based Fast Approximation of ELU Activation Function for Deep Neural Networks
  • Sep 1, 2020
  • Marco Cococcioni + 3 more

Nowadays, real-time applications exploit DNNs more and more for computer vision and image recognition tasks. Such applications pose strict constraints in terms of both fast and efficient information representation and processing. New formats for representing real numbers have been proposed, and among them the Posit format appears very promising, providing the means to implement fast approximated versions of widely used activation functions in DNNs. Moreover, information processing performance is continuously improving thanks to advanced vectorized SIMD (single-instruction multiple-data) processor architectures and instruction sets like ARM SVE (Scalable Vector Extension). This paper explores both approaches (Posit-based implementation of activation functions and vectorized SIMD processor architectures) to obtain faster DNNs. The two proposed techniques are able to speed up both the DNN training and inference steps.
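The paper's speedup comes from posit arithmetic and ARM SVE vectorization, which a short snippet cannot reproduce. As a loose NumPy illustration of the underlying idea only, replacing the exact exponential inside ELU with a cheap bit-level approximation, here is a sketch using Schraudolph's classic IEEE-754 trick; the constants belong to that trick, not to the paper.

```python
import numpy as np

def fast_exp(x):
    """Schraudolph-style exp: write a*x + b into the upper 32 bits of an
    IEEE-754 double, approximating e**x to within a few percent."""
    x = np.clip(np.asarray(x, dtype=np.float64), -50.0, 50.0)
    hi = (1512775.395 * x + 1072632447.0).astype(np.int64)  # a = 2**20/ln 2
    return (hi << 32).view(np.float64)

def fast_elu(x, alpha=1.0):
    """ELU(x) = x for x > 0, alpha*(e**x - 1) otherwise, with the exact
    exponential swapped for the bit-trick approximation above."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(x > 0.0, x, alpha * (fast_exp(x) - 1.0))
```

Vectorizing the same bit manipulation across SIMD lanes (as the paper does with SVE, in the posit domain) is what turns the approximation into a throughput win.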

  • Research Article
  • Citations: 1
  • 10.1109/tcad.2022.3206394
Accelerating On-Device DNN Training Workloads via Runtime Convergence Monitor
  • May 1, 2023
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Seungkyu Choi + 2 more

With the growing demand for processing deep learning applications on edge devices, on-device DNN training has become a major workload for executing a variety of vision tasks suited to users. Therefore, architectures employing algorithm co-design to accelerate the training process have been steadily studied. However, previous solutions are mostly supported by extended versions of inference studies, such as sparsity, dataflow, quantization, etc. Moreover, most works examine their schemes on from-scratch training, which cannot tolerate inaccurate computing. Accordingly, there are still factors hindering the overall speed of the DNN training process that have not been addressed in practical workloads. In this work, we propose a runtime convergence monitor to achieve massive computational savings in practical on-device training workloads (i.e., transfer learning-based task adaptation). By monitoring the network output data, we determine the training intensity of incoming tasks and adaptively detect convergence over iteration intervals when training diverse datasets. Furthermore, we enable computation skipping for converged images, determined by the monitored prediction probability, to enhance the training speed within an iteration. As a result, we achieve accurate but fast convergence in model training for task adaptation with minimal overhead. Unlike previous approximation methods, our monitoring system enables runtime optimization and can be easily applied to any type of accelerator, attaining significant speedup. Evaluation results on various datasets show a geomean of 2.2× speedup when applied to any systolic architecture and a further enhancement to 3.6× when applied to accelerators dedicated to on-device training.
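Beyond the early exit across iterations, this abstract adds a per-image skip: images whose prediction probability already signals convergence are dropped from the update. A hedged PyTorch reading of that second mechanism might look as follows; the threshold value, the extra forward pass, and all names are illustrative assumptions (a training accelerator would fuse the check into a single pass).

```python
import torch
import torch.nn.functional as F

def masked_training_step(model, optimizer, x, y, skip_threshold=0.95):
    """Hypothetical per-image computation skip: samples whose correct-class
    probability already exceeds the threshold are treated as converged and
    excluded from the (expensive) forward/backward update."""
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
        p_true = probs.gather(1, y.unsqueeze(1)).squeeze(1)
        keep = p_true < skip_threshold            # still-unconverged images
    if not keep.any():
        return 0                                  # whole batch converged
    loss = F.cross_entropy(model(x[keep]), y[keep])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return int(keep.sum())                        # images actually trained
```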

  • Book Chapter
  • 10.1201/9781003305002-17
Analysis on the change law of visual sense adaptation in high altitude highway tunnel entrance section
  • Oct 11, 2022
  • Pengsheng Li + 2 more

To study the relationship between a driver's visual adaptation at the entrance of a high-altitude super-long tunnel and the driver's decision-making behavior, a relationship model between the illumination change rate and speed in the tunnel entrance section, based on the driver's visual characteristics in a high-altitude driving environment, was constructed. Taking the Wuyu Tianshan Shengli tunnel as the research object, the relationship between the illumination change rate before the driver approaches the tunnel portal and the driver's driving speed is analyzed. Through a real-vehicle test in the Gezigou tunnel, the driver's speed and acceleration data and the illumination change data from 500 m in front of the tunnel to 200 m inside the entrance were collected. This paper explores how the driver's visual perception changes while completing a series of visual tasks when driving in a high-altitude highway tunnel, from searching for the tunnel entrance to driving into it. By testing the driver's speed reduction rate and the illumination reduction rate over the corresponding distance, it is shown that the illumination change significantly affects the driver's visual perception and that, within the 500 m before entering the tunnel, the 30–50 m entrance section has the greatest impact on the visual perception of the tested driver. These results provide experimental support for visual illuminance compensation at the tunnel entrance, which can improve driving safety in highway tunnel lighting.

  • Research Article
  • Citations: 4
  • 10.1038/s41537-022-00240-0
Early visual processing and adaptation as markers of disease, not vulnerability: EEG evidence from 22q11.2 deletion syndrome, a population at high risk for schizophrenia
  • Mar 21, 2022
  • Schizophrenia
  • Ana A Francisco + 3 more

We investigated visual processing and adaptation in 22q11.2 deletion syndrome (22q11.2DS), a condition characterized by an increased risk for schizophrenia. Visual processing differences have been described in schizophrenia but remain understudied early in the disease course. Electrophysiology was recorded during a visual adaptation task with different interstimulus intervals to investigate visual processing and adaptation in 22q11.2DS (with (22q+) and without (22q−) psychotic symptoms), compared to control and idiopathic schizophrenia groups. Analyses focused on early windows of visual processing. While increased amplitudes were observed in 22q11.2DS in an earlier time window (90–140 ms), decreased responses were seen later (165–205 ms) in schizophrenia and 22q+. 22q11.2DS, and particularly 22q−, presented increased adaptation effects. We argue that while amplitude and adaptation in the earlier time window may reflect specific neurogenetic aspects associated with a deletion in chromosome 22, amplitude in the later window may be a marker of the presence of psychosis and/or of its chronicity/severity.

  • Research Article
  • Citations: 1
  • 10.1101/2025.03.07.642102
Visual adaptation stronger at horizontal than vertical meridian: Linking performance with V1 cortical surface area
  • May 14, 2025
  • bioRxiv
  • Hsing-Hao Lee + 1 more

Visual adaptation reduces bioenergetic expenditure by decreasing sensitivity to repetitive and similar stimuli. In human adults, visual performance varies systematically around polar angle for many visual dimensions and tasks: Performance is superior along the horizontal than the vertical meridian (horizontal-vertical anisotropy, HVA), and the lower than upper vertical meridian (vertical meridian asymmetry, VMA). These asymmetries are resistant to spatial and temporal attention. However, it remains unknown whether visual adaptation differs around polar angle. Here, we investigated how adaptation influences contrast sensitivity at the fovea and perifovea across the four cardinal meridian locations, for both horizontal and vertical stimuli in an orientation discrimination task. In the non-adapted conditions, the HVA was more pronounced for horizontal than vertical stimuli. For both orientations, adaptation was stronger along the horizontal than vertical meridian, exceeding foveal adaptation. Additionally, perifoveal adaptation effects positively correlated with individual V1 cortical surface area. These findings reveal that visual adaptation mitigates the HVA in contrast sensitivity, fostering perceptual uniformity around the visual field while conserving bioenergetic resources.

  • Research Article
  • Citations: 1
  • 10.1073/pnas.2507810122
Visual adaptation stronger at the horizontal than the vertical meridian: Linking performance with V1 cortical surface area
  • Jul 14, 2025
  • Proceedings of the National Academy of Sciences
  • Hsing-Hao Lee + 1 more

Visual adaptation reduces bioenergetic expenditure by decreasing sensitivity to repetitive and similar stimuli. In human adults, visual performance varies systematically around the polar angle for many visual dimensions and tasks: Performance is superior along the horizontal than the vertical meridian (horizontal-vertical anisotropy, HVA) and the lower than upper vertical meridian (vertical meridian asymmetry, VMA). These asymmetries are resistant to spatial and temporal attention. However, it remains unknown whether visual adaptation differs around the polar angle. Here, we investigated how adaptation influences contrast sensitivity at the fovea and perifovea across the four cardinal meridian locations for both horizontal and vertical stimuli in an orientation discrimination task. In the nonadapted conditions, the HVA was more pronounced for horizontal than vertical stimuli. For both orientations, adaptation was stronger along the horizontal than the vertical meridian, exceeding foveal adaptation. Additionally, perifoveal adaptation effects positively correlated with individual V1 cortical surface area. These findings reveal that visual adaptation mitigates the HVA in contrast sensitivity, fostering perceptual uniformity around the visual field while conserving bioenergetic resources.

  • Research Article
  • 10.1080/13506285.2024.2407873
Negative aftereffects of face trait impressions are modulated by emotional expressions
  • Mar 15, 2024
  • Visual Cognition
  • Fiammetta Marini + 4 more

Facial trustworthiness impressions critically shape our everyday social interactions. While previous research has predominantly considered trustworthiness impressions to be stable over time, preliminary evidence has shown that they are affected by visual adaptation, such that long exposure to (un)trustworthy-looking faces biases the perception of subsequent faces in the opposite trustworthiness direction. Here, by employing a visual adaptation task across two experiments, we sought further evidence that trustworthiness impressions are shaped by temporal context. In Experiment 1, we investigated whether visual adaptation affects trustworthiness judgements and found evidence of a robust negative face aftereffect. In Experiment 2, we focused our investigation on whether emotional expressions, key cues involved in trait impressions, influence trustworthiness and dominance impressions. We found that adaptation to anti-expressions, which were expected to bias subsequent neutral faces to resemble the original expression (happiness, anger, and fear), significantly modulated subsequent evaluations of trustworthiness and dominance. This result confirms the critical role of emotion perception in trait evaluations. Importantly, using anti-expressions minimised semantic adaptation, thus highlighting the perceptual nature of this aftereffect. Taken together, our findings confirm that temporal context shapes trustworthiness impressions, by showing that visual adaptation affects trust judgements and that past emotional expressions influence subsequent impressions of trustworthiness and dominance.

  • Conference Article
  • Citations: 220
  • 10.1109/cvpr.2012.6247924
Robust visual domain adaptation with low-rank reconstruction
  • Jun 1, 2012
  • I-Hong Jhuo + 3 more

Visual domain adaptation addresses the problem of adapting the sample distribution of the source domain to the target domain, where the recognition task is intended but the data distributions are different. In this paper, we present a low-rank reconstruction method to reduce the domain distribution disparity. Specifically, we transform the visual samples in the source domain into an intermediate representation such that each transformed source sample can be linearly reconstructed by the samples of the target domain. Unlike existing work, our method captures the intrinsic relatedness of the source samples during the adaptation process while uncovering the noises and outliers in the source domain that cannot be adapted, making it more robust than previous methods. We formulate our problem as a constrained nuclear-norm and ℓ2,1-norm minimization objective and then adopt the Augmented Lagrange Multiplier (ALM) method for the optimization. Extensive experiments on various visual adaptation tasks show that the proposed method consistently and significantly beats the state-of-the-art domain adaptation methods.
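From the abstract's description, the optimization plausibly takes a form like the one below, where the columns of X_s and X_t hold source and target samples, W is the learned transformation, Z the low-rank reconstruction coefficients, and E a column-sparse residual capturing outliers. This is a reconstruction of the stated "constrained nuclear norm and ℓ2,1 norm minimization", not a quotation of the paper's exact program (additional constraints on W are omitted).

```latex
\min_{W,\,Z,\,E} \; \|Z\|_{*} + \lambda\,\|E\|_{2,1}
\qquad \text{s.t.} \quad W X_{s} = X_{t} Z + E
```

The nuclear norm promotes a low-rank, structured reconstruction that ties related source samples together, while the ℓ2,1 norm zeroes out entire columns of E, which is what flags whole source samples as unadaptable outliers.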

  • Research Article
  • Citations: 12
  • 10.1007/s00221-010-2204-8
A visual distracter task during adaptation reduces the proprioceptive movement aftereffect
  • Mar 11, 2010
  • Experimental Brain Research
  • Tatjana Seizova-Cajic + 1 more

Visual processing of basic perceptual attributes depends on attention. This has been well documented since the surprising initial report on attentional modulation of the visual motion aftereffect (Chaudhuri 1990). Here, we investigate proprioception and show for the first time that attention modulates adaptation to perceived limb movement. We used biceps vibration to induce illusory forearm extension in 10 participants and measured the aftereffect: perceived movement in the opposite direction. The aftereffect was largest when participants focused on the illusory extension during the adaptation period. To divert attention away from the illusory extension, a rapid serial visual presentation task was performed during the adaptation. The aftereffect was much smaller in this condition, indicating interference between the visual task and proprioceptive adaptation. In tests of an analogous interaction between audition and vision, earlier research found no effect. We suggest that conscious proprioception requires more attention than conscious processing of visual or auditory input.

  • Research Article
  • Citations: 2
  • 10.1038/s41598-024-75710-9
Generalization in perceptual learning across stimuli and tasks
  • Oct 19, 2024
  • Scientific Reports
  • Ravit Kahalani-Hodedany + 3 more

Perceptual learning, known to improve visual perception, demonstrates the plasticity of the brain processes underlying vision. Early studies, using the backward-masked texture discrimination task (TDT), focused on the failure of learning to generalize to untrained stimulus features, relating learning specificity to the selectivity of the brain networks involved in the visual task. Learning was found to be highly specific to the stimulus features, as expected from the processing selectivity found in early visual areas, as well as to the task employed in training, pointing to top-down effects. More recent studies demonstrate generalization of learning to untrained features under specifically designed training procedures. Here we suggest that transfer of learning takes place when the trained and untrained stimuli and tasks activate overlapping brain processes. We tested the effect of TDT learning, under conditions with and without visual adaptation, on the contrast detection (CD) of localized Gabor targets, either alone or backward masked (BM). At the TDT peripheral-target location, we found that transfer of learning from TDT to CD and BM occurs under the TDT adaptation condition but not under the no-adaptation condition, whereas at the TDT center-target location transfer occurs under both conditions. Our results suggest that learning generalization across experimental conditions depends on overlapping neural processes within brain networks, here dominated by the inhibitory effects involved in adaptation and in spatiotemporal masking. Importantly, increased adaptation during training, due to increased stimulus consistency, enabled the transfer of learning to other tasks limited by sensory adaptation.

  • Research Article
  • Citations: 51
  • 10.1007/s10072-015-2076-6
Reading beyond the glance: eye tracking in neurosciences.
  • Jan 22, 2015
  • Neurological Sciences
  • Livia Popa + 5 more

From an interdisciplinary approach, the neurosciences (NSs) represent the junction of many fields (biology, chemistry, medicine, computer science, and psychology) and aim to explore the structural and functional aspects of the nervous system. Among the modern neurophysiological methods that "measure" different processes of the human brain in response to salient stimuli, a special place belongs to eye tracking (ET). By detecting eye position, gaze direction, the sequence of eye movements, and visual adaptation during cognitive activities, ET is an effective tool for experimental psychology and neurological research. It provides a quantitative and qualitative analysis of the gaze, which is very useful in understanding choice behavior and perceptual decision making. In the high-tech era, ET has several applications related to the interaction between humans and computers. Herein, ET is used to evaluate the spatial orienting of attention, performance in visual tasks, reactions to information on websites, customer response to advertising, and the emotional and cognitive impact of various stimuli on the brain.

  • Research Article
  • Citations: 12
  • 10.3389/fnint.2012.00009
Social categories shape the neural representation of emotion: evidence from a visual face adaptation task
  • Feb 29, 2012
  • Frontiers in Integrative Neuroscience
  • Marte Otten + 1 more

A number of recent behavioral studies have shown that emotional expressions are differently perceived depending on the race of a face, and that perception of race cues is influenced by emotional expressions. However, neural processes related to the perception of invariant cues that indicate the identity of a face (such as race) are often described to proceed independently of processes related to the perception of cues that can vary over time (such as emotion). Using a visual face adaptation paradigm, we tested whether these behavioral interactions between emotion and race also reflect interdependent neural representation of emotion and race. We compared visual emotion aftereffects when the adapting face and ambiguous test face differed in race or not. Emotion aftereffects were much smaller in different race (DR) trials than same race (SR) trials, indicating that the neural representation of a facial expression is significantly different depending on whether the emotional face is black or white. It thus seems that invariable cues such as race interact with variable face cues such as emotion not just at a response level, but also at the level of perception and neural representation.

  • Research Article
  • Citations: 5
  • 10.1049/iet-ipr.2018.6687
Adversarial auto‐encoder for unsupervised deep domain adaptation
  • Oct 31, 2019
  • IET Image Processing
  • Rui Shao + 1 more

Unsupervised visual domain adaptation aims to train a classifier that works well on a target domain, given labelled source samples and unlabelled target samples. The key issue in unsupervised visual domain adaptation is how to align features between the source and target domains. Inspired by the adversarial learning in generative adversarial networks, this study proposes a novel adversarial auto-encoder for unsupervised deep domain adaptation. The method incorporates the auto-encoder into the adversarial learning so that the domain similarity and the reconstruction information from the decoder can be exploited to facilitate adversarial domain adaptation in the encoder. Extensive experiments on various visual recognition tasks show that the proposed method performs favourably against competitive state-of-the-art methods.
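Read from the abstract, the architecture couples an auto-encoder's reconstruction branch with a domain discriminator on the latent code. The PyTorch sketch below is a hedged, minimal rendering of that combination, not the paper's network: layer sizes, loss weights (lam, mu), and all names are illustrative, and the discriminator's own alternating update is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AAE(nn.Module):
    """Hypothetical minimal adversarial auto-encoder for domain adaptation."""
    def __init__(self, dim_in=2048, dim_z=256, n_classes=31):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim_in, dim_z), nn.ReLU())
        self.dec = nn.Linear(dim_z, dim_in)           # reconstruction branch
        self.cls = nn.Linear(dim_z, n_classes)        # source-label classifier
        self.disc = nn.Sequential(nn.Linear(dim_z, 64), nn.ReLU(),
                                  nn.Linear(64, 1))   # source-vs-target critic

def encoder_loss(model, xs, ys, xt, lam=0.1, mu=1.0):
    """Encoder/decoder/classifier objective; the critic is trained in an
    alternating step with the opposite domain labels (omitted for brevity)."""
    zs, zt = model.enc(xs), model.enc(xt)
    rec = F.mse_loss(model.dec(zs), xs) + F.mse_loss(model.dec(zt), xt)
    clf = F.cross_entropy(model.cls(zs), ys)          # labelled source only
    d_t = model.disc(zt)
    # Adversarial alignment: encode target features the critic reads as source.
    adv = F.binary_cross_entropy_with_logits(d_t, torch.ones_like(d_t))
    return clf + mu * rec + lam * adv
```

The reconstruction term is the distinctive piece: it keeps the aligned latent codes informative about the inputs, so the adversarial alignment cannot collapse the features.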

  • Research Article
  • Citations: 14
  • 10.1109/access.2020.3035422
Linear Discriminant Analysis via Pseudo Labels: A Unified Framework for Visual Domain Adaptation
  • Jan 1, 2020
  • IEEE Access
  • Rakesh Kumar Sanodiya + 1 more

This paper deals with the problem of visual domain adaptation, in which labeled source-domain data is available for training but only unlabeled target-domain data is available for testing. Many recent domain adaptation methods merely concentrate on extracting domain-invariant features by simultaneously minimizing the distributional and geometrical divergence between domains, while ignoring within-class and between-class structure properties, especially for the target domain due to the unavailability of labeled data. We propose Linear Discriminant Analysis via Pseudo Labels (LDAPL), a unified framework for visual domain adaptation that can tackle these two issues together. LDAPL learns domain-invariant features across both domains while preserving important properties such as minimizing the shift between domains both statistically and geometrically, retaining the original similarity of data samples, maximizing the target-domain variance, and minimizing the within-class and maximizing the between-class scatter of both domains. Specifically, LDAPL preserves the target domain's discriminative information (its within-class and between-class properties) using pseudo labels, and these pseudo labels are refined until convergence. In extensive experiments on several visual cross-domain benchmarks, including Office+Caltech10 with all three feature types (Speeded Up Robust Features (SURF), Deep Convolutional Activation Features (DeCAF6), and Visual Geometry Group Fully Connected layer (VGG-FC6) features), COIL20 (Columbia Object Image Library), digit, and PIE (Pose, Illumination, and Expression), LDAPL achieved average accuracies of 79.11%, 99.72%, 79.0%, and 84.50%, respectively. Comparative results on several visual cross-domain classification tasks verify that LDAPL can significantly outperform state-of-the-art primitive and domain adaptation methods. Specifically, LDAPL gains over the baseline Joint Geometrical and Statistical Alignment (JGSA) method by 6.6%, 5.3%, 6.3%, and 44.93% average accuracy on Office+Caltech10 (SURF, DeCAF6, and VGG-FC6), COIL20, digit, and PIE, respectively.
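The pseudo-label refinement the abstract describes is an alternating outer loop: learn a projection with the current target pseudo-labels, re-label the target in the shared subspace, and repeat until the labels stabilize. Here is a hedged NumPy/scikit-learn skeleton of that loop; learn_projection is a placeholder for solving LDAPL's actual projection (not reconstructed here), and the 1-NN labeller is a common choice in this literature rather than a detail taken from the paper.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def pseudo_label_adaptation(Xs, ys, Xt, learn_projection, n_iters=10):
    """Alternate between (1) learning a domain-invariant projection using
    the current target pseudo-labels and (2) refining those pseudo-labels
    in the projected subspace, until they converge."""
    yt_pseudo = None
    for _ in range(n_iters):
        # Placeholder for LDAPL's projection step (statistical/geometrical
        # alignment plus within/between-class scatter terms).
        A = learn_projection(Xs, ys, Xt, yt_pseudo)
        Zs, Zt = Xs @ A, Xt @ A
        clf = KNeighborsClassifier(n_neighbors=1).fit(Zs, ys)
        new_labels = clf.predict(Zt)
        if yt_pseudo is not None and np.array_equal(new_labels, yt_pseudo):
            break                      # pseudo-labels have converged
        yt_pseudo = new_labels
    return A, yt_pseudo
```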
