Teacher Network Research Articles

Most existing 3D object classification and retrieval algorithms rely on one-off supervised learning on closed 3D object sets and tend to produce rigid convolutional neural networks with little scalability. These limitations substantially restrict their ability to learn newly emerging 3D object classes continually in the real world. To go beyond these limitations, we propose two new and challenging tasks: class-incremental 3D object classification (CI-3DOC) and class-incremental 3D object retrieval (CI-3DOR), the key to which is class-incremental 3D representation learning. This requires the network to update continually, learning new 3D class representations without forgetting those learned previously. To this end, we design a novel balanced distillation network (BDNet) that uses a dual supervision mechanism to carefully balance consolidating old knowledge (stability) against adapting to new 3D object classes (plasticity). On the one hand, we employ stability-based supervision to retain the stable and discriminative information of old classes, which greatly benefits both classification and retrieval. On the other hand, we use plasticity-based supervision to improve the network's generalization when learning new-class 3D representations, by transferring knowledge from a temporary teacher network to the current model. By properly handling the relationship between the two modules, we achieve a marked performance improvement. Furthermore, since no dataset is available for evaluation, we build two 3D datasets, INOR-1 and INOR-2, to evaluate these two new tasks. Extensive experimental results demonstrate that our method significantly outperforms other state-of-the-art (SOTA) class-incremental learning methods. Even when storing 500-1000 fewer 3D objects than SOTA methods, BDNet still achieves comparable performance.
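The dual supervision described in this abstract can be illustrated with a toy loss. The sketch below is not BDNet's actual objective, which the abstract does not spell out. It assumes that the stability term is a temperature-softened KL divergence between the previous model and the student on old-class logits, that the plasticity term is the same divergence between a temporary teacher and the student on new-class logits, and that a single scalar `balance` trades one off against the other. The function names, the `balance` weight, and the temperature are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def balanced_distillation_loss(student_logits, old_model_logits,
                               temp_teacher_logits, num_old,
                               balance=0.5, temperature=2.0):
    """Hypothetical stability/plasticity loss (an assumption, not the paper's).

    student_logits      -- logits over all classes, old classes first
    old_model_logits    -- previous model's logits over the old classes
    temp_teacher_logits -- temporary teacher's logits over the new classes
    num_old             -- number of old classes
    Returns (total, stability, plasticity).
    """
    if not 0 < num_old < len(student_logits):
        raise ValueError("num_old must split the student logits in two")
    student_old = softmax(student_logits[:num_old], temperature)
    student_new = softmax(student_logits[num_old:], temperature)
    # Stability: keep old-class predictions close to the previous model.
    stability = kl_divergence(softmax(old_model_logits, temperature), student_old)
    # Plasticity: pull new-class predictions toward the temporary teacher.
    plasticity = kl_divergence(softmax(temp_teacher_logits, temperature), student_new)
    total = balance * stability + (1.0 - balance) * plasticity
    return total, stability, plasticity
```

With `balance` close to 1 the sketch favors retaining old classes. With `balance` close to 0 it favors fitting the new ones. How BDNet actually sets this trade-off is not stated in the abstract.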

Deep learning methods have been widely successful in applications that convert wearable sensor data into actionable health insights. A common application area is activity recognition, where deep learning methods still suffer from limitations such as sensitivity to signal quality, variations in sensor characteristics, and variability between subjects. To mitigate these issues, robust features obtained by topological data analysis (TDA) have been suggested as a potential solution. However, there are two significant obstacles to using topological features in deep learning: (1) the large computational load of extracting topological features with TDA, and (2) the different signal representations obtained from deep learning and TDA, which make fusion difficult. In this paper, to integrate the strengths of topological methods into deep learning for time-series data, we propose to use two teacher networks: one trained on the raw time-series data, and another trained on persistence images generated by TDA methods. These two teachers are jointly used to distill a single student model, which uses only the raw time-series data at test time. This approach addresses both issues. Knowledge distillation (KD) with multiple teachers exploits complementary information and results in a compact model with strong supervisory features and an integrated, richer representation. To assimilate desirable information from different modalities, we design new constraints, including orthogonality imposed on feature correlation maps, which improve feature expressiveness and allow the student to learn easily from the teachers. We also apply an annealing strategy in KD for fast saturation and better accommodation of different features, while reducing the knowledge gap between the teachers and the student. Finally, a robust student model is distilled, which at test time uses only the time-series data as input while implicitly preserving topological features. The experimental results demonstrate the effectiveness of the proposed method on wearable sensor data. The proposed method achieves 71.74% classification accuracy on GENEActiv with a WRN16-1 (1D CNN) student, which outperforms the baselines and takes much less processing time (less than 17 s) than the teachers on 6k testing samples.
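The joint use of two teachers can be sketched as a standard temperature-scaled distillation loss extended to two soft-target terms. This is only an illustration under assumptions, not the paper's loss: the mixing weight `alpha` between the time-series teacher and the persistence-image teacher, the weight `lam` between hard-label cross-entropy and distillation, and the temperature are hypothetical. The orthogonality constraints and the annealing strategy mentioned in the abstract are not modeled.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q); terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def two_teacher_kd_loss(student_logits, ts_teacher_logits, pi_teacher_logits,
                        label, alpha=0.5, lam=0.9, temperature=4.0):
    """Hypothetical two-teacher KD loss (an assumption, not the paper's).

    ts_teacher_logits -- teacher trained on raw time-series data
    pi_teacher_logits -- teacher trained on persistence images
    alpha             -- weight of the time-series teacher's soft targets
    lam               -- weight of distillation versus hard-label loss
    """
    # Hard-label cross-entropy at temperature 1.
    cross_entropy = -math.log(softmax(student_logits)[label])
    # Soft-target terms, one per teacher, at a shared temperature.
    student_soft = softmax(student_logits, temperature)
    kd_ts = kl_divergence(softmax(ts_teacher_logits, temperature), student_soft)
    kd_pi = kl_divergence(softmax(pi_teacher_logits, temperature), student_soft)
    distill = alpha * kd_ts + (1.0 - alpha) * kd_pi
    # The T^2 factor is the usual scaling for temperature-softened gradients.
    return (1.0 - lam) * cross_entropy + lam * temperature ** 2 * distill
```

Only the student's logits are needed at inference, which matches the abstract's point that the persistence images (and hence the TDA computation) are required only for training the second teacher.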

Articles published on Teacher Network

Audio Representation Learning by Distilling Video as Privileged Information

KL-DNAS: Knowledge Distillation-Based Latency Aware-Differentiable Architecture Search for Video Motion Magnification.

Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation.

Rich Action-semantic Consistent Knowledge for Early Action Prediction.

Balanced Class-Incremental 3D Object Classification and Retrieval

A lightweight residual network based on improved knowledge transfer and quantized distillation for cross-domain fault diagnosis of rolling bearings

Effects of ability grouping on students’ collaborative problem solving patterns: Evidence from lag sequence analysis and epistemic network analysis

The European List of Key Medicines for Medical Education: A Modified Delphi Study.

Enhancing deep feature representation in self-knowledge distillation via pyramid feature refinement

ABUS tumor segmentation via decouple contrastive knowledge distillation

Knowledge Distillation for Traversable Region Detection of LiDAR Scan in Off-Road Environments.

Topological persistence guided knowledge distillation for wearable sensor data

Advanced integrated segmentation approach for semi-supervised infrared ship target identification

A teacher–student deep learning strategy for extreme low resolution unsafe action recognition in construction projects

Global key knowledge distillation framework

Unsupervised domain adaptation for brain structure segmentation via mutual information maximization alignment

Degradation model and attention guided distillation approach for low resolution face recognition

Frame-Level Teacher-Student Learning With Data Privacy for EEG Emotion Recognition.

STKD: Distilling Knowledge From Synchronous Teaching for Efficient Model Compression.

An Analysis of the Experiences of Azerbaijani Teachers Who Participated in Korean Education ODA Teacher Training
