Abstract

Many industrial companies now record video of their workers’ activities in order to streamline work processes. Because untrimmed raw videos are hard to use, it is desirable to automatically divide them into segments and recognize which kind of operation is performed in each segment. This task is called temporal video segmentation. We propose a method for it that targets videos of manufacturing work with a specialized vehicle such as a hydraulic excavator. High segmentation performance depends on extracting good visual features from the input videos. Unsupervised methods can hardly achieve this, whereas supervised methods require a sufficient amount of training data, which is labor-intensive to collect. To overcome both drawbacks, the proposed method takes a semi-supervised approach. We assume that the input is a set of weakly labeled videos in which only a sparse subset of frames carries a category label, and we use the labeled frames as training data for the feature extractor. Under this assumption, the proposed method first divides the input videos into short fixed-length segments, called primitive segments, and then clusters them using the visual features extracted by this feature extractor. To achieve higher performance, we also use a hierarchical tree of the category labels and recursively perform the above process at each branch of the tree; the tree itself is built automatically by the proposed method. In our experiments, the method achieved an F-measure of 0.947 even when only 1.25% of all frames in the input videos were labeled.
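The primitive-segment step described in the abstract can be illustrated with a minimal sketch. It only covers splitting a video into fixed-length frame ranges and, as an assumption not stated in the abstract, merging adjacent primitive segments that received the same cluster label into longer segments. The function names, the `[start, end)` frame-range convention, and the example labels are hypothetical; the feature extraction and clustering that would produce the labels are not shown.

```python
from typing import List, Tuple


def split_into_primitive_segments(num_frames: int, seg_len: int) -> List[Tuple[int, int]]:
    """Split frames 0..num_frames-1 into [start, end) ranges of length seg_len.

    The last range may be shorter when num_frames is not a multiple of seg_len.
    """
    if seg_len <= 0:
        raise ValueError("seg_len must be positive")
    return [(s, min(s + seg_len, num_frames)) for s in range(0, num_frames, seg_len)]


def merge_segments(
    segments: List[Tuple[int, int]], labels: List[str]
) -> List[Tuple[int, int, str]]:
    """Merge consecutive primitive segments that share the same label.

    Assumption: `labels` holds one cluster/category label per primitive
    segment, produced by some clustering step that is not shown here.
    """
    if len(segments) != len(labels):
        raise ValueError("need exactly one label per primitive segment")
    merged: List[Tuple[int, int, str]] = []
    for (start, end), label in zip(segments, labels):
        if merged and merged[-1][2] == label:
            # Same label as the previous segment: extend it to the new end.
            merged[-1] = (merged[-1][0], end, label)
        else:
            merged.append((start, end, label))
    return merged


# Hypothetical example: a 10-frame video, primitive segments of 4 frames.
primitive = split_into_primitive_segments(10, 4)
result = merge_segments(primitive, ["digging", "digging", "swinging"])
```

In this toy example the first two primitive segments carry the same label and are merged, so the output contains one "digging" segment followed by one "swinging" segment.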

Highlights

  • In today’s society, large-scale manufacturing work using a specialized vehicle such as a bulldozer, road roller, or aerial platform is performed daily by many industrial companies

  • Training the feature extractor in the above process becomes more difficult as more diverse category labels are considered. This is because a visual feature that helps discriminate between two categories A and B is not necessarily the same as one that helps discriminate between two other categories C and D. To solve this problem and further improve the performance of temporal video segmentation, we propose to use a hierarchical tree of category labels and recursively perform the above process at each branch of the tree, where the tree itself is built automatically by the proposed method

  • Semi-supervised temporal segmentation of manufacturing work video: we describe the basic approach employed in the proposed method, which regards the task shown in FIGURE 2 as a problem of semi-supervised clustering
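The idea of working through a hierarchical tree of category labels, mentioned in the highlights above, can be sketched as a recursive descent in which each branch makes only a coarse decision among its own children. This is an illustrative stand-in, not the authors' procedure: the tree here is written by hand (the paper builds it automatically), and the per-node `choose_child` functions are hypothetical threshold rules in place of the feature extractor and clustering that the method would apply at each branch. The feature layout and category grouping are likewise assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class LabelNode:
    """A node in a hand-written label tree; leaves are final categories."""
    name: str
    children: List["LabelNode"] = field(default_factory=list)
    # Hypothetical per-branch decision: maps a feature vector to a child index.
    choose_child: Optional[Callable[[Sequence[float]], int]] = None


def classify(node: LabelNode, features: Sequence[float]) -> str:
    """Descend from the root, letting each branch pick one of its children."""
    while node.children:
        if node.choose_child is None:
            raise ValueError(f"branch '{node.name}' has no decision function")
        node = node.children[node.choose_child(features)]
    return node.name


# Hypothetical two-level tree: first separate idling from active operations,
# then separate the active operations from each other.
active = LabelNode(
    "active",
    children=[LabelNode("digging"), LabelNode("swinging")],
    choose_child=lambda f: 0 if f[1] < 0.5 else 1,  # assumed second feature
)
root = LabelNode(
    "root",
    children=[LabelNode("idling"), active],
    choose_child=lambda f: 0 if f[0] < 0.5 else 1,  # assumed first feature
)

labels = [classify(root, f) for f in ([0.2, 0.9], [0.8, 0.1], [0.8, 0.9])]
```

The point of the structure is that each branch needs a decision rule only for its own children, so a feature that separates idling from active work does not also have to separate digging from swinging.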

Summary

Introduction

In today’s society, large-scale manufacturing work using a specialized vehicle such as a bulldozer, road roller, or aerial platform is performed daily by many industrial companies. A typical example of such work is civil construction using a hydraulic excavator. Industrial companies conducting such work often have a strong demand to record workers’ activity history in order to streamline their work processes. With a hydraulic excavator, workers perform various operations such as digging, swinging, dumping, and idling.

