Abstract

Detecting an object's actions is among the most difficult tasks in computer vision. Solving it requires knowing the positions of the key points of a particular type of object. Key point positions are also used to support management decision-making in technical systems. The task is further complicated by the fact that training models capable of detecting key points requires a significant amount of data with a complex structure. This paper focuses on detecting the positions of the key points of a biological object. This information is useful both for classifying the object's actions and for tracking the object. Because training data are scarce, a method for obtaining additional training data (data augmentation) is proposed. In addition, various backbone models are tested within R-CNN networks on differently augmented data, with different optimizers, learning rates, and numbers of training epochs and batches. The accuracy achieved on the test sample exceeds 90%. Backbone models from the ResNet family gave the highest accuracy, more than 93%, while models from the MobileNet family reached about 90% accuracy and processed each frame, on average, three times faster than the ResNet backbones.
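
The abstract does not state which augmentations were applied. As an illustration only, the sketch below shows one common key-point-aware augmentation, a horizontal flip, in which the key point coordinates are transformed together with the image so that the labels stay consistent. The function names, the (x, y, visibility) key point layout, and the `swap_pairs` parameter are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed layout for one key point: (x, y, visibility flag).
Keypoint = Tuple[float, float, int]


def hflip_image(image: List[List[int]]) -> List[List[int]]:
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def hflip_keypoints(
    keypoints: Sequence[Keypoint],
    image_width: int,
    swap_pairs: Sequence[Tuple[int, int]] = (),
) -> List[Keypoint]:
    """Mirror key points so they match a horizontally flipped image.

    swap_pairs lists index pairs of left/right key points (for example,
    left and right limb joints) whose labels must be exchanged after
    mirroring. This is a hypothetical parameter for the sketch.
    """
    # A pixel in column x moves to column (image_width - 1 - x).
    flipped = [(image_width - 1 - x, y, v) for (x, y, v) in keypoints]
    # Exchange left/right labels so index i still means the same body part.
    for i, j in swap_pairs:
        flipped[i], flipped[j] = flipped[j], flipped[i]
    return flipped


if __name__ == "__main__":
    image = [[0, 1, 2, 3], [4, 5, 6, 7]]
    keypoints = [(0.0, 1.0, 1)]
    print(hflip_image(image))
    print(hflip_keypoints(keypoints, image_width=4))
```

Each original image then yields an additional training sample whose key point annotations remain valid. Other geometric augmentations (rotation, scaling, cropping) would follow the same pattern of applying one transform to both the pixels and the coordinates.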
