Abstract
Open data sets are emerging in human activity recognition (HAR) because of their importance in application areas such as improving people's lives, enabling informed care decisions, solving real-world problems, and choosing the best HAR approaches. Curating and sharing open data sets remains challenging because the shared data lack metadata and complete descriptions. Properly curated data sets are easier to recognise, obtain, and reuse, which helps HAR research progress. In this paper, we propose a conceptual framework that describes the open data set lifecycle as four phases: construction, sharing, finding, and using. We also explore open issues and challenges related to HAR data sets in the published literature. On this basis, we present an approach that automatically extracts metadata through web scraping of HAR data sets and then applies a natural language processing (NLP) pipeline to detect the metadata of each data set. Using the retrieved metadata, we show how comparisons can be performed under different scenarios, which can help evaluate data set quality and identify areas for improvement in data set curation. This work will assist the HAR research community in better understanding the open data set lifecycle and how data set quality can be improved.