Abstract
Abstract Most cross-national human rights datasets rely on human coding to produce yearly, country-level indicators of state human rights practices. Hand-coding the documents that contain the information on which these scores are based is tedious and time-consuming, but has been viewed as necessary given the complexity and detail of the information contained in the text. However, advances in automated text analysis have the potential to streamline this process without sacrificing accuracy. In this research note, we take the first step in creating this streamlined process by employing a supervised machine learning automated coding method that extracts specific allegations of physical integrity rights violations from the original text of country reports on human rights. This method produces a dataset including 163,512 unique abuse allegations in 196 countries between 1999 and 2016. This dataset and method will assist researchers of physical integrity rights abuse because it will allow them to produce allegation-level human rights measures that have previously not existed and provide a jumping-off point for future projects aimed at using supervised machine learning to create global human rights metrics.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.