Abstract

We describe our experience using an open domain question answering model (Chen et al., 2017) on an out-of-domain QA task: assisting in the analysis of companies' privacy policies. Specifically, our Relevant CI Parameters Extractor (RECIPE) seeks to answer questions posed by the theory of contextual integrity (CI) about the information flows described in privacy statements. These questions have a simple syntactic structure, and the answers are factoid or descriptive in nature. The model achieved an F1 score of 72.33, but we noticed that combining its results with a neural dependency parser based approach yields a significantly higher F1 score of 92.35 against manual annotations. This indicates that future work which incorporates signals from parsing-like NLP tasks more explicitly can generalize better on out-of-domain tasks.

Highlights

  • Open domain question answering approaches offer a promising glimpse into a future in which machines are able to perform sophisticated cognitive tasks on behalf of a human

  • We present the contextual integrity (CI) parameter options extracted by the dependency parsers to the annotators, who validate them without having an option to modify the text of the annotation

  • In this paper we present our work towards designing a Relevant CI Parameters Extractor (RECIPE) that leverages an open domain QA model on privacy policies to answer questions posed by CI

Summary

Introduction

Open domain question answering approaches offer a promising glimpse into a future in which machines are able to perform sophisticated cognitive tasks on behalf of a human. The distribution of interrogative words, lexical and syntactic variations, reasoning across multiple sentences, and ambiguous statements in Wikipedia-based datasets (Ryu et al., 2014; Rajpurkar et al., 2016) result in robust QA models. Motivated by their success, we set out to apply one such model (Chen et al., 2017) to an out-of-domain QA task: assisting in the analysis and understanding of privacy policies. Identifying paragraphs that mention sensitive information in privacy policies only takes us halfway, because we still need to understand who collects that information, who receives it, and under what conditions the collection happens. To support this finer-grained analysis, we present a case for using a machine comprehension model to answer questions posed by the theory of contextual integrity (CI) (Nissenbaum, 2010). The combination of the two approaches yielded an overall F1 score of 92.35 against a baseline of six manually annotated privacy policies.
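The F1 scores reported above compare extracted answers against manual annotations. A common way to score such span answers is token-level F1, as used in SQuAD-style QA evaluation; the sketch below illustrates that metric and is an assumption for illustration, not the authors' exact scoring code.

```python
from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between a predicted answer span and a gold annotation,
    in the style of SQuAD evaluation (illustrative, not the paper's code)."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # Count tokens shared between prediction and gold (multiset overlap)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Example: a hypothetical CI "recipient" answer from a privacy statement
print(round(token_f1("our advertising partners", "advertising partners"), 2))  # → 0.8
```

Corpus-level F1, such as the 72.33 and 92.35 figures, would then aggregate these per-answer scores over all questions.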

Contextual Integrity Primer
Related work
Dependency parsing
Results and Discussion
Conclusion
