How are PDF files published in the Scientific Community?

Supriya Adhatarao, Cedric Lauradoux

doi:10.1109/wifs53200.2021.9648374

Abstract

Authors are often unaware that PDF files carry hidden information and can contain more than the visible content of the file. This work focuses on how PDF files are published in the scientific community. We analyzed a corpus of 555,865 PDF files to show that both direct and modified authoring processes for creating PDFs leak sensitive information about researchers. Our analysis of the extracted metadata shows that at least 23% of the PDF files in our dataset contain valuable information about the authoring process. We were even able to solve the co-authorship (multiple authors) problem by cross-referencing information from multiple PDF files using linear algebra. We believe that PDF sanitization needs to be included in scientific publication processes to avoid leaking sensitive information. We have explored and suggested strategies for the safer distribution of scientific work by researchers.
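
To make the kind of metadata at stake concrete, here is a minimal sketch in Python of reading a few entries from a PDF's document information dictionary. It is not the authors' extraction pipeline: the function name `extract_info_fields`, the selected keys, and the sample bytes are all illustrative assumptions, and the regular expression only handles uncompressed dictionaries with simple literal strings. A real analysis would rely on a full PDF parser.

```python
import re

# Info-dictionary keys that commonly describe the authoring process
# (an illustrative selection, not the list used in the paper).
FIELDS = ("Author", "Creator", "Producer", "Title")


def extract_info_fields(pdf_bytes: bytes) -> dict:
    """Return simple literal-string values of selected Info keys.

    Assumption: the values are stored uncompressed as plain literal
    strings such as "/Author (Jane Doe)". Hex strings, escaped or nested
    parentheses, and compressed object streams are not handled.
    """
    found = {}
    for field in FIELDS:
        pattern = rb"/" + field.encode("ascii") + rb"\s*\(([^()\\]*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            found[field] = match.group(1).decode("latin-1")
    return found


# Hypothetical fragment of a PDF file, made up for this example.
sample = (
    b"1 0 obj\n"
    b"<< /Author (Jane Doe) /Creator (LaTeX with hyperref) "
    b"/Producer (pdfTeX-1.40.21) >>\n"
    b"endobj\n"
)
print(extract_info_fields(sample))
```

Even in this toy fragment, the creator and producer entries reveal which tools produced the file, which is the type of authoring-process information the abstract refers to.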

