Privacy-aware document retrieval with two-level inverted indexing

Abstract

Previous work on privacy-aware ranking has addressed the minimization of information leakage when scoring the top k documents, but has not studied how to retrieve these top documents and their features for ranking. This paper proposes a privacy-aware document retrieval scheme with a two-level inverted index structure. In this scheme, posting records are grouped with bucket tags, and runtime query processing produces query-specific tags in order to gather encoded features of matched documents with privacy protection during index traversal. To thwart leakage-abuse attacks, our design minimizes the chance that a server processes unauthorized queries or identifies document sharing across posting lists through index inspection or cross-query association. This paper presents the evaluation and analytic results of the proposed scheme to demonstrate the tradeoffs in its design considerations for privacy, efficiency, and relevance.
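
The abstract does not give the construction itself, so the following is only a minimal sketch of the general idea of bucket-tagged posting records and query-specific tags, not the paper's scheme. Everything in it is an assumption made for illustration: the helper names (`prf`, `build_index`, `query_tags`, `retrieve`), the use of HMAC-SHA256 as the keyed function, the fixed number of buckets per term, the per-term document aliases, and the use of plain integers as stand-ins for encoded features.

```python
import hashlib
import hmac
from collections import defaultdict


def prf(key: bytes, *parts: str) -> str:
    """Keyed pseudo-random function (HMAC-SHA256), truncated to 64 bits."""
    msg = "\x1f".join(parts).encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()[:16]


def build_index(key: bytes, postings: dict, num_buckets: int = 2) -> dict:
    """Build a hypothetical two-level index.

    postings maps a term to a list of (doc_id, feature) pairs.
    Level 1 maps an opaque bucket tag to level 2, a list of posting records.
    """
    index = defaultdict(list)
    for term, records in postings.items():
        for doc_id, feature in records:
            # Assumption: each posting is assigned to one of a fixed number
            # of buckets for its term, chosen pseudo-randomly.
            bucket = int(prf(key, "bucket", term, doc_id), 16) % num_buckets
            tag = prf(key, "tag", term, str(bucket))
            # Assumption: a per-term alias hides that two posting lists
            # share a document when the index is inspected.
            alias = prf(key, "doc", term, doc_id)
            index[tag].append((alias, feature))
    return dict(index)


def query_tags(key: bytes, terms: list, num_buckets: int = 2) -> list:
    """Client side: derive the bucket tags for a query; requires the key."""
    return [prf(key, "tag", t, str(b)) for t in terms for b in range(num_buckets)]


def retrieve(index: dict, tags: list) -> list:
    """Server side: look up opaque tags and return the matching records."""
    records = []
    for tag in tags:
        records.extend(index.get(tag, []))
    return records


if __name__ == "__main__":
    secret = b"demo-key-for-illustration-only"
    idx = build_index(secret, {"privacy": [("d1", 3), ("d2", 5)], "index": [("d1", 7)]})
    print(retrieve(idx, query_tags(secret, ["privacy"])))
```

In this sketch the server only ever sees keyed tags and aliases, so a party without the key cannot form valid query tags, and the same document appears under unrelated aliases in different posting lists. Matching documents across terms is left to the key holder, which is a simplification chosen here rather than something stated in the abstract.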

Similar Papers
  • Research Article
  • Cited by: 2
  • 10.3837/tiis.2021.08.007
A Lightweight and Privacy-Preserving Answer Collection Scheme for Mobile Crowdsourcing
  • Aug 31, 2021
  • KSII Transactions on Internet and Information Systems
  • Yingling Dai + 4 more

Mobile Crowdsourcing (MCS) has become an emerging paradigm evolved from crowdsourcing by employing advanced features of mobile devices such as smartphones to perform more complicated, especially spatial, tasks. One of the key procedures in MCS is to collect answers from mobile users (workers), which may face several security issues. First, authentication is required to ensure that answers are from authorized workers. In addition, MCS tasks are usually location-dependent, so the collected answers could disclose workers' location privacy, which may discourage workers from participating in the tasks. Finally, the overhead incurred by authentication and privacy protection should be minimized since mobile devices are resource-constrained. Considering all the above concerns, in this paper, we propose a lightweight and privacy-preserving answer collection scheme for MCS. In the proposed scheme, we achieve anonymous authentication based on traceable ring signature, which provides authentication, anonymity, as well as traceability by enabling the tracing of malicious workers. In order to balance user location privacy and data availability, we propose a new concept named current location privacy, which means the location of the worker cannot be disclosed to anyone until a specified time. Since the leakage of current location will seriously threaten workers' personal safety, enabling attacks such as absence or presence disclosure, it is necessary to pay attention to the current location privacy of workers in MCS. We encrypt the collected answers based on timed-release encryption, ensuring the secure transmission and high availability of data, as well as preserving the current location privacy of workers. Finally, we analyze the security and performance of the proposed scheme. The experimental results show that the computation costs of a worker depend on the number of ring signature members, which indicates the flexibility for a worker to choose an appropriate size of the group under considerations of privacy and efficiency.

  • Research Article
  • Cited by: 1
  • 10.3389/frcmn.2025.1600750
Privacy considerations for LLMs and other AI models: an input and output privacy approach
  • Sep 10, 2025
  • Frontiers in Communications and Networks
  • Zixin Nie + 2 more

The framework of Input and Output Privacy aids in conceptualization of data privacy protections, providing considerations for situations where multiple parties are collaborating in a compute system (Input Privacy) as well as considerations when releasing data from a compute process (Output Privacy). Similar frameworks for conceptualization of privacy protections at a systems design level are lacking within the Artificial Intelligence space, which can lead to mischaracterizations and incorrect implementations of privacy protections. In this paper, we apply the Input and Output Privacy framework to Artificial Intelligence (AI) systems, establishing parallels between traditional data systems and newer AI systems to help privacy professionals and AI developers and deployers conceptualize and determine the places in those systems where privacy protections have the greatest effect. We discuss why the Input and Output Privacy framework is useful when evaluating privacy protections for AI systems, examine the similarities and differences of Input and Output privacy between traditional data systems and AI systems, and provide considerations on how to protect Input and Output Privacy for systems utilizing AI models. This framework offers developers and deployers of AI systems common ground for conceptualizing where and how privacy protections can be applied in their systems and for minimizing risk of misaligned implementations of privacy protection.

  • Conference Article
  • Cited by: 11
  • 10.1145/2145204.2145262
Contents and contexts
  • Feb 11, 2012
  • Natalya N Bazarova

Social network sites (SNSs) provide new forms of communication, in which people routinely share personal information with a large audience. The goal of this research is to examine how a public context in which disclosures are revealed influences receivers' impressions of disclosure and a discloser on SNSs. The results of the original study reported in this paper indicate that publicly shared disclosures were perceived as less intimate and less appropriate than privately shared disclosures on Facebook, and perceptions of disclosure appropriateness mediated the effects of public/private contexts on social attraction for a discloser. The results inform research on social outcomes associated with SNS's use, as well as design considerations for privacy- and disclosure-related behaviors in social media.

  • Conference Article
  • 10.2991/icmemtc-16.2016.323
The Use of Persona in Recommendation System and Privacy Protection
  • Jan 1, 2016
  • Suduo Li + 3 more

Aimed at the contradiction between personalized recommendation and privacy protection, this paper puts forward the basic idea of a persona, a digitalized user model. The method uses the browser to analyze the user's access behavior and builds a comprehensive and accurate user model to help realize personalized recommendation. For privacy, the user model is saved on the client side, and the user can decide to what extent the browser will offer his/her own user characteristics to the target website, so the user's privacy is fully protected. Thus the contradiction between personalized recommendation and privacy protection is resolved successfully.

  • Book Chapter
  • 10.1007/978-3-031-51063-2_11
Privacy Considerations in Archival Practice and Research
  • Jan 1, 2024
  • Katrina Windon + 1 more

Privacy considerations are woven throughout archival practice, from the acquisition and stewardship of archival collections to the generation and retention of patron request records. Drawing on scholarship from archival theorists and practitioners, as well as recent case studies and their own professional experience, the authors present a broad overview of the ethics and practice of privacy throughout the archival life cycle. At each stage, archivists seek a balance between sensitivity to the rights and well-being of creators and subjects and responsibility to researchers and the access mission. For archivists, privacy protections are not about secrecy or exclusivity, but about an ethics of care. Decisions made by archivists related to privacy have broad societal implications related to the understanding of history, the accountability of those in power, the availability of information, and the agency of creators and communities. We approach privacy from two complementary perspectives in archival practice—that of a collections manager and processing archivist working with creators, donors, and their archival collections to determine what privacy considerations may need to be addressed, and that of a reference and instruction archivist, working with students and researchers whose use of archival collections generates data that has its own privacy concerns.

  • Research Article
  • 10.51594/csitrj.v6i5.1933
Privacy and Data Protection: Balancing Security and User Rights
  • Jun 4, 2025
  • Computer Science & IT Research Journal
  • Oluomachi Eunice Ejiofor + 3 more

In an era of rapid technological advancement and increasing digital interconnectedness, the balance between security and user rights has emerged as a critical issue. Privacy and data protection are at the forefront of this discourse, driving the need for robust frameworks that safeguard individual privacy while ensuring the security of digital systems. This paper explores the evolving landscape of privacy and data protection, emphasizing the challenges and strategies for achieving a balance between security imperatives and user rights. Data protection regulations, such as the General Data Protection Regulation (GDPR) in the European Union and the California Consumer Privacy Act (CCPA) in the United States, have set stringent standards for how organizations collect, store, and use personal data. These regulations aim to empower individuals with greater control over their personal information and mandate rigorous compliance measures for businesses. The implementation of such regulations has highlighted the necessity of adopting privacy-by-design principles, where privacy considerations are integrated into the development of technologies and business practices from the outset. Balancing security and user rights requires navigating complex trade-offs. Enhanced security measures, including encryption and stringent access controls, are essential for protecting data against breaches and unauthorized access. However, these measures must be carefully designed to avoid infringing on user rights and freedoms. Transparency and accountability are crucial in building trust between users and organizations. By ensuring clear communication about data collection practices and providing mechanisms for users to exercise their rights, organizations can foster a culture of trust and compliance. Technological innovations, such as artificial intelligence and blockchain, offer promising avenues for enhancing both security and privacy. 
AI-driven analytics can help detect and mitigate security threats while respecting user privacy through techniques like differential privacy. Blockchain technology can provide decentralized and transparent data management solutions that enhance security and user control over personal data. In conclusion, the interplay between privacy and data protection in the digital age demands a nuanced approach that prioritizes both security and user rights. By adhering to regulatory frameworks, embracing privacy-by-design, and leveraging advanced technologies, organizations can navigate this complex landscape and achieve a sustainable balance that benefits both individuals and society at large. Keywords: Privacy, Data Protection, Balancing, Security, User Rights.

  • Conference Article
  • Cited by: 3
  • 10.1109/iri.2013.6642501
Filter- and wrapper-based feature selection for predicting user interaction with Twitter bots
  • Aug 1, 2013
  • Randall Wald + 2 more

High dimensionality (the presence of too many features) is a problem which plagues many datasets, including mining from personality profiles. Feature selection can be used to reduce the number of features, and many strategies have been proposed to help select the most important features from a larger group. Feature rankers will produce a metric for each feature and return the best for a given subset size, while filter-based subset evaluation will perform statistical analysis on whole subsets and wrapper-based subset selection will use classification models with chosen features to decide which are most important for model-building. While all three approaches have been discussed in the literature, relatively little work compares all three with one another directly. In the present study, we do precisely this, considering feature ranking, filter-based subset evaluation, and wrapper-based subset selection (along with no feature ranking) on two datasets based on predicting interaction with bots on Twitter. For the two subset-based techniques, we consider two search techniques (Best First and Greedy Stepwise) to build the subsets, while we use one feature ranker (ROC) chosen for its excellent performance in previous works. Six learners are used to build models with the selected features. We find that feature ranking consistently performs well, giving the best results for four of the six learners on both datasets. In addition, all of the techniques other than feature ranking perform worse than no feature selection for four of six learners. This leads us to recommend the use of feature ranking over more complex subset evaluation techniques.

  • Research Article
  • Cited by: 9
  • 10.56553/popets-2022-0115
“You offer privacy like you offer tea”: Investigating Mechanisms for Improving Guest Privacy in IoT-Equipped Households
  • Oct 1, 2022
  • Proceedings on Privacy Enhancing Technologies
  • Karola Marky + 4 more

IoT devices are becoming more common and prevalent in private households. Since guests can be present in IoT-equipped households, IoT devices can pose considerable privacy risks to them. In this paper, we present an in-depth evaluation of privacy protection for guests considering the perspectives of hosts and guests. First, we interviewed 21 IoT device owners about four classes of mechanisms obtained from the literature and social aspects. Second, we conducted an online survey (N=264) that investigates the perspective of guests in IoT-equipped households. From our results, we learn that protection mechanisms should not introduce privacy threats and require low resources. Further, hosts should keep control over their devices and the aesthetics of their living spaces. Guests, however, value feedback about the status of privacy protection which can interfere with aesthetics. Privacy protection should rather foster collaboration and not impact the visit of the guest too severely. We use our results to identify a design space for guest privacy protection in IoT-equipped households.

  • Research Article
  • 10.4467/26581264arc.23.011.17871
Parsing privacy for archivists
  • Oct 24, 2023
  • Archeion
  • Trudy Huskamp Peterson

Protection of privacy is a key issue in determining the extent to which archival materials are to be made accessible to the public. But what is informational privacy; i.e., what are the elements of information found in any type of document or database that must be withheld to avoid intruding on the privacy of an individual? This essay first examines post-World War II international statements that reference privacy. Then it turns to statements referring to privacy issued by the International Council on Archives (ICA), the worldwide professional organization that represents the archival profession to UNESCO. Third is a brief look at several 21st century academic considerations of privacy, one each by a lawyer, a philosopher, and an historian. Finally, it outlines some of the contextual elements that help archivists manage sensitive materials, even without a final definition of informational privacy.

  • Research Article
  • Cited by: 2
  • 10.58346/jowua.2024.i3.033
Assessing the Recreational Fishers and their Catches based on Social Media Platforms: Privacy and Ethical Data Analysis Considerations
  • Sep 30, 2024
  • Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications
  • Mingxin Xue

Social Media (SM) can offer insights to track recreational fishing (RF), but many limitations hinder its practical application. The proportion and characteristics of RFs that share their catches remain unidentified. The objective was to enhance the surveillance abilities of RF by utilizing SM information. This study focuses on the sharing economy information transmission, network optimization of SM platforms, and privacy protection issues during data transmission. The study starts with the data transmission characteristics in SM platforms, analyzes the data ethics in SM information transmission from the perspectives of natural and economic data attributes, and then focuses on the privacy protection principles in network information transmission. Then, a privacy protection scheme based on a locality-sensitive hash algorithm is constructed, and finally, an information transmission scheme and method optimization based on a network optimization module are proposed. The research gathered data using physical (face-to-face) surveys and digital (email) surveys to define marine RFs that post catches on online platforms ("sharers"), together with additional demographic and fishing data. A comparative analysis was conducted on the computational convergence and accuracy of different privacy protection methods, and the optimal rule framework for data privacy protection was discussed. The observation results show that the efficiency of information transmission using the Minhash-SSNR method is slightly inferior to that using the OSNR-SSNR method. Minhash technology is based on a similarity comparison of dimensionality-reduced data. This can cause two sets of lists that were initially highly similar to be misjudged as not similar enough, thereby reducing some potential information dissemination opportunities and affecting the efficacy of information transmission. 
It can be seen that the Minhash-SSNR strategy can effectively send information to nodes with high similarity, preventing excessive information duplication within the system. Although the Minhash-SSNR strategy has a certain degree of decline in information transmission efficiency, it accounts for only about 5% compared to the OSNR-SSNR strategy, ensuring the essential operational stability of SM opportunity networks without significant impact. With few learning and training times, the network information transmission optimization module proposed in this study quickly achieved a lower exponential error. As the number of training sessions gradually increases, the exponential error of the network information transmission optimization module in this study is small, and the prediction accuracy of the method is high. RFs who captured a prize, iconic, or symbolic species tended to discuss their catches more. This research signifies significant progress in incorporating SM information to oversee RFs.

  • Research Article
  • 10.1080/01612840.2024.2414748
Determining the Acceptability of Targeted Apps for High-Risk Alcohol Consumption in Nurses: A Qualitative Study
  • Oct 10, 2024
  • Issues in Mental Health Nursing
  • Adam Searby + 1 more

Aim: To determine the acceptability of targeted apps and provide recommendations for the implementation of an app addressing high-risk alcohol use to nurses. Design: A qualitative descriptive study design, using the Behaviour Change Wheel implementation framework. Methods: Semi-structured interviews with 42 Australian nurses were subject to structural coding using the Capability, Opportunity, and Motivation (COM-B) model linked to the Behaviour Change Wheel. Qualitative data has been reported using the COREQ framework. Results: Most participants agreed that targeted apps would appeal to nurses, provided specific design considerations were included. These considerations related to privacy and confidentiality and to strategies to target the app to nurses across wide age and experience ranges, and identified the need for a considered campaign to both launch the app and position it with existing interventions for high-risk alcohol use. Conclusions: Our findings indicate that a targeted app to reduce high-risk alcohol consumption could be acceptable to nurses; however, the app needs to include specific components suitable for nurses. We recommend further research into specific components of a targeted app, leading to a co-design process where nurses can determine app components and function. Summary of relevance: High-risk alcohol consumption has been shown to be an issue amongst nurses. Targeted apps have been shown to have an effect in addressing high-risk alcohol consumption among specific groups. However, the privacy of data provided to the app must be considered, especially given the link between disciplinary action, loss of role identity, and nurse suicide. This paper indicates that nurses would accept a targeted app, subject to specific design considerations, particularly related to confidentiality.

  • Research Article
  • Cited by: 3
  • 10.1002/pra2.244
Privacy considerations when predicting mental health using social media
  • Oct 1, 2020
  • Proceedings of the Association for Information Science and Technology
  • Tian Wang + 1 more

In recent years the number of individuals struggling with mental illness has increased, and traditional mental health services are now considered insufficient under the current circumstances, which has prompted researchers to develop new approaches for mental healthcare. Social media usage is growing, and it has been utilized to help provide additional insight on mental health by using the information shared by individuals, as well as data taken from their social media activity. While this approach may provide a unique and effective perspective for mental health services, it is critical that privacy risks and protections are considered in the process. Social media services collect, process, and store a substantial amount of information about their users, and how that information is shared, as well as what type of predictions are made, may pose serious privacy concerns. This study aims to understand how privacy is addressed and emphasized during the process of using social media data for mental healthcare by conducting a systematic review of previous scholarly papers related to the topic. Solove's taxonomy of privacy is used to evaluate these publications' privacy considerations and to demonstrate the privacy risks that may arise when social media data is used for mental health.

  • Research Article
  • 10.26481/marble.2018.v4.646
The End of Privacy? Paradoxes and Dilemmas of Internet Use and Online Surveillance
  • Oct 20, 2018
  • MaRBLe
  • Carolyn Gaumet

With the Internet growing in importance in our daily lives, concerns about privacy and data protection have emerged. While people worry about where their data may end up, they continue making themselves openly transparent by sharing information about themselves and their lives online. This study aims to understand the paradoxes between privacy considerations (mainly, the wish to keep individual data private and secure) and the actions that people undertake in reality. More specifically, it focuses on three paradoxes and dilemmas of privacy: age, perceived usefulness, and rewards. These will be studied by analyzing the results of a survey, in which respondents from the EU, North America and East Asia were asked about their online habits and their opinions on various security issues and privacy measures. The analysis ultimately aims to further the understanding of privacy paradoxes, and what hinders people from protecting their data sufficiently.

  • Book Chapter
  • Cited by: 1
  • 10.1007/978-3-319-49655-9_4
Proposal of a New Privacy Protection Scheme for the Data Subject on the International Cooperation Information Sharing Platform
  • Dec 1, 2016
  • Naonori Kato + 2 more

A novel project called iKaaS (intelligent Knowledge-as-a-Service) was adopted as a Strategic Information and Communications R&D Promotion Programme (SCOPE) project, one of the projects funded by the Ministry of Internal Affairs and Communications. This project aims at an advanced knowledge-intensive platform that can provide and distribute relevant information under strict consideration of privacy. This information distribution includes cross-border distribution between the EU and Japan, where privacy protection of the data subject is a major issue. To settle the privacy issues inside the project, DPEC (Data Protection and Ethical Community) was established as a governing organization for privacy. In this paper, we consider issues in cross-border data distribution from the viewpoint of a comparison between the legal systems of the EU and Japan. As a result of this consideration, we introduce the governance framework of DPEC. Moreover, we clarify the issues to be discussed in future cross-border data distribution and propose a privacy-enhanced data protection scheme.

  • Research Article
  • Cited by: 1
  • 10.14257/ajmahs.2016.03.30
Smart Grid Privacy Protection Measures According to the Change of IT Paradigm
  • Mar 31, 2016
  • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
  • Dong-Hyeok Lee + 1 more

In recent years, the IT paradigm has been changing and the smart grid environment has evolved. Smart grid systems bring convenience but, at the same time, there is always a threat to privacy. Privacy threats must be considered in various ways. Therefore, this paper examines the privacy threats from the emerging smart grid environment in terms of mobile and cloud, and presents a preparation plan for these threats. In this paper, the main requirements for each telecommunication environment of the smart grid were examined, precedent studies were analyzed, and the kinds of security services that meet these requirements and offer high-quality information security were analyzed. Then, based on the defined research framework, the types of abuse of personal information, the considerations for privacy and information protection, and the methods to protect them were studied. Grounded in this research, a more systematic study of smart grid security technology is to be conducted.
