Abstract
This article proposes the authors' own method for anonymising identifiable statistical data using a probabilistic context-free grammar. The advantage of the method is its simplicity: the identifier is easy to retrieve after the identifiable data have been masked, e.g. when the micro-data need to be modified or updated. This can be done using public-key cryptography, i.e. by encrypting the probabilistic context-free grammar. In official statistics, there is often a need to refer back to an anonymised source value, for example when statistical officers verify the reports of economic operators. With the appropriate information generated by the context-free grammar, the verifier can easily identify an economic operator or a natural person. The idea of the anonymising algorithm used in the proposed method is presented by means of an example. According to the authors, combining the proposed method with asymmetric encryption of the grammar definition, using a public key infrastructure, makes it probable that its resistance to attacks will be quite high. This is because the statistical methods that are used in the analysis of natural languages are not susceptible to attacks.
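To make the general idea concrete, the following is a minimal sketch of masking a numeric identifier with a toy probabilistic grammar and recovering it again. It is not the authors' algorithm: the grammar, the syllable alphabet, the rule probabilities, and the names `RULES`, `mask` and `unmask` are all assumptions made for illustration. Each digit nonterminal rewrites to one of several terminal syllables chosen at random according to the rule probabilities, so the same identifier can yield different masked strings, while anyone holding the grammar can invert the derivation.

```python
import random

# Hypothetical toy grammar (an assumption, not taken from the article).
CONSONANTS = "bdfgklmnpr"               # one consonant per digit 0-9
VOWELS = "aeiou"
WEIGHTS = [0.4, 0.3, 0.15, 0.1, 0.05]   # rule probabilities, summing to 1

# Productions: nonterminal "D<d>" -> two-letter terminal syllable, with probability.
RULES = {
    f"D{d}": [(CONSONANTS[d] + v, p) for v, p in zip(VOWELS, WEIGHTS)]
    for d in range(10)
}

# Inverse table: terminal syllable -> digit. Only the grammar holder can build it.
INVERSE = {t: nt[1:] for nt, alts in RULES.items() for t, _ in alts}


def mask(identifier: str, rng: random.Random) -> str:
    """Derive a masked string by sampling one production per digit."""
    out = []
    for ch in identifier:
        terminals, probs = zip(*RULES[f"D{ch}"])
        out.append(rng.choices(terminals, weights=probs, k=1)[0])
    return "".join(out)


def unmask(masked: str) -> str:
    """Recover the identifier by reading the masked string two letters at a time."""
    return "".join(INVERSE[masked[i:i + 2]] for i in range(0, len(masked), 2))


if __name__ == "__main__":
    rng = random.Random()
    masked = mask("4071", rng)
    print(masked, "->", unmask(masked))
```

In this sketch the grammar definition (`RULES`) plays the role of the secret. The abstract suggests protecting that definition with asymmetric encryption; that step is left out here.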