Abstract

Embeddings provide compact representations of signals that can be used for inference in a wide variety of tasks. Random projections have been used extensively to preserve Euclidean distances or inner products of high-dimensional signals in low-dimensional representations, while hashing-based techniques have been used to embed set similarity metrics such as the Jaccard coefficient. In this paper we show that a class of random projections based on sparse matrices can be used to preserve the Jaccard coefficient between the supports of sparse signals. Our proposed construction, SparseHash, can therefore be used in machine learning and multimedia signal processing tasks where the overlap between signal supports is a relevant similarity metric. We also present an application to the retrieval of similar text documents, where SparseHash improves over MinHash.
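To make the quantities in the abstract concrete, the following minimal sketch computes the Jaccard coefficient between the supports of two sparse signals and estimates it with a textbook permutation-based MinHash, the baseline the abstract compares against. It does not implement the sparse random projection proposed in the paper, whose details are not given here. The function names, the tolerance `tol`, the toy vectors, and the number of hash functions are all illustrative assumptions.

```python
import random


def support(x, tol=0.0):
    """Return the set of indices whose entries exceed tol in magnitude."""
    return {i for i, v in enumerate(x) if abs(v) > tol}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| (taken as 1.0 for two empty sets)."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def minhash_signature(s, n, num_hashes, seed=0):
    """Textbook MinHash: for each random permutation of range(n),
    keep the smallest permuted index among the elements of s.
    The same seed must be used for every set that will be compared."""
    rng = random.Random(seed)
    signature = []
    for _ in range(num_hashes):
        perm = list(range(n))
        rng.shuffle(perm)
        # default=n acts as a sentinel for an empty support
        signature.append(min((perm[i] for i in s), default=n))
    return signature


def minhash_estimate(sig_a, sig_b):
    """Fraction of matching signature entries, an estimate of the Jaccard coefficient."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


# Toy sparse signals (hypothetical data, chosen only for illustration).
x = [0.0, 1.5, 0.0, -2.0, 0.0, 0.3, 0.0, 0.0]
y = [0.0, 0.7, 4.0, 0.0, 0.0, -1.1, 0.0, 0.0]

sx, sy = support(x), support(y)   # {1, 3, 5} and {1, 2, 5}
exact = jaccard(sx, sy)           # 2 shared indices / 4 total = 0.5

sig_x = minhash_signature(sx, len(x), num_hashes=200)
sig_y = minhash_signature(sy, len(y), num_hashes=200)
approx = minhash_estimate(sig_x, sig_y)
```

Here `exact` is the similarity that an embedding of this kind aims to preserve, and `approx` is the MinHash estimate, which approaches the exact value as the number of hash functions grows.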

