Abstract
With massive, high-dimensional data now commonplace in research and industry, there is a strong and growing demand for more scalable computational techniques for data analysis and knowledge discovery. In this paper, we review scalable algorithms for learning statistical models from high-dimensional data. In particular, we introduce two classes of techniques: lossless and lossy compression. The first is a method based on grammar compression, a lossless compression scheme for text that has been successfully applied to binary data matrices for scalable learning of statistical models. The second is a family of lossy compression methods known as feature maps (FMs). Recently, many FMs for kernel approximation have been proposed and applied in practice. These methods, of which we present a brief survey in this paper, open the door to large-scale analyses of massive, high-dimensional data.
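To make the second class of techniques concrete, random Fourier features are one widely used feature map for approximating the Gaussian (RBF) kernel: an explicit low-dimensional map z(x) is drawn at random so that the inner product z(x)·z(y) approximates the kernel value k(x, y). The following is a minimal, pure-Python sketch of this idea, not code from the surveyed paper; all names and parameter choices are illustrative.

```python
import math
import random

def rbf_kernel(x, y, sigma=1.0):
    """Exact Gaussian (RBF) kernel value k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2 * sigma ** 2))

def random_fourier_features(dim, n_features, sigma=1.0, seed=0):
    """Return an explicit feature map z such that z(x).z(y) ~= rbf_kernel(x, y).

    Each feature is sqrt(2/D) * cos(w_i . x + b_i), with w_i drawn from a
    Gaussian matching the kernel's spectral density and b_i uniform in [0, 2pi].
    """
    rng = random.Random(seed)
    ws = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
          for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def z(x):
        return [scale * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
                for w, b in zip(ws, bs)]

    return z

# The inner product of the explicit maps approximates the kernel, so any
# linear learner on z(x) approximates the corresponding kernel machine.
z = random_fourier_features(dim=3, n_features=2000, seed=42)
x, y = [0.2, -0.1, 0.5], [0.1, 0.3, -0.2]
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = rbf_kernel(x, y)
```

The approximation error shrinks as the number of features grows (roughly O(1/sqrt(D))), which is what makes such maps attractive for large-scale kernel learning: training cost becomes linear in the number of samples rather than quadratic.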