Abstract

The value of knowledge obtainable by analysing large quantities of data is widely acknowledged. However, so-called primary or raw data may not always be available for knowledge discovery, for several reasons. First, cooperating institutions that are interested in sharing knowledge may not be willing (or allowed) to disclose their primary data. Second, data in the form of streams are only temporarily available for processing. If stored at all, stream data are maintained in the form of synopses or derived, abstract representations of the original data. Finally, even for non-stream data, there are limits on the achievable computation speed; these limits are set by hardware and firmware technologies. This problem can be only partially solved through parallelization and increased processing power. Ultimately, in many cases data must be summarized to be processed efficiently. In the light of these observations, we anticipate the need for defining and practising data mining without the luxury of primary data. To that end, we formally introduce the paradigm of Higher Order Mining (HOM) as a form of data mining that is applied over non-primary, derived data or patterns. Although HOM is a new paradigm, research has already advanced methods for discovering knowledge from patterns rather than data. We discuss these advances and organize them in the light of the new paradigm. We show that the HOM paradigm reveals further potential for knowledge discovery, including the delivery of rules and patterns with semantics that are closer to human intuition and are thus more appropriate for human inspection.
