Multimodal Compression Research Articles

Compression schemes for advanced data structures have become a central modern challenge. Information theory has traditionally dealt with conventional data such as text, images, or video. In contrast, most data available today is multitype and context-dependent. To meet this challenge, we have recently initiated a systematic study of advanced data structures such as unlabeled graphs [8]. In this paper, we continue this program by considering trees with statistically correlated vertex names. Trees come in many forms, but here we deal with binary plane trees (where order of subtrees matters) and their non-plane version (where order of subtrees doesn't matter). Furthermore, we assume that each name is generated by a known memoryless source (horizontal independence), but a symbol of a vertex name depends in a Markovian sense on the corresponding symbol of the parent vertex name (vertical Markovian dependency). Such a model is closely connected to models of phylogenetic trees. While in general the problem of multimodal compression and associated analysis can be extremely complicated, we find that in this natural setting, both the entropy analysis and optimal compression are analytically tractable. We evaluate the entropy for both types of trees. For the plane case, with or without vertex names, we find that a simple two-stage compression scheme is both efficient and optimal. We then present efficient and optimal compression algorithms for the more complicated non-plane case.
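
As an illustration only, the naming model described in this abstract (independent symbols within the root name, each child symbol depending only on the corresponding parent symbol) can be sketched in Python. The function names, the dictionary encoding of the source and transition probabilities, and the nested-pair representation of the tree shape are assumptions made for this sketch; they are not taken from the paper, and no particular distribution over tree shapes is implied.

```python
import random

def sample_symbol(dist, rng):
    """Draw one symbol from a dict {symbol: probability}."""
    symbols = list(dist)
    weights = [dist[s] for s in symbols]
    return rng.choices(symbols, weights=weights, k=1)[0]

def root_name(length, source, rng):
    """Root name: `length` symbols drawn independently from a memoryless source."""
    return "".join(sample_symbol(source, rng) for _ in range(length))

def child_name(parent, transition, rng):
    """Child name: symbol j is drawn from the row selected by parent symbol j."""
    return "".join(sample_symbol(transition[c], rng) for c in parent)

def label_tree(shape, name, transition, rng):
    """Attach names to a binary plane tree.

    `shape` is None for a leaf or a pair (left, right) for an internal vertex
    (hypothetical encoding). Returns nested tuples (name, left, right).
    """
    if shape is None:
        return (name, None, None)
    left, right = shape
    left_tree = label_tree(left, child_name(name, transition, rng), transition, rng)
    right_tree = label_tree(right, child_name(name, transition, rng), transition, rng)
    return (name, left_tree, right_tree)
```

For example, calling `label_tree(((None, None), None), root_name(4, source, rng), transition, rng)` with a two-symbol `source` and a matching `transition` table yields a five-vertex tree whose names all have length 4. Because the left and right subtrees are kept in order, the result is a plane tree; the non-plane case would treat the two orderings as equivalent.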

Introduction

We are drowning in data. What kinds of data? Text, images, sound, numeric data, and genome data.

Text: Every day vast amounts of textual data are generated, ranging from private corporate data and personal information to public and private government documents. Much of this data needs to be accessed by many users for many tasks. For example, a corporate call centre needs fast access to documents at a semi-concept level to answer user requests. Another example: a large litigation can involve 2 million documents, of which 200,000 are relevant, far fewer are significant, and a handful are pivotal. Techniques are desperately needed to automate the first few steps of this winnowing.

Images: There are video cameras everywhere, trying to protect our safety in car parks, public places, and even some lifts. There are huge and ever-growing still and video archives of all aspects of our modern world. Accessing and indexing this data is a huge research enterprise. Much indexing is done manually.

Sound: Sound often appears in concert with video in multimedia recordings. But what did the Prime Minister say on the 1st of November about the Republic? Did he sound like he meant it? Such queries currently cannot be answered easily except by an expert human investigator. These kinds of queries will need to be commonplace if sound data is to be accessed in humanly meaningful ways.

Numeric: Our industries generate vast amounts of valuable numeric data. In the petroleum industry, geologic knowledge must be integrated with data from wells (laboratory core analysis data and on-site well logs) and with seismic data generated from controlled explosions and dispersed recording devices. Then there is GIS data collected from satellites, and so on. In the service industry, the stock exchange generates large amounts of hard-to-analyse data vital to the wellbeing of Australian companies.

Genome data: The human genome project is almost complete. Researchers are finding genes by a mix of laboratory work and computerised database searches (e.g. as reported in the Weekend Australian, 30 October). This is just the first step; the next will be the sequencing of a number of individuals, and there are currently over 100 whole-genome sequencing projects on other species. Fast genome sequencing is just around the corner. We will soon be drowning in this kind of data too.

Multimedia data: This includes all of audio, text, graphics, images, video, animation, and music. More data!

What Is The Real Problem?

Manual extraction of information from any large corpus is time-consuming and expensive, and requires specialised experience in the material. Even worse, beyond a certain point it is incredibly boring, and hence error-prone. Human intelligence is best suited to dealing with information, as distinct from data!

A Solution

The solution is the development of automated systems for information extraction, and for the synthesis of the extracted information into humanly useful information resources. To avoid drowning in the ever-increasing flow of multimodal electronic information, automated tools are required to reduce the cognitive load on users.

Steps Towards A Solution

The key step towards a solution is the notion of information compression: the compression of data to yield an information-rich(er) resource. This is distinct from data compression, which is merely the efficient storage of data. Further, information compression must work on multimodal complex data, exemplified by multimedia data. Some of the techniques for this kind of information compression exist, in a scattered way, in areas such as fuzzy systems and image analysis. We have identified a nascent field, which we can coalesce in an intensive short workshop.

The first Australia-Japan Joint Workshop on Applications of Soft and Intelligent Computing to Multimodal and Multimedia Information Compression Technologies was held at Murdoch University in Perth, Western Australia, from 29 March to 5 April 2000. This special issue contains selected papers from the workshop.

Articles published on Multimodal Compression

Compressing Biosignal for Diagnosing Chronic Diseases

An efficient JPEG-2000 based multimodal compression scheme

Fast Hyperspectral Image Encoder Based on Supervised Multimodal Scheme

Lossless Compression of Binary Trees with Correlated Vertex Names.

An improved multimodal signal-image compression scheme with application to natural images and biomedical data

On Staying Grounded and Avoiding Quixotic Dead Ends.

Multimodal compression applied to biomedical data

Fast Encoding-Decoding of 3D Hyperspectral Images Using a Non-Supervised Multimodal Compression Scheme

Multimedia Information Compression Technologies
