Abstract

In this study, we propose a method for constructing cell sample networks from gene expression profiles, and a structural entropy minimisation principle for detecting the natural structure of networks and for identifying cancer cell subtypes. Our method establishes a three-dimensional gene map of cancer cell types and subtypes. Each identified subtype is defined by a unique gene expression pattern, and the three-dimensional gene map is built from these patterns for data sets covering acute leukaemia, lymphoma, multi-tissue samples, lung cancer and healthy tissue. The map demonstrates that a true tumour type may be divided into subtypes, each defined by a unique gene expression pattern. Clinical data analyses demonstrate that most cell samples of an identified subtype share similar survival times, survival indicators and International Prognostic Index (IPI) scores, and indicate that distinct subtypes identified by our algorithms exhibit different overall survival times, survival ratios and IPI scores. Our three-dimensional gene map establishes a high-definition, one-to-one map between the biologically and medically meaningful tumour subtypes and the gene expression patterns, and identifies remarkable cells that form singleton submodules.

Highlights

  • One of the challenges of cancer treatment is targeting specific therapies to pathogenetically distinct tumour types to maximise treatment efficacy and minimise toxicity

  • For diffuse large B-cell lymphoma (DLBCL), our results demonstrate that most of the cell samples within a module or submodule identified by our algorithms share similar survival times, survival indicators and International Prognostic Index (IPI) scores; distinct modules or submodules identified by our structural entropy minimisation algorithms differ noticeably in overall survival times, survival ratios and IPI scores, whereas distinct modules identified by the modularity maximisation and description length minimisation algorithms are indistinguishable on these measures

  • We propose a method of identifying the high-dimensional structural entropy of graphs for the construction of networks from gene expression profiles, together with heuristic algorithms K that detect the natural K-dimensional structure of networks by minimising the K-dimensional structural entropy, that is, the non-determinism of the K-dimensional structure of the networks
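
As a rough illustration of the quantity being minimised, the sketch below computes one- and two-dimensional structural entropy for a small weighted graph. The formulas follow the usual definitions from the structural information literature (degree-proportional node probabilities, module volumes and cut weights); they are assumed here and are not spelled out in this text. The function names, the edge-list format and the toy graph are hypothetical, and the heuristic search over partitions is not shown.

```python
import math
from collections import defaultdict


def weighted_degrees(edges):
    """Weighted degree of each node in an undirected edge list [(u, v, w), ...]."""
    deg = defaultdict(float)
    for u, v, w in edges:
        deg[u] += w
        deg[v] += w
    return dict(deg)


def one_dim_entropy(edges):
    """Assumed definition: H1(G) = -sum_i (d_i / vol) * log2(d_i / vol)."""
    deg = weighted_degrees(edges)
    vol = sum(deg.values())  # equals 2m for an unweighted graph
    return -sum((d / vol) * math.log2(d / vol) for d in deg.values() if d > 0)


def two_dim_entropy(edges, partition):
    """Assumed definition of structural entropy under a partition.

    partition maps each node to a module label. For module j with volume V_j
    and cut weight g_j, the entropy is
      -sum_j sum_{i in j} (d_i / vol) * log2(d_i / V_j)
      -sum_j (g_j / vol) * log2(V_j / vol).
    """
    deg = weighted_degrees(edges)
    vol = sum(deg.values())
    module_vol = defaultdict(float)
    module_cut = defaultdict(float)
    for node, d in deg.items():
        module_vol[partition[node]] += d
    for u, v, w in edges:
        if partition[u] != partition[v]:
            module_cut[partition[u]] += w
            module_cut[partition[v]] += w
    h = 0.0
    for node, d in deg.items():
        if d > 0:
            h -= (d / vol) * math.log2(d / module_vol[partition[node]])
    for label, g in module_cut.items():
        if g > 0:
            h -= (g / vol) * math.log2(module_vol[label] / vol)
    return h


# Toy graph (hypothetical): two triangles joined by a single edge.
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
             (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
             (2, 3, 1.0)]
natural = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
mixed = {0: "a", 1: "b", 2: "a", 3: "b", 4: "a", 5: "b"}

print(one_dim_entropy(toy_edges))
print(two_dim_entropy(toy_edges, natural))
print(two_dim_entropy(toy_edges, mixed))
```

On this toy graph the partition that follows the two triangles has lower two-dimensional entropy than the mixed one, which is the sense in which minimisation favours the natural module structure.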

Summary

Our algorithm for choosing k is an approximate realisation of the general principle that one-dimensional structural entropy minimisation is a correct principle for networking unstructured data. It works very well for the cell sample graph construction in the present paper. Better cell sample graphs may be constructed from the gene expression profiles, and these graphs may allow our Algorithms 2 and 3 to provide improved cancer classification. We believe this is still a grand challenge for future computer science. The two- and three-dimensional gene maps developed by Algorithms 2 and 3 identify biologically and medically meaningful subtypes of tumours such that each subtype is defined by a unique gene expression pattern. Our three-dimensional gene map indicates that the gene expression profiles do not help to resolve this challenge.
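
The choice of k described above can be sketched roughly as follows. This is a minimal illustration under several assumptions that this text does not confirm: samples are linked by Pearson correlation, each sample keeps its k most correlated neighbours with positive weight, and k is taken as the value that gives a connected graph with the smallest one-dimensional structural entropy. All function names and the example profiles are hypothetical.

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def knn_graph(profiles, k):
    """Undirected graph {(i, j): weight} with i < j.

    Assumption: each sample keeps its k most correlated neighbours,
    and only positive correlations become edges.
    """
    n = len(profiles)
    edges = {}
    for i in range(n):
        scored = sorted(((pearson(profiles[i], profiles[j]), j)
                         for j in range(n) if j != i), reverse=True)
        for w, j in scored[:k]:
            if w > 0:
                edges[(min(i, j), max(i, j))] = w
    return edges


def one_dim_entropy(edges):
    """Assumed definition: -sum_i (d_i / vol) * log2(d_i / vol)."""
    deg = defaultdict(float)
    for (u, v), w in edges.items():
        deg[u] += w
        deg[v] += w
    vol = sum(deg.values())
    return -sum((d / vol) * math.log2(d / vol) for d in deg.values())


def is_connected(edges, n):
    """Depth-first search from node 0 over the undirected edge set."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for nb in adj[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == n


def choose_k(profiles, k_max=None):
    """Return (k, entropy) for the connected graph of lowest entropy, or None."""
    n = len(profiles)
    k_max = k_max or n - 1
    best = None
    for k in range(1, k_max + 1):
        graph = knn_graph(profiles, k)
        if not is_connected(graph, n):
            continue  # assumption: only connected graphs are candidates
        h = one_dim_entropy(graph)
        if best is None or h < best[1]:
            best = (k, h)
    return best


# Hypothetical expression profiles: one row per cell sample.
example_profiles = [[1, 2, 3, 4], [1, 2, 4, 3], [2, 1, 3, 4], [1, 3, 2, 4]]
print(choose_k(example_profiles))
```

The connectivity requirement is a design choice made for this sketch: without it, very small k would tend to win simply because a sparse graph has fewer nodes carrying weight.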
