Occurrence Matrix Research Articles

Abstract

INTRODUCTION: Neuro-oncologic conditions have dismal outcomes, ineffective treatments, poor access to clinical trials, and variability in care. Clinical trials do not capture a patient’s complete journey and are restricted to select populations. ‘Real-world evidence’ (RWE) attempts to inform point-of-care decisions through routine collection of data with clinical-trial-like rigor. RWE complements existing knowledge through broad patient participation, collection throughout the disease course, and creation of large multidimensional datasets, a “knowledge network of disease” [1,2]. RWE implementation is hindered by unstructured data, uncertainty about which features are relevant, and semantic heterogeneity. Clinical attributes were selected from trial inclusion criteria and prioritized for structuring in clinic notes for abstraction.

METHOD: We queried ClinicalTrials.gov for 1/1/2018 to 12/31/2018, refined to North America, recruiting, interventional, and adult. Meningioma, pituitary, glioblastoma, astrocytoma, oligodendroglioma, and ependymoma were chosen based on incidence [3]. Lymphoma and nerve sheath tumors were omitted. “Brain tumor” and “glioma” were added. ‘K-nearest-neighbor’ tokenization parsed inclusion criteria [4]. A document-term matrix (n-gram) converted text to vectors [5]. A generative probabilistic model using ‘Latent Dirichlet Allocation’ plotted words into 10 clusters [6]. Hierarchical clustering was used to compare histology with terms.

RESULTS: 401 trials were parsed into 3676 statements and 4008 keywords. The 10 clusters of terms were similarly distributed amongst histologies, suggesting generalizability across tumor types. Clustering revealed 8 categories: 1) Time: enrollment; 2) Performance status: KPS; 3) Testing: mutations, upper limit of normal, routine hematologic laboratory assays; 4) Imaging: extent of surgery; 5) Pregnancy/childbearing; 6) Tumor grade; 7) Treatment history: recurrence, chemotherapy, radiation, time; 8) Informed consent.

CONCLUSIONS: Dissecting the compendium of clinical trials using machine learning can identify general parameters for trial enrollment to guide RWE clinical collection. Using practical definitions of the most germane trial data, specific information can be sought and defined to improve research quality, maximize research yields, and improve patient care whilst minimizing wasted research and clinical effort.
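The step in which a document-term matrix (n-gram) converts text to vectors can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the whitespace tokenizer, and the three sample statements are assumptions made up for illustration, and the topic-modeling and clustering steps are not shown.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_term_matrix(docs, max_n=2):
    """Build a count matrix: one row per document, one column per n-gram."""
    counts = []
    for doc in docs:
        tokens = doc.lower().split()  # naive whitespace tokenizer (assumption)
        terms = []
        for n in range(1, max_n + 1):
            terms.extend(ngrams(tokens, n))
        counts.append(Counter(terms))
    vocab = sorted(set().union(*counts))
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix


# Hypothetical inclusion-criteria statements, invented for illustration.
criteria = [
    "kps at least 70",
    "prior radiation allowed",
    "kps at least 60 and prior chemotherapy allowed",
]
vocab, dtm = document_term_matrix(criteria)
```

Each row of `dtm` is the vector for one statement, and each column counts one unigram or bigram from `vocab`. A matrix of this shape is the usual input to a topic model such as Latent Dirichlet Allocation.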


Purpose: Text mining is growing in importance in proportion to the growth of unstructured data, and its applications are expanding from knowledge management to social media analysis. Mapping a candidate's skillset to the requirements of a job profile is crucial both for new recruitment and for internal task allocation in an organization. Automating candidate selection is essential to avoid the bias or subjectivity that may occur when sifting through thousands of resumes and other informative documents. The system takes skillsets in the form of documents to build the semantic space, then takes appraisals or resumes as input and suggests the persons appropriate for a task or job position, as well as employees needing additional training. The purpose of this study is to extend the term-document matrix and achieve refined clusters to produce an improved recommendation. The study also focuses on keeping cluster quality consistent as the data set grows, to solve scalability issues.

Design/methodology/approach: In this study, a synset-based document matrix construction method is proposed in which semantically similar terms are grouped to reduce the curse of dimensionality. An automated Task Recommendation System is proposed comprising synset-based feature extraction, iterative semantic clustering and mapping based on semantic similarity.

Findings: The first step in knowledge extraction from unstructured textual data is converting it into structured form, either as a term frequency–inverse document frequency (TF-IDF) matrix or as synset-based TF-IDF. Once in structured form, a range of mining algorithms from classification to clustering can be applied. The algorithm gives a better feature vector representation and improved cluster quality. The synset-based grouping and feature extraction for resume data optimizes the candidate selection process by reducing entropy and error and by improving precision and scalability.

Research limitations/implications: The productivity of any organization is enhanced by assigning tasks to employees with the right set of skills. Efficient recruitment and task allocation can not only improve productivity but also help satisfy employee aspirations and identify training requirements.

Practical implications: Industries can use the approach to support different processes related to human resource management, such as promotions, recruitment and training, and thus manage the talent pool.

Social implications: The task recommender system creates knowledge by following the steps of the knowledge management cycle, and this methodology can be adopted in other similar knowledge management applications.

Originality/value: The efficacy of the proposed approach and its enhancement is validated by experiments on a benchmark dataset of resumes. The results are compared with existing techniques and show refined clusters: absolute error is reduced by 30 per cent, precision is increased by 20 per cent and dimensions are lowered by 60 per cent relative to the existing technique. The proposed approach also addresses scalability by producing improved recommendations for 1,000 resumes with reduced entropy.
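The idea of grouping semantically similar terms before computing TF-IDF can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the `SYNSETS` mapping, the function names, and the three toy resumes are hypothetical, and a real system would likely take synonym groups from a lexical resource rather than a hand-written dictionary.

```python
import math
from collections import Counter

# Hypothetical synonym groups mapping each term to one representative concept.
SYNSETS = {
    "developer": "programmer",
    "coder": "programmer",
    "programmer": "programmer",
    "manage": "lead",
    "lead": "lead",
}


def to_concepts(text, use_synsets=True):
    """Tokenize on whitespace and optionally replace terms by their concept."""
    tokens = text.lower().split()
    return [SYNSETS.get(t, t) if use_synsets else t for t in tokens]


def tfidf(docs, use_synsets=True):
    """Return (vocabulary, rows) where rows hold TF-IDF weights per document."""
    bags = [Counter(to_concepts(d, use_synsets)) for d in docs]
    vocab = sorted(set().union(*bags))
    n_docs = len(docs)
    df = {t: sum(1 for b in bags if t in b) for t in vocab}
    rows = []
    for b in bags:
        total = sum(b.values())
        rows.append([(b[t] / total) * math.log(n_docs / df[t]) for t in vocab])
    return vocab, rows


# Toy resumes, invented for illustration.
resumes = ["python developer", "java coder", "java programmer lead"]
plain_vocab, plain_rows = tfidf(resumes, use_synsets=False)
syn_vocab, syn_rows = tfidf(resumes, use_synsets=True)
```

In this toy case the plain vocabulary has six terms and the synset-based one has four, because "developer", "coder" and "programmer" collapse into a single column. The reduction figures reported in the abstract come from the authors' own experiments and are not reproduced by this sketch.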

Articles published on Occurrence Matrix

From Feature Engineering and Topics Models to Enhanced Prediction Rates in Phishing Detection

Marathi Document: Similarity Measurement using Semantics-based Dimension Reduction Technique

A Goal Programming Model for BWM

Lemon Leaf Disease Detection and Classification using SVM and CNN

INNV-15. CLINICAL DATA THAT MATTERS: A DISTILLATION OF NEURO-ONCOLOGY CLINICAL TRIAL INCLUSION CRITERIA USING MACHINE LEARNING

Relative Spectral Difference Occurrence Matrix: A Metrological Spectral-Spatial Feature for Hyperspectral Texture Analysis

Fake news detection within online social media using supervised artificial intelligence algorithms

Handheld Device Based on Image Processing Technique for Detecting Multiple Diseases of Apple Leaves

EXTRACTION OF FETAL FEATURES FROM B MODE ULTRASONOGRAMS FOR EFFICIENT DIAGNOSIS OF DOWN SYNDROME IN FIRST AND SECOND TRIMESTER

Human Behavior Understanding in Big Multimedia Data Using CNN based Facial Expression Recognition

Using Text Mining and Data Mining Techniques for Applied Learning Assessment

An Automatic Diagnosis Scheme for Gliomas Using Gray Level Co-occurrence Matrix, Histogram of Oriented Gradient, Discrete Wavelet Transformation, Intensity-based Features and Hierarchical PSO-SVM

The analysis of source code plagiarism in basic programming course

Task recommender system using semantic clustering to identify the right personnel

Development of Sindhi text corpus

Perception d’espèces agroforestières et de leurs services écosystémiques par trois groupes ethniques du bassin versant de Boura, zone soudanienne du Burkina Faso

Detecting and mapping Gonipterus scutellatus induced vegetation defoliation using WorldView-2 pan-sharpened image texture combinations and an artificial neural network

Alternate Low-Rank Matrix Approximation in Latent Semantic Analysis

Applying Text-mining Techniques to Global Supply Chain Region Selection: Considering Regional Differences

DeepStar: Detecting Starring Characters in Movies
