Abstract

1. Introduction

Pain and pain chronification are incompletely understood and unresolved medical problems that continue to have a high prevalence.14 It has been accepted that pain is a complex phenomenon.2,32,72 Contemporary methods of computational science51 can use complex clinical and experimental data to better understand the complexity of pain. Among data science techniques, machine learning refers to a set of methods (Fig. 1) that can automatically detect patterns in data and then use the uncovered patterns to predict or classify future data, to observe structures such as subgroups in the data, or to extract information from the data suitable to derive new knowledge.11,43 Together with (bio)statistics, artificial intelligence and machine learning aim at learning from data.

Figure 1. Overview and classification of machine learning methods, selected for their use in a pain research context. The figure structures machine learning by its main uses, comprising (1) classification tasks used, for example, to obtain a clinical diagnosis, (2) data structure detection including the identification of clusters, and (3) knowledge discovery in experimental or clinical data or in large databases structured hierarchically such as ontologies. Short descriptions of key methods are provided in Box 1. The icons at the right of each main application field symbolize respective typical machine learning methods, that is, from top to bottom: (1) an SVM, where the grouping (classification) is obtained by placing a border (hyperplane) between classes (subsymbolic classifier), (2) a decision tree, where the classification is obtained through hierarchical rules (symbolic classifier), and (3) an emergent self-organizing map as an unsupervised machine learning method able to find interesting structures, such as clusters, in high-dimensional data. In this figure, the map was colored using a geographical analogy with brown (up to snow-covered) heights and green valleys, on which clusters can be separated (from Ref. 36). Finally, (4) a directed acyclic graph is drafted depicting the polyhierarchy of, for example, the functions of pain-relevant genes (from Ref. 69). CART, classification and regression tree; DAG, directed acyclic graph; DT, decision tree; ESOM, emergent self-organizing map; HMM, hidden Markov models; k-NN, k nearest neighbor; LVQ, learning vector quantization; MLP, multilayer perceptron; PCA, principal component analysis; SVM, support vector machine.

Although statistics can be regarded as a branch of mathematics, artificial intelligence and machine learning have developed from computer science (Ref. 58; see also https://en.wikipedia.org/wiki/Artificial_intelligence). The initial definition of artificial intelligence originates from Alan Turing, who proposed an experiment in which 2 players, who can be either human or artificial, try to convince a third, human player that they are human.68 The test of artificial intelligence is passed if the third player cannot tell which player is the machine. Important steps in the development of machine learning were the creation of the first computer learning program, which played checkers,54 and the first neural network, called the perceptron.53 Statistics uses mathematical equations to model probability relationships between data variables, whereas machine learning learns from data without the necessity of previous knowledge. Machine learning aims at optimizing the performance of an algorithm rather than at analyzing the probabilities of observations, given a known underlying data distribution. 
Nevertheless, machine learning and statistical techniques work in concert for pattern recognition, knowledge discovery, and data mining, and they partly share the same methods, such as regression, which is widely used in statistics but is also considered a classification method in machine learning (Fig. 1). In the present research context, when provided with pain-related data, machine-learned methods are able to learn a mapping of complex features to a known class, that is, to predict a pain phenotype class from a complex pattern of acquired parameters. After the machine has learned the prediction of a pain-related phenotype, the algorithm can subsequently be used on new data to identify the class membership of a novel, as yet unclassified subject. However, machine learning methods can also be used for pattern recognition in complex pain-related data to reveal traces of an underlying molecular background, or for knowledge discovery in big data in a drug discovery or repurposing context. The increasing use of contemporary methods of computational science is reflected in the rising number of reports using machine learning for pain research (Table 1). This review is focused on machine-learned technologies applied to general pain research that allow one to analyze and predict pain phenotypes and to obtain knowledge from experimental and clinical pain-related data.

Table 1. Reports of pain research, in order of year of publication, in which machine-learned methods were used.

2. Pain research involving machine learning

A literature search was conducted using PubMed at https://www.ncbi.nlm.nih.gov/pubmed on July 22, 2017 for “([machine-learn*] OR machine learn*) AND pain.” One hundred ten results published between 2002 and 2017, with an increasing number of publications over time, were obtained (Table 1), and a few more reports were obtained from reference tracking. After elimination of editorials, reviews, and repeated reports of the same machine-learned analysis, 88 original reports of the use of machine learning in a pain context were identified. Twenty-two articles that regarded pain only as a symptom of interest in another context, such as chest pain as a diagnostic criterion for pneumonia10 or coronary syndromes4 or phantom limb pain as an indicator of prosthesis functioning,1 were excluded, as were 14 reports about neuroimaging of pain, a topic that has been reviewed separately.26,39 This resulted in 52 reports that were analyzed for the use of several different methods of machine learning in pain research (Table 1). For a short description of the mentioned machine learning methods, please refer to Box 1.

Text box 1. Definitions and descriptions of key methods of machine learning most frequently used so far in the pain research context (Table 1 and Fig. 1). For a detailed description of these and further machine learning methods, see, for example, Refs. 11,43.

(1) Classification solves the problem of identifying to which category (diagnosis) a new case belongs, based on a training data set containing cases whose category is known.
(2) A Bayes classifier minimizes the probability of misclassification, given the prerequisites of the theorem of Bayes, that is, distributions and (conditional) probabilities.
(3) Decision tree methods output a tree-structured graph consisting of variables (features) in the decision nodes (points of split) and conditions in the edges.
(4) Random forests use a multitude of simple decision trees, usually based on a random selection of a small set of features.
(5) Projection methods represent the data space in a lower-dimensional space with the aim of conserving important structural properties.
(6) Focusing projection methods are learned using a function of the neighborhood of points in the data space.
(7) K nearest-neighbors (k-NN) methods use k (classified) prototypes to which new cases are assigned depending on their distances to all prototypes.
(8) Artificial neural networks (ANNs) are computer programs that operate a multitude of simple processing elements (neurons), which are connected to each other by (weighted) synapses.
(9) Multilayer perceptrons are ANNs in which the connections are structured in layers. The neurons are of the McCulloch-Pitts type, that is, nonlinear decisions using hyperplanes are used.
(10) Support vector machines are multilayer perceptrons with 1 layer using McCulloch-Pitts type neurons, where the input data are projected into a (possibly infinite-dimensional) vector space in which (1) scalar products are easily computable and (2) the decision surface can be more complex than simple hyperplanes.
(11) A self-organizing map (SOM) is an unsupervised learning ANN producing a 2-dimensional discretized representation of the data space through a focusing projection.
(12) Emergent SOMs are SOMs able to show emergent structures in the form of (U-, P-, and U*-) matrix representations, which display structural features of the data space using a geographical map metaphor.
(13) Knowledge in data science is a symbolic representation of taxonomic categorizations and decisions using an algorithmically treatable (ie, decidable or provable) part of natural human language, such as (a subset of) predicate logic, with the requirement to generalize to unseen data.
(14) Ontologies use data science knowledge in the form of a naming and definition of the terms and semantic interrelationships of the entities that really or fundamentally exist for a particular domain of discourse.
(15) Ontology directed acyclic graphs are graph-based representations of a polyhierarchy of terms of an ontology.

3. Pain phenotype prediction from complex case data

Machine learning addresses the so-called data space, including an input space X comprising vectors x_i = <x_i,1, …, x_i,d> with d > 0 different parameters (variables and features24) acquired from n > 0 cases. In supervised machine learning, algorithms enable a mapping of the input parameters x_i to the output classes y_i ∈ Y in the data space. The information consisting of several biomedical parameters is used to derive a mapping that allows assigning future cases to the right class (prediction and generalization; Ref. 11), for example, the pain phenotype group or a clinical diagnosis. The main types of classifiers provided by supervised machine learning are symbolic45 or subsymbolic63 classifiers. In symbolic classifiers, the decision how a classification is obtained can be interpreted by a domain expert as a combination of conditions on the features. For example, a symbolic45 classifier composed of a decision tree was created to predict patient-controlled analgesia consumption from approximately 30 acquired features including demographic (age, sex, and weight), biomedical (eg, blood pressure, diabetes, and arterial hypertension), surgery-related therapy (eg, type of surgery, duration, and details of anesthesia), and analgesic-related therapy (eg, consumption of analgesics before the surgery and dose demands during the first 24 hours after surgery) parameters.27 Importantly, for each of the parameters, the value range underlying the decision with respect to analgesic demands remained accessible (see Tables 1 and 11 in Ref. 27). 
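
To make the idea of a symbolic classifier concrete, the following is a minimal sketch of a one-level decision tree (a "stump") that learns a single threshold rule and prints it in readable form. It is not the method of Ref. 27; the feature names, case values, and labels are invented purely for illustration, and the function names `fit_stump`, `predict`, and `describe` are hypothetical.

```python
# Toy illustration of a symbolic classifier: a one-level decision tree ("stump").
# All feature names and values below are hypothetical, not data from Ref. 27.

FEATURES = ["age_years", "weight_kg", "surgery_duration_min"]
CASES = [
    [34, 70, 60],
    [61, 82, 75],
    [45, 64, 90],
    [52, 77, 150],
    [38, 90, 180],
    [66, 68, 200],
]
HIGH_DEMAND = [0, 0, 0, 1, 1, 1]  # 1 = high analgesic consumption (invented labels)


def fit_stump(cases, labels, names):
    """Try every feature/threshold pair and keep the split with the fewest errors."""
    best = None
    for j in range(len(names)):
        values = sorted({case[j] for case in cases})
        # Candidate thresholds are midpoints between neighbouring observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for label_above in (0, 1):
                predictions = [
                    label_above if case[j] > threshold else 1 - label_above
                    for case in cases
                ]
                errors = sum(p != y for p, y in zip(predictions, labels))
                if best is None or errors < best["errors"]:
                    best = {
                        "feature": j,
                        "name": names[j],
                        "threshold": threshold,
                        "label_above": label_above,
                        "errors": errors,
                    }
    return best


def predict(stump, case):
    """Apply the learned rule to a new, unclassified case."""
    above = case[stump["feature"]] > stump["threshold"]
    return stump["label_above"] if above else 1 - stump["label_above"]


def describe(stump):
    """Render the learned split as a rule a domain expert can read."""
    return (
        f"IF {stump['name']} > {stump['threshold']} "
        f"THEN class {stump['label_above']} ELSE class {1 - stump['label_above']}"
    )


stump = fit_stump(CASES, HIGH_DEMAND, FEATURES)
print(describe(stump))
print(predict(stump, [50, 75, 160]))
```

The point of the sketch is that the learned rule exposes both the feature and its value range, which is what makes this class of classifier interpretable; a full decision tree applies the same search recursively to each resulting subgroup.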
In decision trees, the features are also weighted according to their importance (most important first). Another example of a symbolic classifier is the creation of a Bayesian diagnostic tool from demographic-, pain-, and surgery-related parameters for the prediction of persistence of pain in a breast cancer surgery context. It provided a sensitivity and specificity of 33% and 95%, respectively.61 Again, the classification procedure was accessible to direct interpretation through the Bayesian decision limits calculated for the single parameters. In subsymbolic classifiers, a better performance of a machine-learned algorithm is sought by waiving the possibility of understanding the details, that is, it is impossible to obtain biomedical explanations for the functioning of the algorithm. For example, random forests use hundreds or thousands of simple decision trees that escape interpretation; the classification is obtained through the complete set of trees, that is, the “forest.”6 Such a classifier was created from various stool-based markers to diagnose a bladder pain syndrome.5 Similarly, a projection method for high-dimensional data, specifically minimum curvilinear embedding, was applied to complex proteomics data to separate patients with neuropathic pain from controls and to further separate different types of neuropathy, such as neuropathy associated with amyotrophic lateral sclerosis and peripheral neuropathy with or without pain.8 Machine-learned algorithms were further applied to predict thermal pain sensitivity from bioresponses acquired through electromyography, skin conductance level, and electrocardiography.23 Specifically, using support vector machines (SVMs56), individual pain threshold and tolerance to thermal stimulation could be predicted from the noninvasive measurements at accuracies of >91% and 79%, respectively.23 This aimed at obtaining information about pain in subjects with verbal and/or cognitive impairments, in whom queries of pain such as standard visual rating scales cannot be applied. Moreover, predicting which patients required high opioid doses for analgesia, based on a next-generation sequencing–derived opioid receptor genotype, was achieved with a subsymbolic classifier based on k-nearest neighbors calculations.34 Further subsymbolic classifiers have been implemented as neural networks. Elastic net regression models and SVMs predicted pain scores measured between 40 and 120 minutes after the administration of 10 mg oxycodone from interpolated pain score values before drug administration.46 The elastic net regression model provided pain scores that had a correlation coefficient of 0.6 with the observed scores.

4. Structure detection in complex pain-related data

Detecting structures in the d-dimensional data space that point at patterns or subgroups accessible to biomedical interpretation is a typical application of unsupervised machine learning. In contrast to the supervised learning setting, the class information Y is absent or ignored; the task is to find “interesting” data structures that can be interpreted as subgroups (clusters and strata) in the studied cases or made accessible for biomedical interpretation by domain experts, including the discovery of new knowledge in data-driven research approaches. 
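
As a rough illustration of structure detection without class labels, the sketch below groups cases with k-means clustering. Note that k-means is swapped in here only because it is compact; it is not the emergent self-organizing map approach used in the studies cited in this section. The data points are invented (they stand in for z-scored heat and cold pain thresholds), and the deterministic farthest-first initialization is an assumption made so that the example is reproducible.

```python
import math

# Hypothetical z-scored (heat, cold) pain thresholds for six subjects.
POINTS = [
    (-1.0, -0.8), (-0.9, -1.1), (-1.2, -1.0),
    (1.1, 0.9), (0.8, 1.2), (1.0, 1.0),
]


def kmeans(points, k, max_iterations=100):
    """Cluster points into k groups; no class labels are used at any step."""
    # Farthest-first initialisation: start from the first point, then repeatedly
    # add the point that is farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids


assignments, centroids = kmeans(POINTS, k=2)
print(assignments)
```

Whether the resulting groups mean anything biologically is not decided by the algorithm; as the text notes, that interpretation is left to domain experts.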
For example, in a data matrix comprising several quantitative sensory testing (QST) parameters acquired from healthy subjects, a pattern was detected allowing one to identify a subgroup of healthy subjects who reacted to hypersensitization with topical capsaicin with a shift in QST parameters that resembled the parameter pattern observed in patients with neuropathic pain.35 Similarly, in a set of pain phenotype data comprising responses to experimental heat, cold, mechanical, and electrical pain stimuli applied in 125 healthy subjects, structures were detected using unsupervised machine learning implemented as emergent self-organizing maps.71 These data structures could be associated with a complex genotype composed of 30 reportedly pain-relevant variants in 10 genes, which was able to correctly identify 80% of the subjects as belonging to an extreme pain phenotype in an independent and prospectively assessed cohort of 89 other subjects.38

5. Knowledge discovery and exploration of pain-related data

Machine learning methods can be used to explore data sets by reversing the analytical focus of classifier building and pattern detection. Supervised machine learning methods qualify for data exploration under the assumption that if a biomedical parameter qualifies for inclusion in a classifier, then it is probably important for the addressed pain-related problem. In contrast to classic statistical methods, where knowledge or at least presumptions about the distributions and/or functional dependencies of the data are necessary, machine learning methods allow for data-driven research approaches. Hence, techniques of feature selection, which are common in machine learning, enable one to identify relevant modulators of pain-related outcomes in data-driven and hypothesis-free explorative research approaches. 
For example, a machine-learned analysis identified, among hundreds of biomedical parameters, demographic-, psychological-, and pain-related parameters as the most relevant for explaining the persistence of pain in women who underwent breast cancer surgery.61 Moreover, unsupervised machine learning methods can be used to assess, at a whole-study level, whether the acquired biomedical parameters demonstrate the efficacy of a treatment applied during a research project. The rationale is to detect data structures that are congruent with a known preclassification, such as the presence of a modulator of the pain phenotype. For example, after treatment of 82 subjects with local UV-B irradiation or capsaicin application and assessment of the pain phenotype using 10 different QST parameters, a 246 × 10-sized data matrix was obtained in a human experimental pain study.36 Using unsupervised machine learning implemented as emergent self-organizing maps,71 data structures were detected that coincided with the known applied treatments, indicating that a modulation of the complex pain phenotype had been obtained.36 A machine learning algorithm consisting of a classification and regression tree analysis was applied to 8034 independent observations of baseline thermal nociceptive sensitivity in mice.9 The analysis identified the mouse genotype as predictive of the pain phenotype; however, it also revealed that the experimenter performing the test and additional laboratory factors, including season/humidity, cage density, time of day, sex, and within-cage order of testing, modulated the pain phenotypes.9 Finally, natural language processing methods,73 which combine linguistics with computer science to analyze human language in speech or written text, were used to extract signs from clinical notes, such as the occurrence in a document of terms, for example, keywords that hint at a clinical incident.48 Prediction accuracy of this method for the patient's pain level was reported to be better than 99%.

6. Limitations of machine learning in pain research

Machine learning is vulnerable to overfitting and may end up describing noise or irrelevant relationships rather than the true relationship between features and classes. In that case, only the actual data on which the mapping has been learned are successfully classified, but the algorithm fails to classify new data. This can be addressed by building the classifier on a training data set and testing its performance on a test data set obtained in a separate experiment or through splitting the available data, and/or by cross-validation using data subsets randomly resampled from the original data set. Furthermore, machine learning may be fooled by data sets containing dominant but irrelevant features. A classic example is the training of a neural network to recognize camouflaged tanks hidden in trees.13 The network was apparently successfully trained with a set of photographs of tanks in trees and of trees without tanks. However, in a new set of photographs of trees with or without tanks hidden among them, the neural network failed. It turned out that in the training set, photographs of camouflaged tanks had been taken on cloudy days, whereas photographs of trees without tanks had been taken on sunny days. The neural network had learned to recognize the weather rather than to distinguish tanks among trees. In the new set of photographs, forests with and without tanks had been photographed in the same weather; hence, a neural network merely able to distinguish the weather was unable to identify tanks. Furthermore, applications of machine learning in pain research may be limited by the availability and quality of data; they depend on the maintenance of knowledge bases or on the success of enrolling the necessary large number of subjects in clinical studies. 
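
The safeguard against overfitting described earlier in this section, cross-validation on randomly resampled subsets, can be sketched as follows. This is only an illustrative outline under stated assumptions: the data are invented and deliberately well separated, a 1-nearest-neighbor rule stands in for whatever classifier is being evaluated, and the helper names `k_fold_indices` and `cross_validated_accuracy` are hypothetical.

```python
import math
import random

# Invented, well-separated two-class data (5 cases per class).
XS = [
    (0.0, 0.1), (0.2, -0.1), (-0.1, 0.3), (0.4, 0.2), (0.1, -0.3),
    (5.0, 5.1), (5.2, 4.9), (4.8, 5.3), (5.4, 5.2), (5.1, 4.7),
]
YS = [0] * 5 + [1] * 5


def k_fold_indices(n, k, seed=0):
    """Shuffle the case indices and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_neighbor_predict(train_x, train_y, x):
    """Stand-in classifier: return the label of the closest training case."""
    closest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[closest]


def cross_validated_accuracy(xs, ys, k=5, seed=0):
    """Each fold is held out once; the classifier never sees the cases it is tested on."""
    correct = 0
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [xs[i] for i in range(len(xs)) if i not in held]
        train_y = [ys[i] for i in range(len(ys)) if i not in held]
        correct += sum(
            nearest_neighbor_predict(train_x, train_y, xs[i]) == ys[i]
            for i in held_out
        )
    return correct / len(xs)


folds = k_fold_indices(len(XS), k=5)
accuracy = cross_validated_accuracy(XS, YS, k=5)
print(folds)
print(accuracy)
```

A large gap between accuracy on the training cases and the cross-validated accuracy is the usual warning sign of overfitting. Cross-validation does not, however, protect against a confound that is present in all of the data, as in the tank example.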
Enrolling large numbers of subjects has become easier, thanks to the funding of concerted large-scale pain research projects.33 However, even the analysis of apparently large data sets can quickly be confronted with small sample problems when data structure detection results in many subgroups of small sizes. Then, the rather typical setting of many more features than cases poses challenges to valid data analysis. Possibly, generative machine learning methods17 are able to reduce this problem. Such models range from Gaussian mixture models, as a simple form of a generative model, up to more complex approaches such as generative adversarial networks,20 generative restricted Boltzmann machines,62 or generative emergent self-organizing neuronal networks.70

7. Conclusions

The emerging discipline of computational pain research provides contemporary tools to understand pain. This discipline uses computer-based processing of complex pain-related data and relies on “intelligent” learning algorithms. Extracting information from complex pain-related data, and generating knowledge from this information, will thereby be facilitated. Therefore, machine learning has the ability to influence the study and treatment of pain profoundly. Indeed, the application of machine learning to pain research–related nonimaging problems has been mentioned in publications in scientific journals since 2002 (Table 1). Among machine learning methods,11,43 a subset has so far been applied to pain research–related problems (Fig. 1), with SVMs, regression models, and several kinds of neural networks being the most frequently mentioned in the pain literature so far. Machine learning receives increasing general interest and appears to penetrate many parts of daily life and natural sciences. This tendency is likely to extend to pain research. 
The present review aims to acquaint pain domain experts with the methods and current applications of machine learning in pain research, possibly facilitating the awareness of the methods in current and future projects.

Conflict of interest statement

This work has been funded by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 602919 (“GLORIA”, J.L.) and by the Landesoffensive zur Entwicklung wissenschaftlich-ökonomischer Exzellenz (LOEWE), LOEWE-Zentrum für Translationale Medizin und Pharmakologie (J.L.). The authors have declared that no further conflicts of interest exist.
