Conceived and designed the experiments: JEB NC SM. Performed the experiments: JEB NC. Analyzed the data: JEB NC JY SM. Wrote the paper: JEB NC JY SM.
The authors have declared that no competing interests exist.
Pain often exists in the absence of observable injury; therefore, the gold standard for pain assessment has long been self-report. Because the inability to verbally communicate can prevent effective pain management, research efforts have focused on the development of a tool that accurately assesses pain without depending on self-report. Those previous efforts have not proven successful at substituting self-report with a clinically valid, physiology-based measure of pain. Recent neuroimaging data suggest that functional magnetic resonance imaging (fMRI) and support vector machine (SVM) learning can be jointly used to accurately assess cognitive states. Therefore, we hypothesized that an SVM trained on fMRI data can assess pain in the absence of self-report. In fMRI experiments, 24 individuals were presented painful and nonpainful thermal stimuli. Using eight individuals, we trained a linear SVM to distinguish these stimuli using whole-brain patterns of activity. We assessed the performance of this trained SVM model by testing it on 16 individuals whose data were not used for training. The whole-brain SVM was 81% accurate at distinguishing painful from non-painful stimuli (
Pain is commonly accepted to be a subjective experience
Researchers have long sought to develop a physiology-based pain assessment that does not depend on patient volitional behaviors
Recent advances in neuroimaging have provided possibilities for pain assessment that have not traditionally been available to researchers
Marquand and colleagues (2010) were the first to apply fMRI and machine learning algorithms to the problem of pain measurement
An important extension of the work of Marquand et al. would be to demonstrate that physiology-based pain assessment, using fMRI data and machine learning algorithms, can classify pain accurately without relying on self-report data from the individual tested. If, for example, an SVM model could be trained on one set of individuals, and used to accurately classify pain in different individuals, then its performance would not depend on the test subjects' self-report.
In this study, we attempted to develop an SVM model that accurately determines the presence or absence of pain, even when tested on individuals whose self-reported data were not included in the model's training. Towards this aim, we investigated the task of distinguishing non-painful heat stimuli from painful heat stimuli. The major goal of the study was to determine whether blood-oxygen-level dependent (BOLD) signal change is sufficiently consistent between individuals to potentially train a physiology-based pain classifier that performs accurately when trained on one group of subjects and tested on another. An SVM model was trained on a group of eight individuals, and used to classify pain in a separate group of eight individuals. When tested on this separate group of eight individuals, the SVM was significantly more accurate than chance. In a second study, the same SVM model was further validated through test-retest reliability in an additional group of eight individuals. When tested on this additional group of eight individuals, the SVM was again significantly more accurate than chance.
Nineteen participants were recruited via advertisements posted on and around the Stanford University campus. All participants were healthy and none reported having a chronic pain condition. Procedures were approved by the Stanford University School of Medicine Institutional Review Board, and all participants provided written informed consent. Due to technical difficulties with the temperature thresholding or scan procedures, complete data were not collected for three participants; therefore, they were excluded from all analyses. The remaining 16 participants were an average age of 22.7 years (SD = 3.6), with 10 men and 6 women.
Before starting the fMRI scanning session, participants were thresholded with a thermal stimulator in order to determine individual temperatures for painful heat. Thermal stimuli were delivered to the left volar forearm via a 3×3 cm Peltier-type thermode (Medoc, North Carolina). A range of temperatures was presented, each for 30 seconds. Following each temperature presentation, the participant provided a self-report of pain on a 0–10 numerical rating scale with the following anchors: 0 (no pain), 3 (minor pain), 5 (moderate pain), 7 (intense pain that you can bear without moving), and 10 (unbearable pain). The thresholding procedures used here have been previously described in greater detail
During the fMRI sessions, heat stimuli were again presented to the left volar forearm in a block design, with 40 seconds of baseline temperature (at 26°C), followed by 30 seconds of heat stimulation. All participants completed four functional runs. In two runs, participants received hot but non-painful heat stimulation (38°C). In the other two runs, participants received painful heat stimulation (individually calibrated to elicit a pain score of seven). Each participant received a total of 14 nonpainful stimuli and 14 painful stimuli. Following each functional run, participants reported whether the stimuli presented were painful or non-painful.
fMRI data were collected on a 3.0 Tesla, whole-body scanner (GE Healthcare Discovery 750), using an 8-channel receive-only phased-array head coil. A T1-weighted fast spoiled gradient-recall scan was acquired for anatomical reference (TE = 2.0 ms, 156 slices at 1.3 mm thickness). High-order shimming
SVM model pre-processing was conducted in MATLAB (Mathworks) using SPM5 and custom software. A whole-brain pattern of the activity induced by each heat stimulus was computed as a map of percent BOLD signal change. For each heat stimulus, the average percent BOLD change was calculated with the following formula: ((average stimulus signal – average baseline signal)/average baseline signal). The baseline signal consisted of the 20 seconds before each heat stimulus. The stimulus signal consisted of the final 24 seconds of each heat block, excluding the initial 6 seconds to allow for the BOLD signal to reach its peak intensity. Each of the maps of percent BOLD signal change constitutes an
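The percent-change calculation described above can be sketched in Python for a single voxel. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function name, the repetition time `tr` of 2 seconds, and the example signal are all hypothetical, and the factor of 100 simply expresses the stated ratio as a percentage.

```python
def percent_bold_change(signal, onset, tr=2.0):
    """Percent BOLD change for one voxel and one heat block.

    signal: list of BOLD samples for one voxel, one per volume
    onset:  index of the first volume of the 30-second heat block
    tr:     seconds per volume (assumed value; not given in the text)
    """
    n_base = round(20 / tr)  # baseline: the 20 s before the stimulus
    n_skip = round(6 / tr)   # first 6 s of the heat block are excluded
    n_stim = round(24 / tr)  # stimulus: the final 24 s of the heat block

    if onset < n_base or onset + n_skip + n_stim > len(signal):
        raise ValueError("signal is too short for the requested windows")

    baseline = signal[onset - n_base:onset]
    stimulus = signal[onset + n_skip:onset + n_skip + n_stim]

    base_mean = sum(baseline) / len(baseline)
    stim_mean = sum(stimulus) / len(stimulus)

    # (stimulus - baseline) / baseline, scaled to percent
    return 100.0 * (stim_mean - base_mean) / base_mean


# Hypothetical voxel: 40 s of baseline at 1000, then 30 s of heat at 1005.
example_signal = [1000.0] * 20 + [1005.0] * 15
example_change = percent_bold_change(example_signal, onset=20)
print(example_change)
```

Applying such a function to every voxel would give one map per heat stimulus, which is the quantity the text treats as the classifier input.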
Feature reduction (to avoid over-fitting of the model) was achieved by applying a gray matter mask to exclude areas that did not contain neuron cell bodies. Typically, the magnitude of pain-induced BOLD signal change is less than 1%
Eight of the sixteen participants were randomly assigned to the model training group. Randomization was performed using a computer-based list randomizer. As described previously
The trained SVM model was then used to classify pain in the eight individuals who were randomly assigned to the testing group. For each heat stimulus presented to participants in the testing group, the SVM model assigned a classification of painful or non-painful. The SVM also calculated a measure of confidence in the accuracy of each assignment. This measure of confidence was derived from the distance of each example (each map of percent BOLD change) from the separating hyperplane. The percent of accurate classifications was calculated for each participant in the testing group, as well as positive predictive value (PPV) and negative predictive value (NPV). PPV is the percent of stimuli predicted to be painful which were actually painful, while NPV is the percent of stimuli predicted to be non-painful which were actually non-painful.
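The quantities in this paragraph can be illustrated with a short Python sketch. It assumes a linear decision function of the form w·x + b, with positive values taken to mean "painful"; that sign convention, the function names, and the example counts are assumptions made for illustration rather than details from the study's software.

```python
import math


def signed_distance(weights, bias, example):
    """Signed distance of one example from the hyperplane w.x + b = 0."""
    score = sum(w * x for w, x in zip(weights, example)) + bias
    norm = math.sqrt(sum(w * w for w in weights))
    return score / norm


def summarize(predicted, actual):
    """Accuracy, PPV and NPV in percent; True means 'painful'.

    PPV or NPV is None when no stimuli were predicted in that class.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fn = sum(1 for p, a in pairs if not p and a)

    accuracy = 100.0 * (tp + tn) / len(pairs)
    ppv = 100.0 * tp / (tp + fp) if (tp + fp) else None
    npv = 100.0 * tn / (tn + fn) if (tn + fn) else None
    return accuracy, ppv, npv


# Hypothetical outcome for 14 painful and 14 non-painful stimuli:
# 7 painful stimuli detected, 7 missed, and no false alarms.
actual = [True] * 14 + [False] * 14
predicted = [True] * 7 + [False] * 7 + [False] * 14
accuracy, ppv, npv = summarize(predicted, actual)
print(accuracy, ppv, npv)
```

With these hypothetical counts the accuracy is 75%, the PPV is 100%, and the NPV is about 66.7%, which shows how a classifier that never raises a false alarm can still have a modest NPV when it misses painful stimuli.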
To identify which brain regions significantly influenced the SVM classifier's accuracy, we conducted a permutation test as previously described
To test whether any regions might independently distinguish painful and non-painful stimuli, we conducted individual SVM classifications using small regions of interest (ROIs). Based on a meta-analysis of 68 studies, Apkarian et al. (2005) have proposed that there is a brain network for acute pain which is composed of 6 brain regions: the primary and secondary somatosensory cortices, the insular cortex, the anterior cingulate cortex, the prefrontal cortex, and the thalamus
In order to provide an additional validation of the SVM model by test-retest reliability, an additional nine participants were recruited and assigned to an independent retesting group. As with the previous groups, participants were recruited from Stanford and from the surrounding area. Due to technical difficulties with the temperature thresholding procedures, complete data were not collected for one participant; therefore, this participant's data were excluded from all analyses. The remaining eight participants were an average age of 25.9 years (SD = 3.3), with 5 men and 3 women. Participants in Study 2 followed the same procedures as those participants in Study 1 (with the exception that no randomization was conducted because all participants were assigned to a single retest group). The average temperature for the painful stimulation was 46.0°C (SD = 1.0). The SVM model trained in Study 1 was used to classify painful and non-painful stimuli in the participants recruited to Study 2.
As a validity check on the effectiveness of the chosen temperatures to elicit painful and non-painful sensations, we first examined self-reported pain. All included participants reported that the experimental temperature that was thresholded to a 7 out of 10 pain score elicited pain, and that the 38°C temperature did not.
The SVM model, which was trained on data from participants in the training group, performed significantly better than chance when distinguishing painful from non-painful stimuli in participants from the independent testing group (
Participant | Accuracy (%) | Positive PV (%) | Negative PV (%) |
1 | 75.0 | 100 | 66.7 |
2 | 85.7 | 91.7 | 81.2 |
3 | 82.1 | 80.0 | 84.6 |
4 | 100.0 | 100.0 | 100 |
5 | 71.4 | 71.4 | 71.4 |
6 | 96.4 | 93.3 | 100 |
7 | 85.7 | 85.7 | 85.7 |
8 | 96.4 | 100 | 93.3 |
Average | 86.6±10.4* | 90.3±10.5* | 85.4±12.3* |
For each participant in the testing group, the SVM was used to distinguish the painful stimuli from the nonpainful stimuli. For each participant in the testing group, and for their group average, this table displays the SVM's overall accuracy, and positive and negative predictive value (PV). Error is reported as 1 standard deviation. An asterisk indicates performance measures that are significantly greater than chance (
We next examined whether classification accuracy could be improved by incorporating a confidence threshold, measured as distance from the separating hyperplane. BOLD maps not meeting the confidence threshold were excluded on the basis of insufficient evidence to make a confident classification. Overall accuracy of the SVM classifier increased monotonically with the number of stimuli excluded (
The classifier's performance was assessed at increasing distance thresholds. As the distance threshold increased, an increasing number of stimuli were excluded on the grounds that stimuli nearest the separating hyperplane were most likely misclassified. In this figure, performance is plotted as a function of the percentage of stimuli that have been excluded from classification. Dotted lines display the performance computed at each distance threshold. Solid lines display a third degree polynomial fit to those data.
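The thresholding procedure can be sketched as follows. This is an illustrative Python example, assuming each stimulus has a signed distance from the hyperplane with positive values meaning "painful"; the function name and the six example distances are hypothetical and are not data from the study.

```python
def accuracy_with_threshold(distances, actual, threshold):
    """Accuracy after excluding stimuli closer than `threshold` to the hyperplane.

    distances: signed distances from the separating hyperplane
    actual:    True where the stimulus was painful
    Returns (accuracy in percent or None, percent of stimuli excluded).
    """
    kept = [(d, a) for d, a in zip(distances, actual) if abs(d) >= threshold]
    excluded = 100.0 * (len(distances) - len(kept)) / len(distances)
    if not kept:
        return None, excluded  # nothing left to classify

    # A positive distance is read as a 'painful' prediction (assumed convention).
    correct = sum(1 for d, a in kept if (d > 0) == a)
    return 100.0 * correct / len(kept), excluded


# Hypothetical distances: the two errors lie closest to the hyperplane.
distances = [1.5, 0.2, -0.1, -1.2, 0.9, -0.3]
actual = [True, False, True, False, True, False]

for threshold in (0.0, 0.25):
    print(threshold, accuracy_with_threshold(distances, actual, threshold))
```

In this toy example, raising the threshold from 0 to 0.25 removes a third of the stimuli and raises accuracy from about 66.7% to 100%. Accuracy need not rise with every increase in the threshold in general; the trade-off depends on whether errors cluster near the hyperplane.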
Next, a permutation test was used to determine which brain regions were most involved in driving the whole-brain SVM classifier's performance. The classification of a stimulus as painful was significantly influenced (
A permutation test was run to determine which brain regions significantly affected the whole-brain SVM classification. This figure illustrates brain regions that fall within the 90th percentile of the null distribution that was determined by permutation. Regions in the 99th percentile (
Because using information from a single ROI to assess pain would be simpler than employing a whole-brain SVM, we next tested whether BOLD signal from individual ROIs could distinguish painful and non-painful stimuli as accurately as the whole-brain SVM model. Activity in the secondary somatosensory cortex classified painful stimuli significantly better than chance (
Region | Accuracy (%) | Positive PV (%) | Negative PV (%) |
ACC | 56.7±12.1 | 57.7±17.5 | 57.1±11.1 |
Insula | 64.3±10.1* | 66.2±11.8* | 63.3±9.1* |
PFC | 50.0±12.2 | 46.2±16.9 | 53.2±12.7 |
S1 | 54.0±15.7 | 46.2±30.7 | 56.3±16.2 |
S2 | 71.9±12.4* | 75.2±13.9* | 71.2±13.1* |
Thalamus | 59.8±11.9 | 58.0±14.7 | 61.8±12.1* |
Using the activity from six regions of interest (ROIs), SVMs were used to distinguish the painful stimuli from the nonpainful stimuli. For each ROI, this table displays the SVM's average accuracy, and positive and negative predictive value (PV) when tested on participants in the testing group (N = 8). Error is reported as 1 standard deviation. An asterisk indicates performance measures that are significantly greater than chance (
An additional group of eight participants was investigated to determine test-retest reliability of the SVM model in classifying painful stimuli. As seen in
Participant | Accuracy (%) | Positive PV (%) | Negative PV (%) |
1 | 64.3 | 66.7 | 62.5 |
2 | 85.7 | 77.8 | 100.0 |
3 | 64.3 | 70.0 | 61.1 |
4 | 71.4 | 100.0 | 63.6 |
5 | 67.9 | 66.7 | 69.2 |
6 | 92.9 | 87.5 | 100.0 |
7 | 60.7 | 100.0 | 56.0 |
8 | 89.3 | 100.0 | 82.4 |
Average | 74.6±12.7* | 83.6±15.2* | 74.4±17.6* |
For each participant in the retesting group, the SVM from Study 1 was used to distinguish the painful stimuli from the nonpainful stimuli. For each participant in the retesting group, and for their group average, this table displays the SVM's overall accuracy, and positive and negative predictive value (PV). Error is reported as 1 standard deviation. An asterisk indicates performance measures that are significantly greater than chance (
In this study, we establish the feasibility of physiology-based pain detection using BOLD fMRI data and supervised machine learning algorithms. An SVM model, trained on 8 individuals, was 80.6% accurate at distinguishing painful from non-painful stimuli when tested on 16 individuals whose self-report data were not used in training.
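The pooled accuracy can be checked directly from the per-participant accuracies listed in the two results tables. The Python sketch below does that arithmetic; the variable names are illustrative, and the use of the sample standard deviation (n − 1 denominator) is an assumption that appears to match the reported values.

```python
from statistics import mean, stdev

# Per-participant accuracy (%) copied from the testing and retesting tables.
study1_accuracy = [75.0, 85.7, 82.1, 100.0, 71.4, 96.4, 85.7, 96.4]
study2_accuracy = [64.3, 85.7, 64.3, 71.4, 67.9, 92.9, 60.7, 89.3]


def summarize_group(values):
    """Mean and sample standard deviation of a list of accuracies."""
    return mean(values), stdev(values)


study1_mean, study1_sd = summarize_group(study1_accuracy)
study2_mean, study2_sd = summarize_group(study2_accuracy)

# Both groups have eight participants, so pooling all 16 values is the
# same as averaging the two group means.
pooled_mean = mean(study1_accuracy + study2_accuracy)

print(f"Study 1: {study1_mean:.2f} +/- {study1_sd:.2f}")
print(f"Study 2: {study2_mean:.2f} +/- {study2_sd:.2f}")
print(f"Pooled:  {pooled_mean:.3f}")
```

To within rounding, these values agree with the reported group averages of 86.6±10.4 and 74.6±12.7 and with the pooled figure of 80.6%.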
BOLD activity in five brain regions was principally responsible for the SVM classifier's performance at distinguishing painful from non-painful stimulation. Increased activity in the primary somatosensory cortex, secondary somatosensory cortex, insular cortex, and primary motor cortex was predictive of painful stimulation. Increased activity in other areas of the primary motor cortex and in the pregenual anterior cingulate cortex was predictive of nonpainful stimulation. These five areas are consistent with prior literature that identifies critical pain processing regions of the human brain
Because the SVM model was powered by a relatively small set of brain regions, we were interested to know if activity in any one brain region could classify pain as well as the whole-brain approach. When tested, we found that an SVM, using recordings of activity in the secondary somatosensory cortex, performed significantly better than chance at classifying pain and better than any of the other regions tested. Our findings are consistent with the secondary somatosensory cortex being the region which is most often reported to activate during painful stimulation
We further found that the accuracy of the SVM classifier could be enhanced by employing distance from the separating hyperplane as a measure of the classifier's confidence. Greater distance from the separating hyperplane was indicative of greater confidence in the SVM classification. By taking this information into account, each classification was associated with a probability of its accuracy, and the SVM classifier's overall accuracy was increased, at the cost of excluding some stimuli on the basis of ambiguity.
While this study was designed to probe the use of physiology-based pain detection, the results also more broadly suggest that the brain's neural representation of pain is robust and replicable across individuals. We found that the SVM classifier performed more accurately than chance when applied to study participants in both the test group and the retest group. This finding shows that across individuals, pain-induced BOLD signal changes are considerably similar with regard to both spatial location and absolute magnitude, measured in units of percent BOLD signal change. Therefore, while there may be considerable individual differences in the experience of pain and in patterns of brain activity induced by pain, there is nonetheless a core set of pain-induced responses in the brain that may be universal, at least when considering discrete thermal pain stimuli.
We are still very far from a physiology-based pain assessment tool that could be used in clinical, forensic, and other applied settings. However, we see the goal of an accurate, valid surrogate for self-reported pain as both attainable and worthy of effort. There are several areas where the method reported here for detecting pain can be improved. We outline five specific tasks below.
First, supervised machine learning algorithms should be used in conjunction with fMRI to extend the approach reported here by investigating pain intensity, and by distinguishing brain activity related to stimulus intensity from brain activity related to pain intensity. The potential of using fMRI and machine learning algorithms to measure pain intensity has been demonstrated using a within-person analysis
Second, using fMRI and machine learning algorithms, future experiments should develop physiology-based pain assessments that perform accurately across sensory modalities. While recent work has demonstrated that a major component of the brain regions activated by pain are also activated by non-painful somatosensory stimuli
Third, supervised machine learning algorithms should be developed that can distinguish pain from affective conditions that induce patterns of brain activity that are similar to those induced by pain. While previous research suggests that many of the brain regions that were most involved in driving the SVM's performance are associated with the sensory dimensions of pain such as pain intensity and localization
Fourth, SVM accuracy at classifying pain should be increased by incorporating various physiological and trait-based measurements. Sources of physiologic information such as skin conductance
Fifth, future experiments should develop fMRI-based machine learning algorithms that can measure chronic pain. We have shown here that it is feasible to classify transient pain experiences by comparing the period of stimulation to a preceding pain-free rest period. While this is a major development, the method does not easily translate to chronic pain assessment because in patients with chronic pain, it is difficult to obtain a pain-free rest condition. More complex measurements of brain activity, for example, temporal covariance of the activity between regions, have been shown to correlate with pain perception
There are many machine learning algorithms and thus many alternatives to SVM classification when using multi-voxel pattern analysis and fMRI data to assess pain. As we have done here, other groups have used linear classifiers, such as SVMs and Fisher's linear discriminant, to distinguish two or more cognitive states using patterns of brain activity
In conclusion, without relying on self-report from tested subjects, we demonstrate that in a controlled experimental setting, whole-brain patterns of brain activity can be used to assess whether a heat stimulus is painful. The results suggest that to advance the development of a physiology-based pain measure, neuroimaging methods can benefit from incorporating machine learning techniques, and from deeper investigation of the complex interplay of brain regions in mediating the experience of pain.
We thank Hanna Michelsen and Hoameng Ung for technical assistance and Catherine Chang, Ph.D., for discussion. We also thank Ian Carroll, M.D., M.S., Catherine Chang, Ph.D., Andrea Crowell, M.D., Gary Glover, Ph.D., Fumiko Hoeft, M.D., Ph.D., Jiang-Ti Kong, M.D., Honglak Lee, Ph.D., Rebecca McCue, Patricia Rohrs, and Andrew Saxe, for reading the manuscript.