The authors have declared that no competing interests exist.
Conceived and designed the experiments: SOM HR AV IO ER JM IJ PL. Performed the experiments: SOM IO PL. Analyzed the data: SOM AV IO MJS CA PL. Wrote the paper: SOM AV IO MJS CA PL.
The clinical investigation of human brain tumors often starts with a non-invasive imaging study, providing information about the tumor extent and location, but little insight into the biochemistry of the analyzed tissue. Magnetic Resonance Spectroscopy can complement imaging by supplying a metabolic
Non-negative matrix factorization techniques have recently shown their potential for the identification of meaningful sources from brain tissue spectroscopy data. In this study, we use a convex variant of these methods that is capable of handling negatively-valued data and generating sources that can be interpreted as tumor class prototypes. A novel approach to convex non-negative matrix factorization is proposed, in which prior knowledge about class information is utilized in model optimization. Class-specific information is integrated into this semi-supervised process by setting the metric of a latent variable space where the matrix factorization is carried out. The reported experimental study comprises 196 cases from different tumor types drawn from two international, multi-center databases. The results indicate that the proposed approach outperforms a purely unsupervised process by achieving near perfect correlation of the extracted sources with the mean spectra of the tumor types. It also improves tissue type classification.
We show that source extraction by unsupervised matrix factorization benefits from the integration of the available class information, thereby operating in a semi-supervised manner, for discriminative source identification and brain tumor labeling from single-voxel spectroscopy data. We are confident that the proposed methodology has wider applicability for biomedical signal processing.
Brain tumors have a relatively low incidence amongst humans as compared to other more widespread cancer pathologies. The clinical investigation of an abnormal mass in the brain frequently starts with its non-invasive characterization, typically with a Magnetic Resonance Imaging (MRI) study. This is widely used for determining the tumor extent for surgical and radiotherapy planning and for the post-therapy monitoring of tumor recurrence or progression to higher grade. MRI can provide an initial diagnosis of an intracranial mass lesion with variable sensitivity and specificity depending on tumor type
MRS has already been used in computer-based systems for diagnostic decision support
The MRS data analyzed in the current work are
Previous research has attempted to separate the MRS constituent source signals by applying Independent Component Analysis (ICA)
In
In a subsequent study
The proposed methodology to guide the separation of the constituent source signals with the use of prior knowledge involves a three-stage approach.
First, a reliable estimation of a probabilistic classifier, from which the probability density function (i.e. probability of class membership) generates a Fisher Information (FI) metric
The second step is to map the original data onto a Euclidean projective space so that NMF techniques can be applied. This is done with Multidimensional Scaling methods, by which the spectral points are projected onto a new coordinate space in such a way that the pairwise distances are accurately replicated, so that the new data distribution has the same distance structure as the original spectral data when measured with the FI metric. Typical methods are Sammon mapping
The final step is the application of Convex-NMF for source identification. This implementation is standard but applies to the data in the Euclidean projective space, whose structure captures class discrimination as defined by the probabilistic classifier. Therefore, unlabeled data can also be projected onto the projective space, positioning themselves in the neighborhood of spectra with similar properties with respect to the probabilistic classification. As this methodology benefits from both supervised and unsupervised modeling stages, we term it semi-supervised.
The remainder of the paper is organized as follows:
The use of the multicenter data in this study is covered by the original ethical approval obtained by the IRB in each center participating in data collection. In particular, every patient or an authorized relative signed an informed consent form specifically allowing use of his or her data for future scientific research, not just for the original study
The data analyzed in this study are single-voxel proton MR spectra (SV 1H-MRS) acquired
This signal acquisition parameter, the echo time (TE), is used to alter the relative contrast of spectral peaks according to their decay times, thus resulting in spectra with different acuity for the detection of specific metabolic peaks. In particular, STE is more sensitive to metabolite signals with short T2 (an MR relaxation time parameter) values, for example, signals from mobile lipids, in addition to which peaks are mostly positive in
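The effect of the echo time on peak contrast can be illustrated with the standard mono-exponential T2 decay model, S(TE) = S0 exp(-TE/T2). The short sketch below is only an illustration: the echo times and T2 values are hypothetical numbers chosen for the example, not the acquisition settings or relaxation times of the data used in this study.

```python
import math

def relative_signal(te_ms: float, t2_ms: float) -> float:
    """Fraction of the transverse signal left at echo time te_ms,
    assuming mono-exponential T2 decay: exp(-TE / T2)."""
    return math.exp(-te_ms / t2_ms)

# Hypothetical echo times and T2 values, for illustration only.
SHORT_TE_MS, LONG_TE_MS = 30.0, 136.0
for name, t2_ms in [("short-T2 component", 50.0), ("long-T2 component", 300.0)]:
    s_short = relative_signal(SHORT_TE_MS, t2_ms)
    s_long = relative_signal(LONG_TE_MS, t2_ms)
    # A short-T2 signal keeps far less of its amplitude at the long echo time.
    print(f"{name}: {s_short:.2f} at short TE, {s_long:.2f} at long TE")
```

Under this model, a component with short T2 is strongly attenuated at long echo times, which is why it is easier to detect at STE.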
Class labeling was performed according to the World Health Organization (WHO) system for diagnosing brain tumors by histopathological analysis of a biopsy sample. The modeled data set included measurements at LTE from 20 astrocytomas grade II (A2), 78 glioblastomas (GL) and 31 brain metastases (ME) and at STE from 22 A2, 86 GL and 38 ME. Data were pre-processed as described in
A further test data set for validation purposes was acquired in three medical centers: Centre Diagnòstic Pedralbes (CDP), Institut d'Alta Tecnologia (IAT) and Institut de Diagnòstic per la Imatge (IDI)-Badalona in Barcelona, Spain. This independent data set was acquired as part of the EU-funded eTUMOUR research project
The A2 cases are low-grade, grade II on a scale I–IV of the WHO classification
Four STE cases selected from the INTERPRET dataset that illustrate the heterogeneity of the GL group, with I0145 showing a necrotic pattern, and I1098 showing an actively proliferating behavior, similar to that of I1041, its low-grade counterpart. These selected cases also illustrate the similarities of I0145 and I0211, which are highly correlated to each other, but are tumor types with different histopathological origins.
In this work, the FI measures the change in information about a conditional probability
where
This definition is the data space equivalent of the more commonly used FI which is about the information carried by the model parameters. In both cases, the FI is derived from a Taylor expansion of the information
The motivation behind the choice to calculate the FI with respect to the covariates is to directly obtain a dissimilarity measure for comparing spectra using information about their predicted classification. This provides a principled definition of a metric in data space. However, this is a local differential metric
measuring the distance between two neighboring points
Our choice of estimator of
After estimating the class membership probability, the distance between two points
The path
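As a rough illustration of how a distance can be obtained from a local FI metric, the sketch below integrates the local length element sqrt(dx^T FI(x) dx) along a path between two points. Several parts are assumptions made for the example and are not taken from this study: the classifier is a simple softmax-linear model standing in for the MLP, the FI matrix is taken as the expectation over classes of the outer product of the gradients of log p(c|x) with respect to x, and the path is a straight segment discretized with the midpoint rule.

```python
import numpy as np

def softmax_probs(W, b, x):
    """Class membership probabilities p(c|x) of a softmax-linear classifier."""
    z = W @ x + b
    e = np.exp(z - z.max())
    return e / e.sum()

def fisher_information(W, b, x):
    """FI(x) = sum_c p(c|x) g_c g_c^T, with g_c = grad_x log p(c|x)."""
    p = softmax_probs(W, b, x)
    grads = W - p @ W                      # row c holds W_c - sum_k p_k W_k
    return (grads * p[:, None]).T @ grads  # (d, d) matrix

def fi_distance(W, b, x_a, x_b, n_steps=50):
    """Length of the straight segment x_a -> x_b under the FI metric
    (midpoint-rule approximation of the path integral)."""
    delta = x_b - x_a
    ts = (np.arange(n_steps) + 0.5) / n_steps
    speeds = [
        np.sqrt(max(delta @ fisher_information(W, b, x_a + t * delta) @ delta, 0.0))
        for t in ts
    ]
    return float(np.mean(speeds))

# Toy two-class model in 2-D: only the first coordinate affects p(c|x).
W = np.array([[1.0, 0.0], [-1.0, 0.0]])
b = np.zeros(2)
origin = np.zeros(2)
print(fi_distance(W, b, origin, np.array([1.0, 0.0])))  # positive: class probabilities change
print(fi_distance(W, b, origin, np.array([0.0, 3.0])))  # zero: class probabilities do not change
```

In this toy example, displacements that leave the predicted class probabilities unchanged have zero length, which is the property that makes the metric class-discriminative.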
However, this metric space is not flat, in the sense that its metric differs from point to point. Therefore, many commonly used methods from signal processing cannot be applied unless the data are mapped onto a Euclidean space. Doing this while retaining the distance structure generated by the FI matrix requires the application of Multidimensional Scaling methods, which include the following algorithms:
This algorithm is used to analyze multivariate data by projecting the data points from an original high-dimensional observed space to a space of lower dimensionality
A random initialization is usually followed by optimization by gradient descent.
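A minimal sketch of Sammon mapping from a precomputed distance matrix is given below, following the random initialization and gradient descent scheme just mentioned. It uses the usual Sammon stress, the sum over pairs of (D_ij - d_ij)^2 / D_ij normalized by the sum of D_ij. The step-halving rule, the function names and the default parameters are choices made for this example and are not details taken from this study.

```python
import numpy as np

def pairwise_dist(Y):
    """Euclidean distance matrix between the rows of Y."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sammon_stress(D, Y, eps=1e-12):
    """Sammon stress between target distances D and the distances in Y."""
    iu = np.triu_indices(len(D), k=1)
    d_star, d = D[iu], pairwise_dist(Y)[iu]
    return float(((d_star - d) ** 2 / (d_star + eps)).sum() / d_star.sum())

def sammon_gradient(D, Y, eps=1e-12):
    """Gradient of the Sammon stress with respect to the projected points."""
    d = pairwise_dist(Y)
    c = D[np.triu_indices(len(D), k=1)].sum()
    coef = (D - d) / ((D + eps) * (d + eps))
    np.fill_diagonal(coef, 0.0)
    return -(2.0 / c) * (coef.sum(axis=1, keepdims=True) * Y - coef @ Y)

def sammon(D, n_components=2, n_iter=300, lr=1.0, seed=0):
    """Random initialization followed by gradient descent with step halving."""
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=1e-2, size=(len(D), n_components))
    stress = sammon_stress(D, Y)
    for _ in range(n_iter):
        grad = sammon_gradient(D, Y)
        step, improved = lr, False
        while step > 1e-10 and not improved:
            Y_try = Y - step * grad
            stress_try = sammon_stress(D, Y_try)
            if stress_try < stress:
                Y, stress, improved = Y_try, stress_try, True
            else:
                step *= 0.5  # halve the step until the stress decreases
        if not improved:
            break  # no descent step found: treat as converged
    return Y, stress
```

Here `D` would be the matrix of pairwise FI distances between spectra, and the returned `Y` holds the projected coordinates on which a factorization method can then operate.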
The same fundamental concept of preserving pairwise distances after projection can be applied with an alternative cost function
This is the standard multidimensional scaling algorithm, which is the reason why it is abbreviated here as MDS. This algorithm is applied in this paper since it is the simplest multidimensional scaling method.
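For reference, a compact sketch of classical (Torgerson) scaling is shown below. It assumes that the "standard" algorithm referred to here is the classical eigendecomposition-based variant; if a stress-minimizing metric MDS was used instead, the procedure would differ. The function name and signature are illustrative.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Embed n points in n_components dimensions from a distance matrix D
    by double centering the squared distances and taking the top eigenvectors."""
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # Gram matrix of the centered points
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)   # drop negative (non-Euclidean) parts
    return eigvecs[:, order] * np.sqrt(lam)

# Usage: distances between planar points are reproduced by a 2-D embedding.
rng = np.random.default_rng(0)
points = rng.normal(size=(8, 2))
D = np.sqrt(((points[:, None] - points[None]) ** 2).sum(-1))
Y = classical_mds(D, n_components=2)
```

When the input distances are not Euclidean, as can happen with FI distances, some eigenvalues may be negative; this sketch simply clips them to zero, so the embedding is then only an approximation.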
This algorithm, abbreviated in this paper as IMA, expresses the mapping from an original
where
In NMF methods, the data matrix
Convex-NMF is the algorithmic variant considered in this study, where the source matrix is also factorized into a non-negative mixture of the original data points,
The constraints of non-negativity are implemented through the use of multiplicative updating algorithms for the key matrices
where
As proposed in
NMF methods unavoidably converge to local minima. The extracted NMF bases will be slightly different for different initializations. In this study, K-means clustering was applied as proposed in
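The following sketch shows one way Convex-NMF with a K-means-based initialization can be implemented. It follows the multiplicative update rules and the initialization published for Convex-NMF by Ding, Li and Jordan as we understand them, with the data matrix X holding one spectrum per column; the variable names, the small constant `eps`, the number of iterations and the plain Lloyd's K-means are choices made for this example and may differ from the implementation used in this study.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the columns of X (d x n); returns n labels."""
    rng = np.random.default_rng(seed)
    P = X.T
    centers = P[rng.choice(len(P), size=k, replace=False)].copy()
    labels = np.zeros(len(P), dtype=int)
    for _ in range(n_iter):
        dist2 = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = P[labels == j].mean(axis=0)
    return labels

def convex_nmf(X, k, n_iter=200, eps=1e-12, seed=0):
    """Factorize X (d x n, entries of any sign) as X ~ F G^T with F = X W
    and W, G >= 0, so each source is a non-negative mixture of data points."""
    n = X.shape[1]
    labels = kmeans_labels(X, k, seed=seed)
    H = np.zeros((n, k))
    H[np.arange(n), labels] = 1.0              # cluster indicator matrix
    G = H + 0.2                                # smoothed indicators
    W = G / np.maximum(H.sum(axis=0), 1.0)     # scaled by cluster sizes
    Y = X.T @ X
    Yp, Yn = (np.abs(Y) + Y) / 2.0, (np.abs(Y) - Y) / 2.0  # positive and negative parts
    for _ in range(n_iter):
        GWt = G @ W.T
        G = G * np.sqrt((Yp @ W + GWt @ Yn @ W) / (Yn @ W + GWt @ Yp @ W + eps))
        GtG = G.T @ G
        W = W * np.sqrt((Yp @ G + Yn @ W @ GtG) / (Yn @ G + Yp @ W @ GtG + eps))
    return X @ W, G, W                         # sources F, mixing matrix G, weights W
```

Because the updates are multiplicative and start from positive values, W and G remain non-negative throughout, while the sources F = XW can take negative values whenever the data do.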
In our view Convex-NMF is especially well suited to the analysis of MRS data for the following two reasons:
The factorization of the source matrix means that Convex-NMF does not require any
Restricting
As shown in
As outlined in the introduction, the purpose of this study is to investigate the potential of using prior knowledge derived from class membership of the spectra to assist the extraction of tissue type-specific MRS signal sources. The methodology proposed involves three main stages and, in a nutshell, can be described as follows:
(i) Definition of a FI metric to model pairwise similarities and dissimilarities between data points, using a MLP classifier to estimate the conditional probabilities of class membership.
(ii) Approximation of the empirical data distribution in a Euclidean projective space in which NMF-based techniques can be applied.
(iii) Application of Convex-NMF for the source decomposition of the data.
The experiments of this study involve four approaches: 1) Fully unsupervised extraction of the MRS sources, using Convex-NMF; and 2-4) Semi-supervised extraction of the MRS sources, using, in turn, Sammon mapping, MDS, and IMA, prior to the use of Convex-NMF. With this we aimed to, first, compare the performance of the unsupervised and semi-supervised approaches and, second, compare different alternative semi-supervised approaches. Different problems of brain tumor type classification were considered for experimentation, paying special attention to the quality of the sources obtained and the accuracy of the results.
For the unsupervised approach (see
General representation of the unsupervised and semi-supervised approaches analyzed in this study for extracting specific MRS sources in human brain tumors.
For the semi-supervised approaches (see also
An independent test set (the eTUMOUR cases) was used to further validate the generalization capabilities of the obtained sources to label new cases, that is, the capability of correctly labeling unseen, out-of-sample, data cases.
The experiments involved three tumor types from MRS acquired both at STE and LTE. Firstly, we attempted binary classification for three different brain tumor diagnostic problems, namely A2
Firstly, two source signals were calculated for each classification problem, using the different approaches under study, i.e. fully unsupervised using Convex-NMF, and semi-supervised using the three dataset-projection methods mentioned before (Sammon, MDS, and IMA) prior to Convex-NMF. The fully unsupervised method aimed to extract the constituent tissue types involved in each classification problem, while the semi-supervised ones aimed to extract the
The quality of the sources was determined in terms of how similar they are compared to the mean spectrum of the corresponding class. Similarity was assessed using the correlation between the resulting sources and mean spectra of the classes (tumor types) involved. Calculating the correlation provided us with an indicator of the extent to which each source is tumor-type specific.
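The source quality measure described above can be sketched as follows. The example assumes that the correlation is the Pearson coefficient between each source and each class mean spectrum; the function name and the array layout (one spectrum or source per row) are illustrative.

```python
import numpy as np

def source_class_correlation(sources, spectra, labels):
    """Correlate each source with the mean spectrum of each class.

    sources: (k, d) array, spectra: (n, d) array, labels: n class names.
    Returns a dict mapping class name -> array of k correlation values.
    """
    labels = np.asarray(labels)
    result = {}
    for cls in np.unique(labels):
        mean_spectrum = spectra[labels == cls].mean(axis=0)
        result[str(cls)] = np.array(
            [np.corrcoef(source, mean_spectrum)[0, 1] for source in sources]
        )
    return result
```

A source with a correlation close to 1 for one class and a clearly lower value for the other can be read as a prototype of the first class.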
The accuracy of the labeling process (for all the methods and diagnostic problems used to assess source extraction) was measured as the ratio of correctly classified cases out of the total number of instances. The balanced error rate (BER)
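As a worked example of these two measures, the sketch below computes the accuracy and the BER from per-class counts, taking the BER as the mean of the per-class error rates. The counts used are those of the unsupervised Convex-NMF result for A2 and GL on the STE training set (22 of 22 A2 and 73 of 86 GL cases correctly labeled); the function name is illustrative.

```python
def accuracy_and_ber(counts):
    """counts maps class name -> (correctly labeled, total cases).
    Returns (overall accuracy, balanced error rate)."""
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    per_class_error = [1.0 - c / t for c, t in counts.values()]
    return correct / total, sum(per_class_error) / len(per_class_error)

acc, ber = accuracy_and_ber({"A2": (22, 22), "GL": (73, 86)})
print(f"{acc:.1%} {ber:.3f}")  # prints "88.0% 0.076"
```

Unlike the overall accuracy, the BER weights both classes equally, which matters here because the tumor types are represented by very different numbers of cases.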
In this section, we compile and present all the experimental results. The objective of the experiments carried out for this study was twofold: first, the assessment of the ability of the proposed methodology to extract tissue type-specific MRS sources more accurately than previous fully unsupervised approaches; and second, the evaluation of the former as a basis to produce more robust classifiers.
Sources extracted for all the classification problems using the training data at STE, for two of the approaches: Convex-NMF (unsupervised), and IMA + Convex-NMF (semi-supervised). The blue spectra indicate the mean of the classes involved. Horizontal axis, for all plots: frequency in ppm scale. Vertical axis, for all plots: UL2 normalized intensity. The range of the vertical scales is fixed for each experiment and is the same for comparative purposes.
Sources extracted for all the classification problems using the training data at LTE, for two of the approaches: Convex-NMF (unsupervised), and IMA + Convex-NMF (semi-supervised). The blue spectra indicate the mean of the classes involved. Axes labels and representation as in
Problem | Source class | STE Convex | STE Sammon Convex | STE MDS Convex | STE IMA Convex | LTE Convex | LTE Sammon Convex | LTE MDS Convex | LTE IMA Convex
A2 vs GL | A2 | 0.988 | 0.741 | 0.935 | 0.994 | 0.977 | 0.947 | 0.997 | 0.995
A2 vs GL | GL | 0.979 | 0.999 | 1.000 | 0.999 | 0.607 | 0.999 | 1.000 | 0.999
A2 vs ME | A2 | 0.988 | 0.931 | 0.915 | 0.981 | 0.990 | 0.981 | 0.995 | 0.986
A2 vs ME | ME | 0.994 | 0.999 | 0.998 | 1.000 | 0.872 | 0.994 | 0.999 | 0.993
GL vs ME | GL | 0.972 | 0.999 | 1.000 | 0.999 | 0.776 | 0.997 | 0.998 | 1.000
GL vs ME | ME | 0.994 | 0.994 | 0.998 | 0.999 | 0.831 | 0.995 | 0.996 | 0.998
Table cells should be read as the correlations between the sources and the average spectra (see
We next report the results of the unsupervised labeling process: That is, the assignment of class labels (tumor types) to each of the cases using the extracted sources. Bear in mind that in the proposed semi-supervised approaches of this study, class labels are used only to aid the source extraction, but the final labeling process remains unsupervised.
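One simple way to carry out such a labeling is sketched below, under the assumption that each case is assigned to the source that contributes most to it in the mixing matrix, and that each source has already been matched to a tumor type (for instance through its correlation with the class mean spectra). The function name and both arguments are hypothetical and introduced only for this illustration.

```python
import numpy as np

def label_by_dominant_source(G, source_to_class):
    """G: (n_cases, n_sources) non-negative mixing matrix.
    source_to_class: list giving the tumor type matched to each source.
    Each case receives the type of its largest mixing coefficient."""
    dominant = np.argmax(G, axis=1)
    return [source_to_class[i] for i in dominant]

# Usage with a toy mixing matrix for three cases and two sources.
G = np.array([[0.9, 0.1], [0.2, 0.7], [0.6, 0.4]])
print(label_by_dominant_source(G, ["A2", "GL"]))
```

Class labels play no role in this assignment step, which is why the labeling remains unsupervised even when the sources were extracted with the help of class information.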
Problem | Measure | Convex-NMF | Sammon + Convex-NMF | MDS + Convex-NMF | IMA + Convex-NMF
A2 vs GL | Total | 88.0% (95/108) | 97.2% (105/108) | 97.2% (105/108) | 98.1% (106/108)
A2 vs GL | A2 | 100.0% (22/22) | 100.0% (22/22) | 100.0% (22/22) | 95.5% (21/22)
A2 vs GL | GL | 84.9% (73/86) | 96.5% (83/86) | 96.5% (83/86) | 98.8% (85/86)
A2 vs GL | BER | 0.076 | 0.017 | 0.017 | 0.029
A2 vs ME | Total | 96.7% (58/60) | 96.7% (58/60) | 98.3% (59/60) | 100.0% (60/60)
A2 vs ME | A2 | 100.0% (22/22) | 100.0% (22/22) | 100.0% (22/22) | 100.0% (22/22)
A2 vs ME | ME | 94.7% (36/38) | 94.7% (36/38) | 97.4% (37/38) | 100.0% (38/38)
A2 vs ME | BER | 0.026 | 0.026 | 0.013 | 0.000
GL vs ME | Total | 75.8% (94/124) | 81.5% (101/124) | 83.1% (103/124) | 90.3% (112/124)
GL vs ME | GL | 70.9% (61/86) | 77.9% (67/86) | 77.9% (67/86) | 94.2% (81/86)
GL vs ME | ME | 86.8% (33/38) | 89.5% (34/38) | 94.7% (36/38) | 81.6% (31/38)
GL vs ME | BER | 0.211 | 0.163 | 0.137 | 0.121
Summary of the labeling accuracy obtained for the training set, for all the discrimination problems at STE. They include the accuracy (total and by tumor type); the number of correctly labeled samples from the total, in parentheses; and BER of the classification. The highest total accuracy and the lowest BER for each classification problem are underlined.
Problem | Measure | Convex-NMF | Sammon + Convex-NMF | MDS + Convex-NMF | IMA + Convex-NMF
A2 vs GL | Total | 55.1% (54/98) | 98.0% (96/98) | 98.0% (96/98) | 98.0% (96/98)
A2 vs GL | A2 | 100.0% (20/20) | 95.0% (19/20) | 95.0% (19/20) | 95.0% (19/20)
A2 vs GL | GL | 43.6% (34/78) | 98.7% (77/78) | 98.7% (77/78) | 98.7% (77/78)
A2 vs GL | BER | 0.282 | 0.031 | 0.031 | 0.031
A2 vs ME | Total | 80.4% (41/51) | 100.0% (51/51) | 100.0% (51/51) | 100.0% (51/51)
A2 vs ME | A2 | 100.0% (20/20) | 100.0% (20/20) | 100.0% (20/20) | 100.0% (20/20)
A2 vs ME | ME | 67.7% (21/31) | 100.0% (31/31) | 100.0% (31/31) | 100.0% (31/31)
A2 vs ME | BER | 0.161 | 0.000 | 0.000 | 0.000
GL vs ME | Total | 60.6% (66/109) | 76.1% (83/109) | 97.2% (106/109) | 95.4% (104/109)
GL vs ME | GL | 60.3% (47/78) | 69.2% (54/78) | 100.0% (78/78) | 97.4% (76/78)
GL vs ME | ME | 61.3% (19/31) | 93.5% (29/31) | 90.3% (28/31) | 90.3% (28/31)
GL vs ME | BER | 0.392 | 0.186 | 0.048 | 0.061
Summary of the labeling accuracy obtained for the training set, for all the discrimination problems at LTE. They include the accuracy (total and by tumor type); the number of correctly labeled samples from the total, in parentheses; and BER of the classification. Highest total accuracy and lowest BER underlined as in
Problem | Measure | Convex-NMF | Sammon + Convex-NMF | MDS + Convex-NMF | IMA + Convex-NMF
A2 vs GL | Total | 80.0% (32/40) | 77.5% (31/40) | 80.0% (32/40) | 85.0% (34/40)
A2 vs GL | A2 | 100.0% (10/10) | 100.0% (10/10) | 100.0% (10/10) | 100.0% (10/10)
A2 vs GL | GL | 73.3% (22/30) | 70.0% (21/30) | 73.3% (22/30) | 80.0% (24/30)
A2 vs GL | BER | 0.133 | 0.150 | 0.133 | 0.100
A2 vs ME | Total | 90.0% (18/20) | 85.0% (17/20) | 85.0% (17/20) | 90.0% (18/20)
A2 vs ME | A2 | 100.0% (10/10) | 100.0% (10/10) | 100.0% (10/10) | 100.0% (10/10)
A2 vs ME | ME | 80.0% (8/10) | 70.0% (7/10) | 70.0% (7/10) | 80.0% (8/10)
A2 vs ME | BER | 0.100 | 0.150 | 0.150 | 0.100
GL vs ME | Total | 62.5% (25/40) | 55.0% (22/40) | 62.5% (25/40) | 70.0% (28/40)
GL vs ME | GL | 63.3% (19/30) | 56.7% (17/30) | 63.3% (19/30) | 73.3% (22/30)
GL vs ME | ME | 60.0% (6/10) | 50.0% (5/10) | 60.0% (6/10) | 60.0% (6/10)
GL vs ME | BER | 0.383 | 0.467 | 0.383 | 0.333
Summary of the labeling accuracy obtained for the test set, for all the discrimination problems at STE. They include the accuracy (total and by tumor type); the number of correctly labeled samples from the total, in parentheses; and BER of the classification. Highest total accuracy and lowest BER underlined as in
Problem | Measure | Convex-NMF | Sammon + Convex-NMF | MDS + Convex-NMF | IMA + Convex-NMF
A2 vs GL | Total | 40.0% (16/40) | 65.0% (26/40) | 67.5% (27/40) | 65.0% (26/40)
A2 vs GL | A2 | 100.0% (10/10) | 100.0% (10/10) | 100.0% (10/10) | 100.0% (10/10)
A2 vs GL | GL | 20.0% (6/30) | 53.3% (16/30) | 56.7% (17/30) | 53.3% (16/30)
A2 vs GL | BER | 0.400 | 0.233 | 0.217 | 0.233
A2 vs ME | Total | 70.0% (14/20) | 75.0% (15/20) | 75.0% (15/20) | 75.0% (15/20)
A2 vs ME | A2 | 100.0% (10/10) | 100.0% (10/10) | 100.0% (10/10) | 100.0% (10/10)
A2 vs ME | ME | 40.0% (4/10) | 50.0% (5/10) | 50.0% (5/10) | 50.0% (5/10)
A2 vs ME | BER | 0.300 | 0.250 | 0.250 | 0.250
GL vs ME | Total | 80.0% (32/40) | 80.0% (32/40) | 82.5% (33/40) | 82.5% (33/40)
GL vs ME | GL | 86.7% (26/30) | 86.7% (26/30) | 90.0% (27/30) | 90.0% (27/30)
GL vs ME | ME | 60.0% (6/10) | 60.0% (6/10) | 60.0% (6/10) | 60.0% (6/10)
GL vs ME | BER | 0.267 | 0.267 | 0.250 | 0.250
Summary of the labeling accuracy obtained for the test set, for all the discrimination problems at LTE. They include the accuracy (total and by tumor type); the number of correctly labeled samples from the total, in parentheses; and BER of the classification. Highest total accuracy and lowest BER underlined as in
Problem | Measure | STE, Training set | STE, Test set | LTE, Training set | LTE, Test set
A2 vs GL | Total | 90.7% (98/108) | 90.0% (36/40) | 79.6% (78/98) | 60.0% (24/40)
A2 vs GL | A2 | 95.5% (21/22) | 100.0% (10/10) | 100.0% (20/20) | 100.0% (10/10)
A2 vs GL | GL | 89.5% (77/86) | 86.7% (26/30) | 74.4% (58/78) | 46.7% (14/30)
A2 vs GL | BER | 0.075 | 0.067 | 0.128 | 0.267
A2 vs ME | Total | 96.7% (58/60) | 85.0% (17/20) | 88.2% (45/51) | 85.0% (17/20)
A2 vs ME | A2 | 100.0% (22/22) | 100.0% (10/10) | 100.0% (20/20) | 100.0% (10/10)
A2 vs ME | ME | 94.7% (36/38) | 70.0% (7/10) | 80.6% (25/31) | 70.0% (7/10)
A2 vs ME | BER | 0.026 | 0.150 | 0.097 | 0.150
Summary of the labeling accuracy obtained for the training and test set when three sources were calculated in a fully unsupervised way (Convex-NMF), for two discrimination problems at STE and LTE. They include the accuracy (total and by tumor type); the number of correctly labeled samples from the total, in parentheses; and BER of the classification.
Problem | Approach | Measure | STE, Training set | STE, Test set | LTE, Training set | LTE, Test set
A2 vs AG | Unsupervised | Total | 89.7% (131/146) | 86.0% (43/50) | 77.5% (100/129) | 60.0% (30/50)
A2 vs AG | Unsupervised | A2 | 95.5% (21/22) | 100.0% (10/10) | 100.0% (20/20) | 100.0% (10/10)
A2 vs AG | Unsupervised | AG | 88.7% (110/124) | 82.5% (33/40) | 73.4% (80/109) | 50.0% (20/40)
A2 vs AG | Unsupervised | BER | 0.079 | 0.088 | 0.133 | 0.250
A2 vs AG | Semi-supervised | Total | 97.9% (143/146) | 84.0% (42/50) | 97.7% (126/129) | 66.0% (33/50)
A2 vs AG | Semi-supervised | A2 | 100.0% (22/22) | 100.0% (10/10) | 100.0% (20/20) | 100.0% (10/10)
A2 vs AG | Semi-supervised | AG | 97.6% (121/124) | 80.0% (32/40) | 97.2% (106/109) | 57.5% (23/40)
A2 vs AG | Semi-supervised | BER | 0.012 | 0.100 | 0.014 | 0.213
Summary of the labeling accuracy obtained for the training and test set when three sources were calculated in a fully unsupervised way, and a semi-supervised way (IMA+Convex-NMF), for the discrimination problem A2
Tumor type | STE, Source 1 | STE, Source 2 | STE, Source 3 | LTE, Source 1 | LTE, Source 2 | LTE, Source 3
GL | 15.1% (13/86) | 51.2% (44/86) | 33.7% (29/86) | 50.0% (39/78) | 29.5% (23/78) | 20.5% (16/78)
ME | 5.3% (2/38) | 73.7% (28/38) | 21.1% (8/38) | 32.3% (10/31) | 45.2% (14/31) | 22.6% (7/31)
Representation of the three sources to the two tumor types (GL and ME) involved. They include the percentage of cases mainly represented by each source (by tumor type), and the number of cases from the total, in parentheses. Sources were extracted in an unsupervised mode using Convex-NMF for the aggressive tumors group (GL + ME), using the training data at both STE and LTE.
In a previous study
The results reported in
With respect to the acquisition conditions, the extracted sources seemed to perform similarly on average at both TE, according to the correlations between the average spectra of the tumor types involved and the sources (
Regarding the use of different data projection approaches (Sammon, MDS, IMA): in at least one case, the source-class correlation is low for one of the methods that include class information in the source extraction process, namely A2 in the A2
The classification results obtained using training data (
The increase in the classification accuracy of the test dataset (
The labeling accuracy results for the training data of the A2 class, as reported in
Given that the A2 test set is smaller (10 spectra both at STE and LTE), we consider that the results reported in
Other studies have addressed similar problems in the existing literature, for similar data. We report next some of these results for comparative purposes, although the techniques and the evaluation criteria involved are not always the same and, therefore, not straightforwardly comparable.
In
GL
Up to this point, only the results corresponding to two extracted sources have been discussed. In
This is the reason why three signal sources were calculated in the discrimination problems A2
However, when comparing the unsupervised results obtained with three sources (
In the problem of discrimination between GL and ME, as mentioned in the methods section, it is unclear whether more than two sources would be required, and what they would represent. To illustrate this, three sources were calculated in an unsupervised mode for the aggressive group (GL+ME), as seen in
Three sources extracted in unsupervised mode, using Convex-NMF, for the aggressive tumors group (GL + ME), using the training data at both STE and LTE. Axes labels as in
Another classification problem of interest in the literature that involves the tumor types under study is the discrimination of A2 from the superclass AG. When using three sources for this discrimination problem, a semi-supervised approach is able to provide much better results for the training set than the unsupervised approach, with 97.9
The experimental results reported in this study confirm the hypothesis that an unsupervised method ideally suited for source extraction from MRS, namely Convex-NMF, can benefit from the use of the available data class labels to obtain tumor type-specific sources that result in accurate classifiers without any loss in the interpretability of the results.
A novel mechanism to perform non-negative matrix factorization in a semi-supervised manner is provided, by first finding a natural metric to describe the class assignments, and then mapping the data, using standard projective methods, into an approximate distribution in a Euclidean space where standard source extraction methods can be applied.
For the data analyzed in this work, the proposed semi-supervised approach yielded better classification accuracies, in both the training and test datasets, when two sources were employed. Moreover, when interpreted as class prototypes, the extracted sources were of higher quality than those calculated using the unsupervised method. The results of unsupervised and semi-supervised source extraction-based classification were more similar when three sources were employed. However, the semi-supervised approaches were key in problems where the unsupervised extraction of three sources is not helpful, such as the discrimination of GL from ME. For this problem, the accuracy results obtained using the semi-supervised approach were comparable to the best reported in the literature, with the added value of the interpretability provided by the sources.
In conclusion, the improvements in classification accuracy and in the accuracy of source identification, especially in complex tumor type classification problems, are the main advantages of using the additional pre-processing steps when the focus is on finding tumor type-specific MRS signal sources.
The differences between unsupervised and semi-supervised methods are less apparent when three sources are identified. Theoretical approaches to defining the optimal number of sources should be the subject of further work.
The authors acknowledge the partners of the former INTERPRET and eTUMOUR European projects. Data providers: Dr. C. Majós (IDI), Dr. À. Moreno-Torres (CDP), Dr. J. Pujol (IAT), Dr. J. Capellades (HUGTP), Dr. F.A. Howe and Prof. J. Griffiths (SGUL), Prof. A. Heerschap (RU), Prof. L. Stefanczyk and Dr. J. Fortuniak (MUL), and Dr. J. Calvar (FLENI); data curators: Dr. A.P. Candiota, Dr. T. Delgado, Ms. J. Martín and Dr. A. Pérez (GABRMN-UAB).