Multi-site, multi-platform comparison of MRI T1 measurement using the system phantom

  • Kathryn E. Keenan,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing

    Kathryn.keenan@nist.gov

    Affiliation National Institute of Standards and Technology, Boulder, Colorado, United States of America

  • Zydrunas Gimbutas,

    Roles Formal analysis, Software, Validation, Writing – review & editing

    Affiliation National Institute of Standards and Technology, Boulder, Colorado, United States of America

  • Andrew Dienstfrey,

    Roles Data curation, Formal analysis, Validation, Writing – original draft, Writing – review & editing

    Affiliation National Institute of Standards and Technology, Boulder, Colorado, United States of America

  • Karl F. Stupic,

    Roles Conceptualization, Data curation, Methodology, Project administration

    Affiliation National Institute of Standards and Technology, Boulder, Colorado, United States of America

  • Michael A. Boss,

    Roles Conceptualization, Methodology, Visualization, Writing – review & editing

    Affiliation American College of Radiology, Center for Research and Innovation, Philadelphia, Pennsylvania, United States of America

  • Stephen E. Russek,

    Roles Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – review & editing

    Affiliation National Institute of Standards and Technology, Boulder, Colorado, United States of America

  • Thomas L. Chenevert,

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Writing – review & editing

    Affiliation University of Michigan, Ann Arbor, Michigan, United States of America

  • P. V. Prasad,

    Roles Conceptualization, Data curation, Investigation, Methodology, Writing – review & editing

    Affiliation NorthShore University Health System, Evanston, Illinois, United States of America

  • Junyu Guo,

    Roles Investigation, Methodology

    Affiliation St. Jude Children’s Research Hospital, Memphis, Tennessee, United States of America

  • Wilburn E. Reddick,

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation St. Jude Children’s Research Hospital, Memphis, Tennessee, United States of America

  • Kim M. Cecil,

    Roles Conceptualization, Formal analysis, Methodology, Project administration, Writing – review & editing

    Affiliation Cincinnati Children’s Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America

  • Amita Shukla-Dave,

    Roles Investigation, Methodology, Project administration, Writing – review & editing

    Affiliation Memorial Sloan Kettering Cancer Center, New York, New York, United States of America

  • David Aramburu Nunez,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Memorial Sloan Kettering Cancer Center, New York, New York, United States of America

  • Amaresh Shridhar Konar,

    Roles Methodology, Validation, Writing – review & editing

    Affiliation Memorial Sloan Kettering Cancer Center, New York, New York, United States of America

  • Michael Z. Liu,

    Roles Data curation, Investigation, Methodology, Validation

    Affiliation Columbia University Medical Center, New York, New York, United States of America

  • Sachin R. Jambawalikar,

    Roles Data curation, Investigation, Methodology, Project administration

    Affiliation Columbia University Medical Center, New York, New York, United States of America

  • Lawrence H. Schwartz,

    Roles Data curation, Investigation, Supervision

    Affiliation Columbia University Medical Center, New York, New York, United States of America

  • Jie Zheng,

    Roles Conceptualization, Formal analysis, Methodology, Writing – review & editing

    Affiliation Washington University in St. Louis, St. Louis, Missouri, United States of America

  • Peng Hu,

    Roles Conceptualization, Investigation, Methodology, Writing – review & editing

    Affiliation University of California, Los Angeles, California, United States of America

  • Edward F. Jackson †

    † Deceased.

    Roles Conceptualization, Data curation, Investigation, Methodology, Project administration, Resources

    Affiliation University of Wisconsin, Madison, Wisconsin, United States of America


Abstract

Recent innovations in quantitative magnetic resonance imaging (MRI) measurement methods have led to improvements in accuracy, repeatability, and acquisition speed, and have prompted renewed interest in reevaluating the medical value of quantitative T1. The purpose of this study was to determine the bias and reproducibility of T1 measurements in a variety of MRI systems with an eye toward assessing the feasibility of applying diagnostic threshold T1 measurements across multiple clinical sites. We used the International Society of Magnetic Resonance in Medicine/National Institute of Standards and Technology (ISMRM/NIST) system phantom to assess variations of T1 measurements, using a slow, reference standard inversion recovery sequence and a rapid, commonly-available variable flip angle sequence, across MRI systems at 1.5 tesla (T) (two vendors, with number of MRI systems n = 9) and 3 T (three vendors, n = 18). We compared the T1 measurements from inversion recovery and variable flip angle scans to ISMRM/NIST phantom reference values using Analysis of Variance (ANOVA) to test for statistical differences between T1 measurements grouped according to MRI scanner manufacturers and/or static field strengths. The inversion recovery method had minor over- and under-estimations compared to the NMR-measured T1 values at both 1.5 T and 3 T. Variable flip angle measurements had substantially greater deviations from the NMR-measured T1 values than the inversion recovery measurements. At 3 T, the measured variable flip angle T1 for one vendor was significantly different from that of the other two vendors for most of the samples throughout the clinically relevant range of T1. There was no consistent pattern of discrepancy between vendors. We suggest establishing rigorous quality control procedures for validating quantitative MRI methods to promote confidence and stability in associated measurement techniques and to enable translation of diagnostic thresholds from the research center to the entire clinical community.

Introduction

Quantitative magnetic resonance imaging (qMRI) offers exciting prospects for disease detection, diagnosis, characterization, assessment of treatment response, and other applications without the need for tissue biopsy. Early work focused on T1 relaxation times to categorize different brain tumors, particularly distinguishing benign from malignant tumors. Bydder et al. observed that T1 of malignant tumors was higher than that of benign tumors [1]. Motivated by Bydder’s work, several groups tried to reproduce this observation, but had limited success due in part to technical variations [2–6]. Using the qMRI techniques available at the time, these groups found that T1 of pathologic entities/non-healthy tissue (e.g., edema, tumor) had a wide range of values, implying that T1 value would be an unreliable indicator of pathologic process or tumor grade. As a result of inconsistent findings regarding clinical value of tissue-inherent T1 values in early studies, quantitative T1 measurements were not routinely used to study tumors for many years, whereas subjective interpretation of T1-weighted imaging serves as a mainstay of clinical MRI, particularly since the introduction of exogenous contrast agents.

Recent innovations in qMRI measurement methods have led to improvements in accuracy, repeatability, and acquisition speed, and have prompted renewed interest in reevaluating the medical value of quantitative T1. For example, using magnetic resonance fingerprinting, two studies found that the T1 relaxation times of glioblastoma multiforme were substantially higher compared to low grade gliomas, thus again suggesting that T1 can distinguish malignant tumors from benign tumors [7,8]. Furthermore, international consortia such as the Quantitative Imaging Biomarker Alliance (operating under the Radiological Society of North America) and the European Imaging Biomarker Alliance (sponsored by the European Institute for Biomedical Imaging Research) actively promote projects on qMRI standards and best practices for using qMRI in the clinic. These projects emphasize the use of standard objects or phantoms to assess reproducibility of measurement methods and then determine quantitative thresholds, similar to using T1 relaxation time to distinguish the grade of glioma.

Nevertheless, there remain challenges to isolate and mitigate technical sources of variability from immutable biological sources that combine to create overall variability in T1 measurements. Bojorquez et al. catalogued the broad ranges of T1 relaxation times reported in the literature for normal tissues at 3 T and observed dependence of reported T1 on the measurement method and/or MRI system [9]. Similarly, in vivo measurement studies using multiple MRI systems have varied results. Lee et al. measured T1 relaxation time in vivo across two vendor systems using a variable flip angle technique and observed high test-retest repeatability within a vendor system, but significant differences in T1 between vendors [10]. When the measurement methodology is more highly controlled, the inter-site coefficient of variation is less than 10% [11,12]. While these results are encouraging, customized pulse sequences were used across different scanner software versions in these studies [11,12], which is not representative of the typical clinical setting. This level of control is difficult to implement in multisite clinical trials and is currently not feasible for clinical settings where diverse hardware, software, and imaging protocols are to be expected.

To distinguish biological variability from technical sources that include MRI system hardware, pulse sequence design, acquisition parameters, and data reduction algorithm, a physical phantom, rather than in vivo measurements, should be used as a stable reference standard for “true values” [13]. Several groups have studied T1 across measurement methods and hardware (e.g., scanner, coils) using phantoms with known T1 values in a range suitable for T1 measurement of cardiac tissue [14,15], white matter [16], or multiple tissues [17–19]. Some multi-site studies had an uneven distribution of vendor systems, which can adversely impact generalization of results. For example, Bane et al. and vanHoudt et al. both observed a site-specific dependence on the T1 measurement that may be dependent on the distribution of systems included in their studies [17,19].

The purpose of this study was to determine the variability in T1 measurements on a variety of MRI systems to ascertain the feasibility of applying diagnostic threshold T1 measurements across multiple clinical sites. We used the International Society of Magnetic Resonance in Medicine/National Institute of Standards and Technology (ISMRM/NIST) system phantom [20] to assess variations of T1 measurements across MRI systems at 1.5 tesla (T) (two vendors, with number of MRI systems n = 9) and 3 T (three vendors, n = 18).

Methods

Image acquisition

Two ISMRM/NIST system phantoms from the same production run were imaged at multiple sites on systems from three vendors (General Electric (GE) Healthcare Systems, Waukesha, WI, USA; Siemens Healthcare, Erlangen, Germany; and Philips, Best, The Netherlands) at 1.5 tesla and 3 tesla using head coils with 8 to 32 channels (Table 1). At 1.5 T, there were four GE Medical Systems and five Siemens systems, and at 3 T there were six GE Medical Systems, five Philips, and seven Siemens systems.

The two phantoms included in this study were prepared in collaboration between NIST and CaliberMRI (Boulder, CO, USA) using solutions prepared by NIST. The two phantoms were precision machined using identical protocols and contained the same solutions. The large number of samples with prescribed concentration variations allows for identification and elimination of defective samples. The phantoms were shipped via overnight service between sites after imaging was complete. Table 1 indicates which phantom was imaged at each location. The focus of this study was the NiCl2 array (previously called the T1 array) in the ISMRM/NIST system phantom. The NiCl2 array was chosen since it has a smaller temperature and field dependence than other available reference arrays [20]. The NiCl2 array contains 14 spheres that are doped with varying concentrations of NiCl2 to achieve a progression of T1 values from approximately 20 ms to 2000 ms at 1.5 T. The reference T1 times at 1.5 T and 3 T were determined using the NMR-based relaxation time measurement service provided by NIST. These measurements are traceable to the international system of units and values and have a 3σ uncertainty of less than 1.5% (the real value has a > 99.7% probability of being within ± 1.5% of the reference value). Measurement details are available [21].

MRI-based T1 relaxation time was measured using two methods: inversion recovery (IR) using 2D fast spin echo inversion recovery, and variable flip angle (VFA) using 3D fast spoiled gradient echo. Detailed parameters defining the scan protocols are provided in Table 2 for IR and Table 3 for VFA. In addition to the details in Tables 2 and 3, sites were given detailed instructions, including photos of the phantom in a head coil and example images to convey the phantom placement and imaging protocols. For VFA data, participants were instructed to set signal gains by performing a prescan using a 15-degree flip angle; system settings were fixed for subsequent scans to the extent possible. Potential variable signal scaling across series was accounted for in image analysis [22].

Table 3. Variable flip angle (VFA) measurement protocols.

https://doi.org/10.1371/journal.pone.0252966.t003

The protocol did not require that the phantom be placed in the scan room for temperature equilibration prior to measurement. The phantom temperature was measured before and after imaging using a NIST-traceable, calibrated thermometer (Control Company, Friendswood, TX, USA) placed within the phantom by removing the top screw of the phantom. Incorrect temperature measurement (e.g., measuring the temperature of the room rather than the temperature of the phantom) did not require reacquisition of the data. Temperature changes are not expected to impact our study, as T1 times for NiCl2 are known to be relatively insensitive to temperature over the range 16°C to 26°C, and the 10 highest NiCl2 concentration spheres have less than ± 4% variation over this range [20].

Image analysis and selection of regions-of-interest

Two observers performed centralized quality control on all submitted data to ensure adherence to the prescribed imaging protocol with both observers reviewing all data. Deviation from the acquisition protocol resulted in submission rejection (e.g., incorrectly setting the signal gains for the VFA experiment). Sites were encouraged to repeat the image acquisition correctly; four image sets were initially rejected and then properly acquired.

We used special-purpose, automated segmentation software to identify the 14 spheres containing T1 samples (“sample spheres”) and then select the regions-of-interest (ROIs) for analysis (Fig 1). We performed this segmentation in the shortest inversion time (TI) image in the IR image stack, as this image generally provided the most contrast between sample spheres and the phantom background (water). Likewise, the protocol required that the VFA scans take place immediately after the IR scans with no repositioning of the phantom. Thus, the ROIs determined for the IR measurement were the same in the VFA data analysis from the same scan session.

Fig 1. An example coronal slice of the ISMRM/NIST system phantom through the NiCl2 array and resulting segmentation.

(A) The shortest inversion time image used for identification of sample spheres and (B) the segmentation with sample sphere centers identified.

https://doi.org/10.1371/journal.pone.0252966.g001

The NiCl2 (i.e., T1) array consists of 14 spheres, each with an inner radius of 7.5 mm. For each of these spheres, the regions of interest were defined as the collection of pixels within each sphere and well-separated from the boundary. Previous publications describe the details of the ROI identification algorithm [18] and [20]. In brief, we applied a gradient filter to the measured image, then thresholded the result to define a binary image of region edges. Next, we used an optimization routine to determine the rigid transformation—translation and rotation—such that the sample spheres of the known phantom array covered the edge pixels determined in the first step. The results of this rigid transformation served to initialize an iterative process to refine the center of each sample sphere individually. This step accommodated geometric distortions introduced by the scanner. With the centers of all 14 sample spheres thus determined in the measurement frame, we defined the ROI as all pixels falling within 4 mm of this center point (well within the interior of each sample sphere). At the resolution of these images, the result is that each ROI consisted of approximately 52 pixels. The mean intensity value of these pixels defined the signal value corresponding to that ROI for the given TI or flip angle (IR and VFA, respectively). The ROI identification software is part of the qMRLab suite [23,24] and can be provided by the authors upon request. The data in this study will be available at doi:10.18434/mds2-2357.
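The last step of the ROI definition, taking all pixels within 4 mm of a sphere center and averaging them, can be sketched as follows. This is a minimal illustration and not the authors' MATLAB code: the function name `circular_roi_mean`, the nested-list image layout, and the 1 mm isotropic pixel size are assumptions made for the example; only the 4 mm radius comes from the text.

```python
def circular_roi_mean(image, center_rc, radius_mm=4.0, pixel_mm=1.0):
    """Mean intensity and pixel count of a disc-shaped ROI.

    image     : 2-D list of rows (hypothetical layout for this sketch)
    center_rc : (row, col) of the sphere center, in pixel units
    radius_mm : ROI radius; 4 mm as stated in the text
    pixel_mm  : in-plane pixel size (assumed isotropic)
    """
    r0, c0 = center_rc
    r_pix = radius_mm / pixel_mm
    values = [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[0]))
        if (r - r0) ** 2 + (c - c0) ** 2 <= r_pix ** 2
    ]
    return sum(values) / len(values), len(values)


# Toy 21 x 21 image with constant signal 100; ROI centered on pixel (10, 10).
image = [[100.0] * 21 for _ in range(21)]
mean_signal, n_pixels = circular_roi_mean(image, (10, 10))
print(mean_signal, n_pixels)
```

The pixel count depends on the pixel size and on where the center falls relative to the pixel grid, so it will generally differ somewhat from the approximately 52 pixels reported for the study images.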

Prior to T1 data analysis, we rescaled images from Philips systems as specified by Chenevert et al. [22]. The segmentation code and T1 data analysis code were written and performed using MATLAB (The MathWorks, Inc., Natick, MA, USA).

T1 data analysis

Inversion recovery and variable flip angle are two qMRI protocols for T1 measurement. In both protocols, T1 arises as a parameter in a model for the measured MR signal intensities as a function of an experimental variable: inversion time ($TI_k$) in IR experiments and flip angle ($\alpha_m$) for VFA [25].

The measurement model for the IR experiment is (1)

Here $y_k$ is the measured signal at the k-th inversion time, $M_0$ is the initial magnitude of the magnetization signal, and $n_k$ represents measurement noise. In addition to $TI_k$, the fixed experimental parameters are TR, the repetition time, and $\theta_{180}$ and $\theta_{90}$, the flip angles. In principle, for a given set of $TI_k$ and associated values $y_k$, one could attempt to invert the above equation for all parameters. However, as our objective is to estimate T1 alone, we combine terms and fit the IR signal to a general exponential model:

$$y_k = \left| A + B\, e^{-TI_k/T_1} \right| + n_k \tag{2}$$

Here, the constants A and B are required for mathematical consistency but may not have a physical interpretation in all cases. Fitting data using non-linear least squares is a natural approach as it corresponds to the maximum likelihood estimator in the case that the noise variables $n_k$ are independent, identically distributed Gaussians. However, the absolute value appearing in the IR signal model entails a loss of differentiability at measurement points where the signal is near zero. To avoid this, we modified the objective and solved the following non-linear least squares problem to estimate T1, A and B:

$$\min_{T_1, A, B} \sum_k \left[ y_k^2 - \left( A + B\, e^{-TI_k/T_1} \right)^2 \right]^2 \tag{3}$$

We solved this smooth problem via Newton iterative refinement of an initial guess found by a search over a dense grid in the three-dimensional parameter space (T1, A, and B). Note that the residuals (Eq 3) were never zero due to measurement noise and also to signal not accounted for by the model. We ran Newton iterations until the changes in the residuals were orders of magnitude less than the residuals themselves. In principle, one could use the stationary point of the smooth problem as an initial guess for the original, non-smooth problem involving absolute values. Generally, we found the T1 values to not be substantially different. However, this could be a topic for future investigation.
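A minimal sketch of such a fit is given below, assuming the smooth objective is the sum over k of (y_k^2 - (A + B exp(-TI_k/T1))^2)^2. It is not the authors' implementation. Two substitutions are made for brevity and are labeled in the code: writing B = -c A lets A^2 be eliminated in closed form for fixed (T1, c), which reduces the search to two dimensions, and a derivative-free step-halving pattern search stands in for the Newton refinement. The function names, the search box, and the inversion times are illustrative assumptions.

```python
import math


def ir_objective(t1, c, ti_ms, y):
    """Smooth IR objective sum_k (y_k^2 - (A + B*exp(-TI_k/T1))^2)^2 with B = -c*A.

    For fixed (T1, c) the minimizing A^2 has a closed form, so only a
    two-dimensional search over (T1, c) remains. Returns (objective, A).
    """
    g = [(1.0 - c * math.exp(-ti / t1)) ** 2 for ti in ti_ms]
    y2 = [v * v for v in y]
    a_sq = sum(p * q for p, q in zip(y2, g)) / sum(q * q for q in g)
    obj = sum((p - a_sq * q) ** 2 for p, q in zip(y2, g))
    return obj, math.sqrt(a_sq)


def fit_ir_t1(ti_ms, y, n_grid=41, max_iter=2000):
    """Coarse grid over (log T1, c), then a pattern search that halves its
    step whenever no neighbor improves (a stand-in for Newton refinement)."""
    # Assumed search box: T1 in 10..5000 ms, c = -B/A in 1.0..2.2.
    u_lo, c_lo = math.log(10.0), 1.0
    du = (math.log(5000.0) - u_lo) / (n_grid - 1)
    dc = 1.2 / (n_grid - 1)

    def f(u, c):
        return ir_objective(math.exp(u), c, ti_ms, y)[0]

    best = min((f(u_lo + i * du, c_lo + j * dc), u_lo + i * du, c_lo + j * dc)
               for i in range(n_grid) for j in range(n_grid))
    for _ in range(max_iter):
        _, ub, cb = best
        trial = min((f(ub + i * du, cb + j * dc), ub + i * du, cb + j * dc)
                    for i in (-1, 0, 1) for j in (-1, 0, 1))
        if trial[0] < best[0]:
            best = trial                      # move to the better neighbor
        else:
            du, dc = du / 2.0, dc / 2.0       # no improvement: refine the step
            if du < 1e-12:
                break
    t1, c = math.exp(best[1]), best[2]
    a = ir_objective(t1, c, ti_ms, y)[1]
    return t1, a, -c * a


# Noise-free synthetic data: T1 = 1000 ms, A = 100, B = -190 (illustrative values).
ti_ms = [50, 75, 100, 125, 150, 250, 1000, 1500, 2000, 3000]
y = [abs(100.0 - 190.0 * math.exp(-ti / 1000.0)) for ti in ti_ms]
t1_fit, a_fit, b_fit = fit_ir_t1(ti_ms, y)
print(round(t1_fit, 1), round(a_fit, 1), round(b_fit, 1))
```

Because the objective depends only on the square of the model, (A, B) and (-A, -B) fit equally well; the sketch returns the solution with A positive, and the T1 estimate is unaffected by this sign ambiguity.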

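The analysis of VFA data proceeded along similar lines. In this case, we modeled the measured MRI signal as a function of flip angle by the Ernst equation (see [26] or, for example, [27]):

$$z_m = M_0 \sin\alpha_m \, \frac{1 - e^{-TR/T_1}}{1 - \cos\alpha_m \, e^{-TR/T_1}} + n_m \tag{4}$$

where $z_m$ is the measured signal at the flip angle $\alpha_m$, TR is a fixed experimental parameter, $n_m$ is measurement noise, and $M_0$ is the signal corresponding to the ROI equilibrium magnetization. Once again, estimates of T1 and $M_0$ are determined by non-linear least squares, minimizing the sum:

$$\min_{T_1, M_0} \sum_m \left[ z_m - M_0 \sin\alpha_m \, \frac{1 - e^{-TR/T_1}}{1 - \cos\alpha_m \, e^{-TR/T_1}} \right]^2 \tag{5}$$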

As above, we determined initial values of T1 and M0 by grid search and refined these by Newton iteration.
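The VFA fit can be sketched in the same spirit, assuming the Ernst signal model z = M0 sin(a) (1 - E1) / (1 - E1 cos(a)) with E1 = exp(-TR/T1). This is an illustration rather than the authors' code: because the model is linear in M0, the sketch eliminates M0 in closed form and searches over T1 only, and a step-halving one-dimensional pattern search replaces the Newton iteration. The function names, flip angles, TR, and search range are assumptions chosen for the example.

```python
import math


def ernst(alpha_deg, t1, m0, tr_ms):
    """Spoiled gradient echo signal (Ernst equation)."""
    e1 = math.exp(-tr_ms / t1)
    a = math.radians(alpha_deg)
    return m0 * math.sin(a) * (1.0 - e1) / (1.0 - e1 * math.cos(a))


def vfa_objective(t1, alphas_deg, z, tr_ms):
    """Sum of squared residuals for a given T1, with M0 eliminated.

    The model is linear in M0, so its least-squares value is closed-form.
    Returns (objective, M0).
    """
    f = [ernst(a, t1, 1.0, tr_ms) for a in alphas_deg]
    m0 = sum(p * q for p, q in zip(z, f)) / sum(q * q for q in f)
    return sum((p - m0 * q) ** 2 for p, q in zip(z, f)), m0


def fit_vfa_t1(alphas_deg, z, tr_ms, n_grid=200, max_iter=2000):
    """Log-spaced grid over T1, then a step-halving 1-D pattern search
    (a derivative-free stand-in for the Newton refinement)."""
    u_lo = math.log(10.0)                         # assumed range: 10..5000 ms
    du = (math.log(5000.0) - u_lo) / (n_grid - 1)

    def f(u):
        return vfa_objective(math.exp(u), alphas_deg, z, tr_ms)[0]

    best = min((f(u_lo + i * du), u_lo + i * du) for i in range(n_grid))
    for _ in range(max_iter):
        trial = min((f(best[1] + s * du), best[1] + s * du) for s in (-1, 1))
        if trial[0] < best[0]:
            best = trial
        else:
            du /= 2.0
            if du < 1e-12:
                break
    t1 = math.exp(best[1])
    return t1, vfa_objective(t1, alphas_deg, z, tr_ms)[1]


# Noise-free synthetic data: T1 = 1000 ms, M0 = 500 (illustrative values).
alphas = [2, 5, 10, 15, 20, 25, 30]   # flip angles in degrees (assumed)
tr = 6.0                              # TR in ms (assumed)
z = [ernst(a, 1000.0, 500.0, tr) for a in alphas]
t1_fit, m0_fit = fit_vfa_t1(alphas, z, tr)
print(round(t1_fit, 1), round(m0_fit, 1))
```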

Statistical methods

We compared the T1 measurements from IR and VFA scans to phantom reference values obtained by NIST’s MRI Biomarker Measurement Service based on gold-standard NMR [21]. This service provides measurements with less than 1.5% error traceable to the international system of units; we refer to these NMR measurements as “true values” [28] and indicate them by T1,NMR. We used Analysis of Variance (ANOVA) to test for statistical differences between T1 measurements grouped according to MRI scanner manufacturers and static field strengths. We referred to such groupings as “vendor” and “field” respectively. We performed all analyses using the Statistical Toolbox within MATLAB (The MathWorks, Inc., Natick, MA, USA).

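As true values of T1 span two orders of magnitude, we performed our analysis on normalized errors to create a uniform scale for all measurements. For each ROI in the NiCl2 array, we define the normalized measurement error as the deviation of the MRI-measured T1 from the reference value, relative to the reference value:

$$\Delta = \frac{T_1 - T_{1,\mathrm{NMR}}}{T_{1,\mathrm{NMR}}} \tag{6}$$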

We conducted all hypothesis tests on various pooled averages of this normalized deviation.

Our statistical analysis tested the null hypothesis that the mean normalized measurement errors were the same for all groups. The hypothesis test for normalized group mean differences was performed using the anovan function in MATLAB. A two-way ANOVA indicated significant interactions between vendor and field grouping variables. As a result, we used a simple main effects model [29–31], considering the data from the two field values (1.5 T and 3 T) separately. We analyzed the pairwise differences between group means using the multcompare command with Tukey-Kramer’s honestly significant difference statistics. The significance level for all statistical tests was α = 0.05.
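As a rough illustration of the grouping logic, and not a replacement for MATLAB's anovan or multcompare, the sketch below computes normalized errors for one sphere at one field strength and the one-way ANOVA F statistic across vendor groups. It assumes the normalized error is the measured T1 minus the reference T1, divided by the reference T1. The T1 readings are invented for the example, and the 1489 ms reference is the 3 T value quoted in the Results for sphere 2. P-values and Tukey-Kramer comparisons are omitted because they require distribution functions outside the Python standard library.

```python
def normalized_error(t1_measured_ms, t1_reference_ms):
    """Relative deviation of a measured T1 from its reference value."""
    return (t1_measured_ms - t1_reference_ms) / t1_reference_ms


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA.

    groups : list of lists, one list of observations per group (vendor)
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = len(groups) - 1, n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


t1_reference = 1489.0                      # ms, sphere 2 at 3 T (from the text)
readings_ms = {                            # invented readings, one list per vendor
    "vendor C": [1600.0, 1620.0, 1580.0],
    "vendor D": [1300.0, 1250.0, 1280.0],
    "vendor E": [1590.0, 1610.0, 1630.0],
}
errors = [[normalized_error(t, t1_reference) for t in group]
          for group in readings_ms.values()]
f_stat, df_b, df_w = one_way_anova_f(errors)
print(round(f_stat, 1), df_b, df_w)
```

A large F relative to the F distribution with (df_b, df_w) degrees of freedom indicates that at least one vendor's mean normalized error differs from the others; identifying which pairs differ is the role of the pairwise comparison step.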

Results

The IR method had minor deviations from the NMR-measured T1 value at both 1.5 T and 3 T (Figs 2 and 3). At both field strengths, the IR method both over- and underestimated the true T1 as indicated by the positive and negative bias in the figures. At 1.5 T, there were no statistically significant differences between vendors (Table 4). At 3 T, Vendor E is biased higher than Vendors C and D with significant differences (Table 5) over a true T1 range of 65 ms to 2033 ms. This range of T1 times spans multiple tissue types, including white matter, grey matter, muscle, myocardium, prostate, and fibroglandular tissues.

Fig 2. Inversion recovery measurements at 1.5 T.

The inversion recovery (IR) measurements at 1.5 T both over- and underestimated the T1,NMR. The circles represent the within group means, and the error bars are 95% confidence intervals about these means. The IR measurements, especially in the range of physiological T1 values (~250 ms for adipose tissue to 1800 ms for grey matter) are biased approximately 5% high. Both vendors exhibited this bias; there are no significant differences between them throughout the entire range of T1 times spanned by the ISMRM/NIST phantom array (Table 4).

https://doi.org/10.1371/journal.pone.0252966.g002

Fig 3. Inversion recovery measurements at 3 T.

At 3 T, the inversion recovery (IR) measurements generally overestimated the T1,NMR. The circles represent the within group means, and the error bars are 95% confidence intervals about these means. There were no differences between vendors C and D. By contrast, vendor E is biased almost 10% higher than vendors C and D for T1 values in the physiologically relevant range. Please see Table 5 for tests of significance.

https://doi.org/10.1371/journal.pone.0252966.g003

The VFA measurements of T1 exhibited substantially more bias and less reproducibility than the IR measurements. The relative errors for each field strength and vendor are shown in Figs 4 and 5. Note that the vertical axes for these plots span twice the range of the corresponding IR figures. At 1.5 T, VFA has a broader range of deviation than IR, but the only significant differences between vendors A and B occur at very short T1 times (Table 4). By contrast, at 3 T, the VFA measurements for vendor D are significantly different from those of the other two vendors (C, E) for most of the samples throughout the clinically relevant range (examples of physiological values are given in Fig 6). The bias is unpredictable, as vendor D underestimates the T1 value while vendors C and E overestimate it. Finally, there is a variation in the errors correlated with the spatial position of the ROIs within the phantom. This effect manifests as an oscillation visible in VFA measurements for all field values and vendors. However, it is most pronounced at 3 T for vendor D. The four samples with the shortest T1 values are arranged in a square grid in the center of the phantom, and the remaining ten samples are placed in a circle around the outside of the phantom (Fig 1). The vendor D samples with the largest underestimation of T1 are located approximately at the “chin” (Fig 1; sample spheres 5–7).

Fig 4. Variable flip angle measurements at 1.5 T.

The variable flip angle (VFA) measurements at 1.5 T had a broader range of deviations than the IR measurements (Fig 2), and again both over- and underestimated the T1,NMR. The circles represent the within group means, and the error bars are 95% confidence intervals about these means. There were significant (95% CI) differences between Vendors A & B for the two shortest T1 relaxation times; however, the T1 relaxation time of those spheres is below those values typically measured in the body.

https://doi.org/10.1371/journal.pone.0252966.g004

Fig 5. Variable flip angle measurements at 3 T.

At 3 T, the variable flip angle (VFA) measurements had a much broader range of deviations than the IR measurements (Fig 3). The circles represent the within group means, and the error bars are 95% confidence intervals about these means. Vendor D is significantly (95% CI) different from both vendor C and vendor E for many spheres; p-values are given in Table 5. Vendors C and E generally overestimated the T1,NMR, while vendor D underestimated it. Finally, we observe a pattern in the vendor D deviation: the greatest deviation (largest underestimation) is for samples with T1 relaxation times of 260 ms, 368 ms, and 514 ms, which are located in the “chin” of the phantom (Fig 1, sample spheres 5–7).

https://doi.org/10.1371/journal.pone.0252966.g005

Fig 6. Reported tissue properties at 3 T.

Physiological values of normal and diseased tissue from [7–9]. Unless otherwise noted by a superscript, the reference is [9].

https://doi.org/10.1371/journal.pone.0252966.g006

Finally, we illustrate how these vendor differences could potentially impact clinical diagnostics. Consider a scenario in which T1 measurements are used to distinguish between low grade glioma (LGG) and glioblastoma multiforme (GBM). In a previous study, de Blank et al. indicated that at 3 T, LGG tissue can be characterized as having a T1 of 1355 ms ± 187 ms whereas GBM tissue has a T1 of 1863 ms ± 70 ms [8]. The ranges of T1 times associated with these tissues are shown in Figs 6 and 7. This range of T1 times is approximately covered by spheres 1 and 2 of the NiCl2 array (2033 ms and 1489 ms, respectively). For T1 times spanned by these two spheres, we assume that the relative bias and dispersion are constant for all measurement modalities and vendors. From Fig 3, for IR measurements at 3 T, we estimated these relative biases and dispersions to be: 2% positive bias for vendors C and D, and 10% positive bias for vendor E; all vendors exhibiting a ± 7% range of dispersion. Turning to the VFA measurements at 3 T, in Fig 5 we estimated these relative biases and dispersions as: 15% negative bias for vendor D in contrast to 7% positive bias for vendors C and E; all vendors exhibiting a ± 10% range of dispersion. Applying this bias and dispersion to the T1 values reported by de Blank et al. [8] results in T1 measurements that could be expected as per our current study (details in S1 File). We plotted the expected measurements alongside the reported ranges in Fig 7. The range of errors measured using IR is small, while the range of errors measured using VFA is significantly greater. If sites using vendor E wished to implement a threshold determination between LGG and GBM using T1 IR, it could be reasonable to do so by shifting the threshold based on the observed measurement bias. Similarly, if sites using vendors C and E wished to implement the threshold using T1 VFA, it may be reasonable to shift by the observed measurement bias. However, concerning T1 measurement by VFA on vendor D, the dispersion of T1 values is so great as to make it impossible to distinguish between the LGG and GBM tissue types with any confidence. What is more concerning, if the underestimate of T1 VFA exhibited by vendor D is not taken into account, then one could inaccurately diagnose a glioblastoma as a low-grade glioma, an incorrect determination with serious impacts to patient management.
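The arithmetic behind this concern can be sketched as follows. This is only one simple way to combine the quoted numbers and is not necessarily the procedure in S1 File: the function names are hypothetical, and adding bias and dispersion as fractions of the true T1 is an assumption of the sketch. The tissue values are those reported by de Blank et al. [8]; the bias and dispersion values are the VFA estimates for vendor D and the largest IR bias quoted above.

```python
def measured_interval(true_t1_ms, bias, dispersion):
    """Range of readings expected for a tissue with the given true T1.

    bias and dispersion are fractions (e.g. -0.15 and 0.10); combining them
    additively is an assumption of this sketch.
    """
    return (true_t1_ms * (1.0 + bias - dispersion),
            true_t1_ms * (1.0 + bias + dispersion))


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


lgg_reported = (1355.0 - 187.0, 1355.0 + 187.0)   # LGG: 1355 ms +/- 187 ms
gbm_mean = 1863.0                                 # GBM mean T1 in ms

# VFA on vendor D: 15% negative bias, +/- 10% dispersion.
gbm_vfa_vendor_d = measured_interval(gbm_mean, -0.15, 0.10)
# IR with the largest quoted bias: 10% positive bias, +/- 7% dispersion.
gbm_ir_high_bias = measured_interval(gbm_mean, 0.10, 0.07)

print(gbm_vfa_vendor_d, overlaps(gbm_vfa_vendor_d, lgg_reported))
print(gbm_ir_high_bias, overlaps(gbm_ir_high_bias, lgg_reported))
```

Under these assumptions, an uncorrected VFA reading of a typical glioblastoma on vendor D can fall inside the reported low grade glioma range, whereas the IR reading stays above it.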

Fig 7. Impact of vendor differences in T1 measurement.

Here, we have plotted the reported T1 of low grade glioma and glioblastomas [8] and an estimate for each vendor system of the diagnostic range for low grade glioma and glioblastomas based on the bias and dispersion of that system. The challenge is to define a diagnostic criterion based on T1 to distinguish low grade glioma from glioblastoma that would be suitable across vendor systems. If T1 relaxation time is measured using IR (A), the overestimate of values by vendor E is small compared to the range of physiological values, and as a result, T1 measured by IR could be a reliable measure across vendor systems. However, if the VFA method is used (B), the underestimate of T1 on vendor D could inaccurately diagnose a glioblastoma as a low-grade glioma, an incorrect determination with serious impacts to patient management.

https://doi.org/10.1371/journal.pone.0252966.g007

Across all measurements, the reported temperature of either the MRI room or of the bulk water in the phantom ranged from 17.1°C to 23.3°C. Previous research demonstrated that the T1 of NiCl2 solutions varies by ± 4% over this experimental range [20]. Therefore, we expect that the variation of T1 due to temperature is negligible compared to other sources of measurement error (see S1 Fig for additional details).

Discussion

This study examined two T1 methods, the reference standard (IR) and a commonly used approach (VFA), and demonstrated that quantitative MRI measurement of T1 is potentially subject to significant bias and variation. There was no consistent pattern of discrepancy between vendors, and as a result, clinicians are unable to translate a diagnostic threshold T1 value determined on one MRI system to other MRI systems. The ability to compare measured values to known T1 values in a phantom is critical for disentangling various sources of bias and variation.

We included a range of MRI systems representative of clinical practice and analyzed the deviations in measured T1 from the reference T1 values in the ISMRM/NIST system phantom. Previous studies, which found less variation in measured T1 across sites, used six or fewer MRI systems and were highly controlled, in some cases programming the exact same sequence across two platforms from a single vendor rather than using a product sequence [11,12,32]. Similar to studies undertaken by Bane et al. [17] and van Houdt et al. [19], our study included multiple vendor systems and multiple systems within a vendor, including product or platform variations and software variations. This study included two vendors at 1.5 T and three vendors at 3 T, with more equal representation across vendors than these previous efforts. Studies such as this one establish lower bounds on the range of errors that one could expect for in vivo measurements.

The largest variations and bias in T1 measurement were for VFA measurement at 3 T. We suspect that a sizable component of the error in the VFA measurement could be due to imperfect B1 fields and the associated nonregular slice profile [33], as VFA measurements are known to be very susceptible to this source of error [34,35]. Flip angle is directly proportional to B1 field strength, and the relative error in T1 is approximately twice the relative error in flip angle. This factor of two holds as a rule of thumb over a wide range of T1, as reported in [27] and confirmed by our numerical experiments. For example, if the RF pulse implementation leads to an effective 10% under-rotation for all angles (e.g., a nominal 20 degree flip angle is actually 18 degrees, and so on for all other angles in the VFA sequence), then T1 measurements would be offset by approximately 20% in the same direction, e.g., a 2000 ms T1 would be measured as approximately 1600 ms. This same relative error would occur for any other nominal T1, and over-rotation results in over-estimation of measured T1 with the same sensitivity factor. We note that the NiCl2 array is not at isocenter in the A/P direction, which can result in less homogeneous B1 and B0. B1 variation could reasonably explain the range of T1 biases observed in Fig 5, and the apparent correlation of these biases with the location of the sample sphere within the scanner adds support to this theory. However, additional measurements including a B1 field map would be needed for a more conclusive analysis.
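The factor-of-two sensitivity described above can be illustrated with a short noiseless simulation. The sketch below is not the analysis code used in this study. It assumes an ideal spoiled gradient echo signal (the Ernst equation), a hypothetical set of nominal flip angles and a hypothetical TR of 6 ms, and a simple unweighted linear fit of S/sin(α) against S/tan(α). In the small-angle limit, a uniform flip angle scaling κ gives an apparent T1 of roughly κ² times the true T1, which is where the factor of two comes from for small errors.

```python
import numpy as np


def spgr_signal(alpha_rad, t1_ms, tr_ms, m0=1.0):
    """Ideal steady-state spoiled gradient echo signal (Ernst equation), ignoring T2* decay."""
    e1 = np.exp(-tr_ms / t1_ms)
    return m0 * np.sin(alpha_rad) * (1.0 - e1) / (1.0 - e1 * np.cos(alpha_rad))


def fit_t1_linearized(signals, alpha_rad, tr_ms):
    """Estimate T1 from the linear form S/sin(a) = E1 * S/tan(a) + M0 * (1 - E1)."""
    x = signals / np.tan(alpha_rad)
    y = signals / np.sin(alpha_rad)
    slope, _intercept = np.polyfit(x, y, 1)  # slope estimates E1 = exp(-TR/T1)
    return -tr_ms / np.log(slope)


def apparent_t1(true_t1_ms, b1_scale, nominal_deg=(2.0, 5.0, 10.0, 15.0, 20.0), tr_ms=6.0):
    """T1 obtained when signals are generated with scaled angles but fitted with nominal ones.

    nominal_deg and tr_ms are illustrative assumptions, not the protocol of this study.
    """
    nominal = np.deg2rad(np.asarray(nominal_deg, dtype=float))
    actual = b1_scale * nominal  # what the RF pulse really delivers
    signals = spgr_signal(actual, true_t1_ms, tr_ms)
    return fit_t1_linearized(signals, nominal, tr_ms)


if __name__ == "__main__":
    for scale in (0.9, 1.0, 1.1):
        t1 = apparent_t1(2000.0, scale)
        print(f"B1 scale {scale:.1f}: apparent T1 = {t1:.0f} ms "
              f"({100.0 * (t1 / 2000.0 - 1.0):+.1f}%)")
```

Under these assumptions, a 10% under-rotation should give an apparent T1 roughly 20% below the true value, and a 10% over-rotation roughly 20% above it. Noise, slice profile effects, and incomplete spoiling are deliberately left out of this sketch.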

The lack of B1 maps is a primary limitation of this study. At the time of data collection, B1 mapping was not commonly available on all systems and was therefore omitted. Since then, other groups have clearly demonstrated that T1 mapping via VFA requires a B1 map [10,16,36], and some vendor-supplied correction methods are available [37], though even recent multi-site studies were unable to implement a product B1 map sequence on all systems [32]. Without B1 maps integrated into product T1 VFA, it will be challenging to implement T1 VFA for diagnostic purposes, as demonstrated in our analysis in Fig 7.

This work sets the foundation to validate and provide traceability for advanced quantitative MRI methods. We note that one limitation of reference phantom studies is that they cannot be used to assess the sensitivity of the measurement to physiological effects. Prior to in vivo work, future studies could use these reference phantoms to assess the stability of measurements under changes in sequence parameters (e.g., voxel sizes, matrix sizes) and to assess vendor-specific quantitative MRI methods.

Conclusion

Longitudinal relaxation time is one example of a variety of quantitative MRI parameters that are potentially measurable using clinical MRI systems. We suggest establishing rigorous quality control procedures for quantitative MRI to promote confidence and stability in associated measurement techniques and to enable translation of measurement thresholds for diagnostic, disease progression, and treatment monitoring from the research center to the entire clinical community and back. Standard phantoms that are curated and have traceable uncertainties are an important component of the rigorous quality control procedures required to validate and provide uncertainties for qMRI methods. We note that similar calls have been made previously by other researchers [38,39], and we strongly support these efforts.

Supporting information

S1 Fig. NMR-measured T1 variation with temperature.

Here we show the T1,NMR variation with temperature as a percent deviation from the T1,NMR at 20°C. Please note that these measurements are for a different batch of NiCl2 solutions than the phantoms used in this study. However, the solutions were made to the same specifications, and we believe this to be representative of the solutions in this study.

https://doi.org/10.1371/journal.pone.0252966.s001

(TIF)

S1 File. Details for calculations in Fig 7.

Here we detail the analyses and calculations that resulted in Fig 7.

https://doi.org/10.1371/journal.pone.0252966.s002

(PDF)

References

  1. Bydder GM, Steiner RE, Young IR, Hall AS, Thomas DJ, Marshall J, et al. Clinical NMR imaging of the brain: 140 cases. AJR American journal of roentgenology. 1982;139(2):215–36. pmid:6979874
  2. Komiyama M, Yagura H, Baba M, Yasui T, Hakuba A, Nishimura S, et al. MR imaging: possibility of tissue characterization of brain tumors using T1 and T2 values. AJNR American journal of neuroradiology. 1987;8(1):65–70. pmid:3028112
  3. Just M, Thelen M. Tissue characterization with T1, T2, and proton density values: results in 160 patients with brain tumors. Radiology. 1988;169(3):779–85. pmid:3187000
  4. Kjaer L, Thomsen C, Gjerris F, Mosdal B, Henriksen O. Tissue characterization of intracranial tumors by MR imaging. In vivo evaluation of T1- and T2-relaxation behavior at 1.5 T. Acta radiologica. 1991;32(6):498–504. pmid:1742132
  5. Newman S, Haughton VM, Yetkin Z, Breger R, Czervionke LF, Williams AL, et al. T1, T2 and proton density measurements in the grading of cerebral gliomas. European Radiology. 1993;3:49–52.
  6. Araki T, Inouye T, Suzuki H, Machida T, Iio M. Magnetic resonance imaging of brain tumors: measurement of T1. Work in progress. Radiology. 1984;150(1):95–8. pmid:6689793
  7. Badve C, Yu A, Dastmalchian S, Rogers M, Ma D, Jiang Y, et al. MR Fingerprinting of Adult Brain Tumors: Initial Experience. AJNR American journal of neuroradiology. 2017;38(3):492–9. pmid:28034994
  8. de Blank P, Badve C, Gold DR, Stearns D, Sunshine J, Dastmalchian S, et al. Magnetic Resonance Fingerprinting to Characterize Childhood and Young Adult Brain Tumors. Pediatr Neurosurg. 2019;54(5):310–8. pmid:31416081
  9. Bojorquez JZ, Bricq S, Acquitter C, Brunotte F, Walker PM, Lalande A. What are normal relaxation times of tissues at 3 T? Magnetic resonance imaging. 2017;35:69–80. pmid:27594531
  10. Lee Y, Callaghan MF, Acosta-Cabronero J, Lutti A, Nagy Z. Establishing intra- and inter-vendor reproducibility of T1 relaxation time measurements with 3T MRI. Magnetic resonance in medicine: official journal of the Society of Magnetic Resonance in Medicine/Society of Magnetic Resonance in Medicine. 2019;81(1):454–65. pmid:30159953
  11. Gracien RM, Maiworm M, Bruche N, Shrestha M, Noth U, Hattingen E, et al. How stable is quantitative MRI?—Assessment of intra- and inter-scanner-model reproducibility using identical acquisition sequences and data analysis programs. NeuroImage. 2020;207:116364. pmid:31740340
  12. Weiskopf N, Suckling J, Williams G, Correia MM, Inkster B, Tait R, et al. Quantitative multi-parameter mapping of R1, PD(*), MT, and R2(*) at 3T: a multi-center validation. Front Neurosci. 2013;7:95. pmid:23772204
  13. Sullivan DC, Obuchowski NA, Kessler LG, Raunig DL, Gatsonis C, Huang EP, et al. Metrology Standards for Quantitative Imaging Biomarkers. Radiology. 2015:142202.
  14. Captur G, Gatehouse P, Keenan KE, Heslinga FG, Bruehl R, Prothmann M, et al. A medical device-grade T1 and ECV phantom for global T1 mapping quality assurance-the T1 Mapping and ECV Standardization in cardiovascular magnetic resonance (T1MES) program. J Cardiovasc Magn Reson. 2016;18(1):58. pmid:27660042
  15. Zhang Q, Werys K, Popescu IA, Biasiolli L, Ntusi NAB, Desai M, et al. Quality assurance of quantitative cardiac T1-mapping in multicenter clinical trials—A T1 phantom program from the hypertrophic cardiomyopathy registry (HCMR) study. Int J Cardiol. 2021;330:251–8. pmid:33535074
  16. Stikov N, Boudreau M, Levesque IR, Tardif CL, Barral JK, Pike GB. On the accuracy of T1 mapping: searching for common ground. Magnetic resonance in medicine: official journal of the Society of Magnetic Resonance in Medicine/Society of Magnetic Resonance in Medicine. 2015;73(2):514–22. pmid:24578189
  17. Bane O, Hectors SJ, Wagner M, Arlinghaus LL, Aryal MP, Cao Y, et al. Accuracy, repeatability, and interplatform reproducibility of T1 quantification methods used for DCE-MRI: Results from a multicenter phantom study. Magnetic resonance in medicine: official journal of the Society of Magnetic Resonance in Medicine/Society of Magnetic Resonance in Medicine. 2018;79(5):2564–75.
  18. Keenan KE, Gimbutas Z, Dienstfrey A, Stupic KF. Assessing effects of scanner upgrades for clinical studies. Journal of magnetic resonance imaging: JMRI. 2019;50(6):1948–54. pmid:31111981
  19. van Houdt PJ, Kallehauge JF, Tenderup K, Nout R, Zaletelj M, Tadic T, et al. Phantom-based quality assurance for multicenter quantitative MRI in locally advanced cervical cancer. Radiotherapy and Oncology. 2020. pmid:32931890
  20. Stupic KF, Ainslie M, Boss MA, Charles C, Dienstfrey AM, Evelhoch JL, et al. A standard system phantom for magnetic resonance imaging. Magnetic resonance in medicine: official journal of the Society of Magnetic Resonance in Medicine/Society of Magnetic Resonance in Medicine. 2021. pmid:33847012
  21. Boss MA, Dienstfrey AM, Gimbutas Z, Keenan KE, Kos AB, Splett JD, et al. Magnetic Resonance Imaging Biomarker Calibration Service: Proton Spin Relaxation Times. NIST Special Publication 250–97. National Institute of Standards and Technology; 2018. Report No.: 97 Contract No.: SP-250-97.
  22. Chenevert TL, Malyarenko DI, Newitt D, Li X, Jayatilake M, Tudorica A, et al. Errors in Quantitative Image Analysis due to Platform-Dependent Image Scaling. Translational oncology. 2014;7(1):65–71. pmid:24772209
  23. Cabana JF, Gu Y, Boudreau M, Levesque IR, Atchia Y, Sled JG, et al. Quantitative Magnetization Transfer Imaging Made Easy with qMTLab: Software for Data Simulation, Analysis, and Visualization. Concept Magn Reson A. 2015;44a(5):263–77.
  24. Karakuzu A, Boudreau M, Duval T, Leppert I, Boshkovski T, Pike GB, et al. qMRLab. http://qmrlab.org.
  25. Quantitative Magnetic Resonance Imaging. 1st ed: Academic Press; 2020. 1092 p.
  26. Ernst FJ, Warnock RL, Wali KC. Linear and Nonlinear Mass-Difference Effects in a Model of Baryon Multiplets. Phys Rev. 1966;141(4):1354-+.
  27. Helms G, Dathe H, Weiskopf N, Dechent P. Identification of signal bias in the variable flip angle method by linear display of the algebraic Ernst equation. Magnetic resonance in medicine: official journal of the Society of Magnetic Resonance in Medicine/Society of Magnetic Resonance in Medicine. 2011;66(3):669–77. pmid:21432900
  28. Kessler LG, Barnhart HX, Buckler AJ, Choudhury KR, Kondratovich MV, Toledano A, et al. The emerging science of quantitative imaging biomarkers terminology and definitions for scientific studies and regulatory submissions. Stat Methods Med Res. 2015;24(1):9–26. pmid:24919826
  29. Page MC, Braver SL, MacKinnon DP. Levine’s guide to SPSS for analysis of variance. 2nd ed: Lawrence Erlbaum Associates Publishers; 2003.
  30. Dunn OJ, Clark VA. Applied Statistics: Analysis of Variance and Regression. New York: Wiley; 1974.
  31. Kirk RE. Experimental Design: Procedures for the Behavioral Sciences. 3rd ed. Monterey, CA: Brooks/Cole Publishing; 1995.
  32. Leutritz T, Seif M, Helms G, Samson RS, Curt A, Freund P, et al. Multiparameter mapping of relaxation (R1, R2*), proton density and magnetization transfer saturation at 3 T: A multicenter dual-vendor reproducibility and repeatability study. Hum Brain Mapp. 2020.
  33. Zheng J, Venkatesan R, Haacke EM, Cavagna FM, Finn PJ, Li D. Accuracy of T1 measurements at high temporal resolution: feasibility of dynamic measurement of blood T1 after contrast administration. Journal of magnetic resonance imaging: JMRI. 1999;10(4):576–81. pmid:10508325
  34. Tsai WC, Kao KJ, Chang KM, Hung CF, Yang Q, Lin CE, et al. B1 Field Correction of T1 Estimation Should Be Considered for Breast Dynamic Contrast-enhanced MR Imaging Even at 1.5 T. Radiology. 2017;282(1):55–62. pmid:27479805
  35. Wang J, Qiu M, Constable RT. In vivo method for correcting transmit/receive nonuniformities with phased array coils. Magnetic resonance in medicine: official journal of the Society of Magnetic Resonance in Medicine/Society of Magnetic Resonance in Medicine. 2005;53(3):666–74. pmid:15723397
  36. Lee Y, Callaghan MF, Nagy Z. Analysis of the Precision of Variable Flip Angle T1 Mapping with Emphasis on the Noise Propagated from RF Transmit Field Maps. Front Neurosci. 2017;11:106. pmid:28337119
  37. Bliesener Y, Zhong X, Guo Y, Boss M, Bosca R, Laue H, et al. Radiofrequency transmit calibration: A multi-center evaluation of vendor-provided radiofrequency transmit mapping methods. Medical physics. 2019;46(6):2629–37. pmid:30924940
  38. Hanson CA, Kamath A, Gottbrecht M, Ibrahim S, Salerno M. T2 Relaxation Times at Cardiac MRI in Healthy Adults: A Systematic Review and Meta-Analysis. Radiology. 2020;297(2):344–51. pmid:32840469
  39. Partridge SC, Zhang Z, Newitt DC, Gibbs JE, Chenevert TL, Rosen MA, et al. Diffusion-weighted MRI Findings Predict Pathologic Response in Neoadjuvant Treatment of Breast Cancer: The ACRIN 6698 Multicenter Trial. Radiology. 2018:180273. pmid:30179110