Diffusion-weighted imaging (DWI) is extensively used in clinical radiology studies to monitor changes in water mobility that reflect altered tissue cellularity (1–3). These alterations often arise from malignancy (4–6) or in response to treatment (7–9). Quantitative parametric maps are derived on the basis of physical models for DWI signal dependence on diffusion gradient-weighting strength (denoted by b-value). A single-component diffusion model, most widely used by clinical oncology trials (7, 9, 10), assumes monoexponential DWI signal decay with increasing b-value, where the decay rate is quantified by apparent diffusion coefficient (ADC).
Diffusion kurtosis (11, 12) is a heuristic extension of the single-component model that introduces an additional quantitative parameter (apparent kurtosis coefficient, Kapp) to describe the degree of non-Gaussian deviation from monoexponential signal decay in tissue observed for certain in vivo structures and malignancies with increasing b-values (5, 13–15). These deviations are typically caused by the presence of cellular structures that substantially impede water mobility, leading to sustained DWI signal at high b-values (1, 11). Because typical diffusion kurtosis imaging (DKI) parameter fit is performed over a limited range of b-values (bmax < 3000 s/mm2), the derived diffusion and kurtosis values are “apparent” rather than absolute characteristics.
Recently there has been a surge of interest in the diffusion imaging community to evaluate Kapp as a noninvasive, surrogate biomarker of tissue microstructure (5, 13, 15–17). Unlike classic diffusion kurtosis in anisotropic brain tissue (11, 12), for nominally isotropic cancerous parenchyma, observed relatively high apparent kurtosis (0.8–1.7, for example, in head and neck or prostate and bladder cancers [5, 13–15]) is typically associated with tumor potency. To use DKI parameters as quantitative imaging biomarkers (QIBs) of tumor response to therapy in multicenter oncology trials (16, 17), the precision (repeatability) and accuracy (bias) of the potential QIBs need to be evaluated (18, 19) across multiple scanner platforms using a common scan protocol (20, 21). Construction of a novel phantom, one that provides true parameter values in the physiologically relevant ranges (5, 13–15), is the first step for the development of a repeatable multisite study protocol and the only means for the absolute bias estimate (20, 21).
The search for a viable DKI phantom has been ongoing for over a decade. The “natural” phantoms based on cream and asparagus (12, 13, 22) provide a single “untunable” kurtosis parameter value and perish quickly. Synthetic phantoms comprising polyethylene particle suspensions (23) and the most recently suggested microbead-impregnated gels (24) are more stable, but still suffer from a limited range of provided kurtosis parameters (Kapp < 0.7) and limited precision owing to microscopic sample inhomogeneity, chemical shift (23), and/or low signal-to-noise ratio (SNR) (short T2) (24). Our recent pilot study (25) proposed the development of novel kurtosis phantoms based on lamellar (amorphous layers) and vesicular (fluid-filled microsacs) phases of liquid crystal systems. These molecular constructs are composed of hydrophobic long-chain fatty alcohols and surfactants that mimic tissue cellularity by forming regularly spaced membranous mesostructures that impede water diffusion. Altering relative concentrations of restricted and free water pools allows a broad range of tunable apparent kurtosis parameters (25) with sufficient SNR for easy quantitative DKI scan protocol testing.
The purpose of the present multi-site study was to evaluate precision, reproducibility, and long-term stability of a novel (prototype) isotropic (i)DKI phantom, fabricated using four families of chemicals based on select combinations of vesicular and lamellar mesophases of liquid crystal materials with adjustable restricted diffusion fraction. The desired iDKI phantom characteristics included long-term temporal stability and homogeneous iDKI model parameters, tunable over physiologically relevant ranges.
To guide design of the next-generation phantom toward improved stability and reproducibility, this study included the following four steps: (1) development and fabrication of the prototype iDKI phantom using four families of liquid crystal materials and three negative controls, (2) implementation of a common quantitative iDKI test–retest scan protocol, (3) parametric map generation and intra-scan test–retest repeatability analysis to establish measurement precision, and (4) apparent (water ADC-based) temperature calibration for characterization of thermal versus temporal inter-scan variability.
Isotropic Diffusion Kurtosis Imaging (iDKI) Phantoms
Four quantitative iDKI phantom materials were chemically designed based on water solutions of paired long carbon-chain surfactants (cetyltrimethylammonium bromide [CTAB] or behentrimonium chloride [BTAC]) and alcohols (cetearyl [CA] or decyl [DEC]), as well as prolipid 161 [PL161] (see details in (25); online Supplemental Figure 1). These materials formed two uniformly distributed physical compartments with distinct (several orders of magnitude different) proton diffusion rates, resulting in apparent water diffusion (Dapp) and apparent kurtosis (Kapp) at high b-values. Major differences among the tested chemical designs were in the physical origin of restricted diffusion for lamellar structures versus vesicular phase materials (25) (see online Supplemental Figure 1). Three negative control, monoexponential diffusion samples were included based on polyvinylpyrrolidone (PVP) (26) solutions in water at 0%, 20%, and 40%. All seven phantom materials were individually housed in polypropylene vials (V1–V7), 150 mm in length and 25 mm in diameter, in a circular arrangement, submerged in a water bath in a 1-L plastic jar. The chemical phantom sample assignments for V1–V7 vials are provided in Table 1. An example axial-plane b = 0 image of the phantom with vial (region of interest [ROI]) labels is shown in Figure 1A. Three identical phantom prototypes were prepared using the same material batch, labeled for consistent scan geometry (see online Supplemental Figure 1), and shipped to each of the participating sites. The jars were filled with tap water on-site and scanned at ambient temperature.
Multicenter iDKI Phantom Studies
The prototype quantitative iDKI phantoms were scanned at three Quantitative Imaging Network (27) centers on four MRI scanners (two each at 1.5 T and 3 T) using a shared scan protocol over a period of six months. Consistent with the clinical iDKI scan protocol, the phantom scan instructions prescribed single-shot echo-planar imaging (SS EPI) acquisition of 3 orthogonal axial DWI directions with 11 b-values (b = 0, 50, 100, 200, 500, 800, 1000, 1500, 2000, 2500, 3000 s/mm2), using a 16-channel head coil. Other nominal acquisition parameters included the following: field of view (FOV) = 220 × 220 mm2 and echo time/repetition time = shortest/10 000 ms (actual minimum echo time varied from 93 ms to 107 ms across system scans owing to differences in gradient settings). The acquired section of the phantom ranged between 3 and 8 slices (3–5 mm in thickness) across the sites. Minor deviations from nominal scan protocol parameters among the sites were allowed with no effect on repeatability results. Test–retest acquisitions were performed with fixed scan protocol parameters with or without phantom repositioning, anywhere from several minutes to several days apart.
All acquired data were stored and distributed in Digital Imaging and Communications in Medicine (DICOM) format (28), and centralized analysis of multi-b trace DWI DICOM data was performed using quality control routines developed in MATLAB 7 (MathWorks, Natick, MA) (20). Noncompliant scans from two dates, with either a large deviation in FOV (two scans) or severe EPI susceptibility artifacts (one scan), precluded uniform ROI definition and were excluded from the analysis. The remaining ten sets of test–retest data (three from each of the 3 T and two from each of the 1.5 T scanners) and four (early) single-run acquisitions (from one 3 T scanner) were analyzed. Test–retest studies were used for intra-scan repeatability assessment, while single runs were included for inter-scan reproducibility and sample stability evaluation. Phantom temperatures were not controlled and varied with the scanner room (ambient) environment. Reference scan room temperature was recorded for four (later) study scans. One site (that provided single-run acquisitions) stored the phantom in a scan room over the course of the study, while the other two allowed the phantom to thermally equilibrate in the scan room (for one 3 T and two 1.5 T systems) for <24 hours before each scan.
Parametric Map Generation and Repeatability Analysis
The parametric maps of apparent diffusion, Dapp, and kurtosis, Kapp, (Figure 1B) were calculated using a linear least-squares fit of voxel DWI log-signal to a quadratic function of b-value, according to the iDKI model (11, 12), $\ln(S_b/S_0) = -D_{app} \cdot b + \frac{K_{app}}{6} \cdot (D_{app} \cdot b)^2$. Maximum b-value allowed in the fit was constrained by $b_{max} < 3/(K_{app} \cdot D_{app})$ to satisfy iDKI signal model convergence (11) and by $S_{b_{max}}/S_0 > 0.01$ (to ensure $\mathrm{SNR}_{b_{max}} > 2$). This yielded bmax = 1500 s/mm2 for CA-BTAC (V2), and bmax = 2000 s/mm2 for water (V4) and PL-161 (V6) vials (Figure 1, C and D). Absolute (residual) kurtosis bias of negative controls (Figure 1, A and D: V3, V4, and V5) was estimated as Kapp fit parameter deviation from zero (29) for monoexponential (zero kurtosis) diffusion materials.
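The fit described above can be sketched in a few lines. This is a minimal illustration and not the study's MATLAB routine: it assumes the log-signal is normalized by a measured S0 and fit through the origin (the original analysis may have fit an intercept instead), and the names `fit_idki` and `b_max_limit` are hypothetical.

```python
import math


def fit_idki(b_values, signals, s0, b_max=None):
    """Least-squares fit of ln(S_b/S_0) = -D*b + (K/6)*(D*b)**2.

    Assumption: no intercept term; b = 0 enters only through s0.
    b in s/mm^2; returns (D_app in mm^2/s, K_app dimensionless).
    """
    pts = []
    for b, s in zip(b_values, signals):
        if b <= 0 or (b_max is not None and b > b_max):
            continue
        # Rescale b to 10^3 s/mm^2 to keep the normal equations well conditioned.
        pts.append((b * 1e-3, math.log(s / s0)))

    # Normal equations for y = p1*b + p2*b**2
    s_b2 = sum(b ** 2 for b, _ in pts)
    s_b3 = sum(b ** 3 for b, _ in pts)
    s_b4 = sum(b ** 4 for b, _ in pts)
    s_by = sum(b * y for b, y in pts)
    s_b2y = sum(b ** 2 * y for b, y in pts)
    det = s_b2 * s_b4 - s_b3 ** 2
    p1 = (s_by * s_b4 - s_b2y * s_b3) / det
    p2 = (s_b2 * s_b2y - s_b3 * s_by) / det

    d_scaled = -p1                      # D_app in 10^-3 mm^2/s
    k_app = 6.0 * p2 / d_scaled ** 2    # from p2 = K * D^2 / 6
    return d_scaled * 1e-3, k_app


def b_max_limit(d_app, k_app):
    """Upper b-value bound 3 / (K_app * D_app) quoted for model convergence."""
    return 3.0 / (k_app * d_app)
```

In practice the bound from `b_max_limit` depends on the fitted parameters themselves, so one plausible use is to fit once, compute the bound, and refit with `b_max` set accordingly.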
Uniform areas of the b = 0 image were used to define ROIs within phantom vials, for example, avoiding susceptibility and parallel imaging artifacts. Seven circular ROIs (12 mm diameter, 155 pixels) were defined on DWI (b = 0) for phantom tubes separately for the test–retest runs, using in-house MATLAB-based tools to generate ROI statistics for repeatability estimates of the Dapp and Kapp parameters. Uniform ROI definition was noted to be challenging for V7 owing to multiple small air bubbles (Figure 1A) apparently formed within the sample volume. These air bubbles were observed to “migrate” between test–retest runs. For all scans, the defined ROI pixel locations were within ±30 mm from the magnet isocenter that minimized potential contribution of gradient system and offset-dependent DWI bias (20, 21).
Sample-specific within-sample coefficient of variation (wCV) was calculated from available test–retest studies (18, 19): $\mathrm{wCV} = \frac{\sqrt{2}}{N} \sum_{i=1}^{N} \frac{|X_{1,i} - X_{2,i}|}{X_{1,i} + X_{2,i}}$, where $X_{1,i}$ and $X_{2,i}$ were the mean-ROI test and retest (Dapp or Kapp) parameter values, respectively, for the i-th of N repeatability studies. The 95% confidence interval (CI) for an average value of measured parameter (X) was estimated as 1.96·wCV·ave(X), where the average was over all available (ten) test–retest DKI acquisitions (including less repeatable outliers) for each phantom vial. Single-acquisition 95% CI was also estimated for individual test–retest studies (N = 1) to assess systematic site and field dependencies. Bland–Altman (BA) repeatability analysis was performed for Dapp and Kapp across all test–retest samples (pool of 70). The overall BA limits of agreement (LOA) were calculated across all sample vials and test–retest scans excluding less repeatable scan “outliers.” These “outliers” were identified on the basis of test–retest value differences >1.5 interquartile ranges above the upper quartile or below the lower quartile of the 70 sample test–retest parameter difference histogram, corresponding to ±2.7 × SD for the normal error distribution (defined according to MATLAB “boxplot” default outliers).
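The repeatability statistics in this paragraph can be illustrated with the short sketch below. It makes several assumptions: the wCV uses the standard two-measurement form with a √2 factor, the BA limits are taken as bias ± 1.96 × SD of the retained differences, and quartiles come from Python's `statistics.quantiles`, whose default method is not guaranteed to match MATLAB's quartile convention. Function names are hypothetical.

```python
import math
import statistics


def wcv(pairs):
    """Mean within-sample CV over (test, retest) pairs, assuming the sqrt(2) form."""
    n = len(pairs)
    return math.sqrt(2.0) / n * sum(abs(x1 - x2) / (x1 + x2) for x1, x2 in pairs)


def ci95(pairs):
    """95% CI half-width: 1.96 * wCV * mean of all test and retest values."""
    mean_x = statistics.mean([x for pair in pairs for x in pair])
    return 1.96 * wcv(pairs) * mean_x


def bland_altman(pairs, whisker=1.5):
    """Bias and limits of agreement after dropping boxplot-style outliers.

    Differences beyond `whisker` IQRs outside the quartiles are excluded.
    Returns (bias, (lower LOA, upper LOA), list of outlier differences).
    """
    diffs = [x1 - x2 for x1, x2 in pairs]
    q1, _median, q3 = statistics.quantiles(diffs, n=4)
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - whisker * iqr, q3 + whisker * iqr
    kept = [d for d in diffs if lo_fence <= d <= hi_fence]
    outliers = [d for d in diffs if d < lo_fence or d > hi_fence]
    bias = statistics.mean(kept)
    half_width = 1.96 * statistics.stdev(kept)
    return bias, (bias - half_width, bias + half_width), outliers
```

Calling `wcv` or `ci95` with a single pair corresponds to the single-acquisition (N = 1) estimate mentioned above.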
Pearson correlation, R, was evaluated for the derived mean parameter values and their corresponding 95% CI estimates versus scan time (days from phantom manufacturing), apparent (water ADC-based) phantom temperature, and system magnetic field, to characterize the sources of variation in the measured iDKI parameters and identify materials with desired properties. Among covariates, date was not correlated to temperature, allowing independent analysis, while magnetic field had significant negative correlation to temperature (−0.64; pR = .02) as expected from dependence on scanner environment.
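For readers who want to reproduce this kind of covariate screening, the sketch below computes a Pearson R and the associated t-statistic with n − 2 degrees of freedom. It is a generic illustration with hypothetical function names; converting the t-statistic into a p-value (the pR reported here) requires a Student t distribution, which is left out to keep the example free of third-party dependencies.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)); infinite for a perfect correlation."""
    denom = max(1.0 - r * r, 0.0)
    if denom == 0.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / denom)
```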
Water ADC-Based Apparent Phantom Temperature
Comprehensive characterization of thermal phantom properties was beyond the scope of this study; however, assessment of apparent phantom temperature (Ta) was deemed useful for discrimination between temporal and thermal origin of inter-scan variation in measured kurtosis parameters across multiple sites and dates. To this end, the Ta of each phantom scan was self-calibrated retrospectively using the water diffusion coefficient based on the Speedy–Angell relation (30): $T_a = 215.05 \cdot ([\mathrm{ADC}/D_0]^{1/\gamma} + 1) - 273.15$, with γ = 2.063 and D0 = 0.01635 mm2/s, where Ta is in °C. The resulting Ta ranged between 19.5°C and 25.5°C (±1°C) (Figure 2). For ADC-based Ta, water ADC was fit as a slope of log-signal DWI dependence on b-value up to bmax = 1000 s/mm2 (to minimize SNR bias), and mean ADC value was measured from a 15 × 15 mm2 ROI defined on the central vial (V4, Figure 1A). ADC map vertical image “gradients” were observed for one system (online Supplemental Figure 2), with values increasing toward the posterior direction, indicative of phantom warming during the scan, possibly owing to contact from support pads or coil-induced heating. For this system, mean ADC values were used from three ROIs across the water-bath volume away from the posterior coil (see online Supplemental Figure 2).
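The ADC-to-temperature conversion can be sketched as follows. The example assumes the free-water Speedy–Angell form D(T) = D0·(T/TS − 1)^γ with TS = 215.05 K, γ = 2.063, and D0 = 1.635 × 10^-8 m²/s (0.01635 mm²/s); the D0 value is taken here as the commonly quoted fit constant and should be checked against reference (30). Function names are hypothetical.

```python
T_S = 215.05      # singular temperature, K
GAMMA = 2.063     # dimensionless exponent
D0 = 1.635e-2     # mm^2/s (= 1.635e-8 m^2/s); assumed literature value


def adc_to_temperature_c(adc_mm2_s):
    """Apparent temperature (deg C) from a water ADC given in mm^2/s."""
    return T_S * ((adc_mm2_s / D0) ** (1.0 / GAMMA) + 1.0) - 273.15


def temperature_c_to_adc(temp_c):
    """Free-water diffusion coefficient (mm^2/s) at a temperature in deg C."""
    return D0 * ((temp_c + 273.15) / T_S - 1.0) ** GAMMA
```

With these constants, `temperature_c_to_adc(20.0)` gives roughly 2.0 × 10^-3 mm²/s, in line with the familiar free-water value near room temperature.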
Four independent, direct water temperature (Tm) measurements (with alcohol-based thermometer, CI = ±0.5°C) were recorded by the sites and indicated ∼0.5°C positive bias of “apparent” Ta-values. (The ADC calculation using b-values up to 2000 s/mm2 resulted in +1°C bias for the same independent Tm-measurements.) Notwithstanding the limited accuracy and precision of the utilized ADC-based Ta-calibration procedure (CI = ±1°C, owing to relatively imprecise water ADC values [±0.03 × 10−3 mm2/s]), the derived apparent temperature, Ta, was sufficient to differentiate thermal from temporal trends in the measured diffusion kurtosis parameters. Adequacy of the water ADC-based Ta-calibration procedure was confirmed by observation of (expected) linear temperature dependence for ADC of the negative control PVP samples (PVP20%: V3 and PVP40%: V5; Figure 2) not used for internal calibration. Minor excursions from linearity in Figure 2 for ADC values of PVP20% (V3) and PVP40% (V5) vials offset from the isocenter compared with the centrally positioned water (V4) (Figure 1) further confirmed the negligible effect of scanner gradient system bias (20, 21) on inter-scan variability for the measured diffusion parameters.
Four different chemical designs tested for iDKI phantom materials in V1, V2, V6, and V7 (Table 1) exhibited restricted diffusion at high b-values (>1000 s/mm2), with DWI signals sustained above 20% of S0 (Figure 1C), and apparent kurtosis coefficient exceeding negative control bias owing to background noise (Figure 1B, Kapp). All materials allowed achievement of physiologically relevant apparent kurtosis parameter values (Kapp ranges, 0.8–1.7; Table 1). Consistent qualitative observations across sites were that phantom samples apparently degassed after 3–4 weeks from preparation. Less viscous materials formed large air bubbles outside the sample volume, while more viscous materials formed small visible bubbles within the sample volume (Figure 1A; V7). The in-volume microbubbles tended to migrate between test–retest runs, potentially contributing fluctuating measurement errors owing to susceptibility artifact.
BA analysis across all test–retest acquisitions and samples summarized in Figure 3 showed generally good agreement for apparent diffusion kurtosis parameters of all phantoms across centers, compared with those of negative controls. Excluding outliers, BA 95% LOAs were ±0.025 (×10−3 mm2/s) for Dapp and ±0.035 for Kapp. Negligible positive bias of 0.005 was observed for Dapp. This bias and lower repeatability for several Dapp (V4) and Kapp (V7) “outliers” (well outside the LOA) was likely because of finite noise floor interference (V4, high water ADC) and “migrating” air bubble artifacts (V7) for the corresponding test–retest scans.
Finite spread of the mean parameter values of each sample observed along the horizontal axis in Figure 3 reported on cross-system and cross-scan variability, further detailed for individual sample vials in Figure 4A. The scan-to-scan differences in Dapp of negative controls (V3, V4, V5, diamonds) were fully explained by the dependence on scanner ambient temperature (Figure 2; R > 0.97, pR < 1e-5). Absolute bias for Kapp of negative control materials (Figure 2, “x”, right axis) did not exceed 0.1 (without significant temperature dependence). The highest bias, independent of system (magnetic field), was observed for V5 (40%PVP sample) consistent with contrast-to-noise limits for this (low ADC) control. For V4, the bias was inversely dependent on the field strength (higher for 1.5 T Sys2 and Sys3), indicating its SNR origin. All measured Kapp for kurtosis samples (V1, V2, V6, and V7) exceeded negative control (zero kurtosis) bias. The estimated single test–retest 95% CIs (Figure 4B) for iDKI phantom materials ranged between 0.0003 and 0.15 (median 0.015), and (except for V7: Kapp and V4: Dapp outliers) these were not significantly different for Dapp versus Kapp and 1.5 T (Sys2, Sys3) versus 3 T (Sys1, Sys4) systems. CI(Dapp) (Figure 4B, diamonds, left axis) for V1 and V2 has shown minor correlation to measured Dapp values (R = 0.59, 0.57; pR = .07, .09), suggesting negligible contribution of model fit error to test–retest repeatability. For V2 sample, CI(Dapp) was significantly correlated to temperature (R = 0.67, pR = .033), indicating thermal noise sensitivity of this material. No other significant correlations were observed for the material-specific test–retest measurement errors (pR > 0.1).
The mean iDKI parameter values and derived 95% CIs observed across sites and scans are summarized in Table 1 for individual phantom components (including less repeatable “outliers”). Except for the V7 outlier Kapp (95%CI: 0.076), the apparent measurement precision of iDKI phantom parameters (CI[Dapp]: 0.013–0.022 (×10−3 mm2/s) and CI[Kapp]: 0.009–0.017) was as good (or better) than that of the negative controls (0.012–0.034 (×10−3 mm2/s) and 0.013–0.022). The achieved measurement precision was sufficient for analysis of systematic scan-to-scan variability sources for kurtosis phantom parameters (Figure 4A, V1, V2, V6, V7).
Table 2 summarizes correlation between mean parameters and apparent scan room temperature (Ta), day and field variables. The bulk of the significant correlation to magnetic “field” observed for sample V1 (negative for Dapp, and positive for Kapp) apparently originated from the systematic kurtosis parameter differences observed for Sys1 phantom stored in the scanner room versus two other sites using prolonged storage outside of their scanners (Sys2, Sys3, Sys4). Unambiguous interpretation of significant correlation to magnetic field observed for Kapp of V7 sample was not warranted owing to limited precision of the corresponding measurements (Table 1, CI[Kapp] = 0.076).
For vials showing significant thermal and temporal correlations in Table 2, the corresponding parameter dependence is plotted in Figure 5. Temperature dependence was a significant contributor to 10%–15% variation in Dapp (Figure 5A) of V2 and both “parallel” trends of V1 phantom materials. The deviations from linear trends were due to finite precision of self-calibrated Ta values and temporal stability. Marginally significant negative thermal correlation for Kapp of V1 was evidently caused by 2 high-Ta > 24°C measurements for Sys2 and Sys3 (Figure 5B), when this viscous material might not have reached thermal equilibrium. Temporal Dapp and Kapp parameter value trends (Figure 5, C and D) for V1, V2, and V7 materials exhibit initial slope (Dapp V1: 9%, V7: −22%; Kapp V1: −18%, V2: 6%) which settled into relatively stable values after a 3- to 4-week stabilization period (coincidental with observed active sample degassing). In contrast, V6 (PL161 lamellar phantom) diffusion kurtosis parameter values continued to drift toward 50%-higher Kapp and 20%-lower Dapp parameter values over the whole study period (without significant Ta-dependence). Interestingly, Dapp of V7 sample was also nominally independent of temperature. The site-dependent ∼0.2 (×10−3 mm2/s) “fork” in Dapp was observed consistently for thermal and temporal dependence of V1, strongly suggesting involvement of phantom storage conditions and/or low thermal conductivity of this viscous material.
All four of the different chemical designs evaluated for the prototype iDKI phantom (25) provided quantitative diffusion characteristics which could be tuned to a physiologically relevant range of parameters (Kapp > 0.8) observed for in vivo tumor tissue, for example, for head and neck, prostate, and bladder cancers (5, 13–15). The study confirmed feasibility of quantitative iDKI phantoms based on vesicular and lamellar phases of liquid crystal materials of different viscosity, and provided guidance toward a phantom product and multisite quality control protocol with improved precision and reproducibility. Included negative control samples allowed independent characterization of kurtosis bias and supplied internal standards for thermal diffusion-based calibration. Independent of chemical design, all kurtosis phantoms allowed sufficient SNR to avoid noise-bias or noise-limited precision in measured parameters. Phantom material-specific confidence intervals, derived from test–retest repeatability measurements, depended on sample preparation and handling more than on scan SNR, indicating a possible avenue for improvement. The achieved model parameter precision (95% CI) of 1%–3.5% was sufficient to study sources of systematic inter-scan variability related to thermal and temporal stability of the prototype phantom materials. These results will be used in future studies to improve development of the next-generation quantitative iDKI phantom for utilization in multicenter clinical trials.
Among studied chemical designs, CA-BTAC (V2) phantom has shown the most promising characteristics and was least sensitive to sample preparation (6% Dapp change during stabilization stage). Owing to thermal sensitivity of Dapp (∼3%/°C), typical of water-based phantoms (30), this iDKI phantom should be best used with temperature control or monitoring. In contrast, moderately viscous CA-CTAB (V7) phantom has shown thermally stable parameters, but large (22%) change in Dapp during initial stabilization period, as well as limited Kapp precision (9%, likely owing to migrating in-volume microbubbles). The observed field dependence of its Kapp might be related to chemical properties of the material; however, this would require further investigation with improved precision. More viscous lamellar DEC-CTAB (V1) material exhibited moderate (9%–18%) kurtosis parameter changes during stabilization and moderate thermal sensitivity (10%), but was sensitive to prolonged storage and thermal equilibrium conditions, likely due to lower thermal conductivity. The stable kurtosis parameter values were not achieved for PL161 (V6) sample and continually changed over the course of the study, reflecting poor temporal stability of this material.
A limitation of this study was that all phantoms shared among sites were prepared from a single batch of (four families of) materials; repeatability of the batch preparation procedure itself was not evaluated. Temperature was not consistently monitored during scanning; such monitoring should be implemented in future multicenter studies, for example, by including an in situ thermometer. The phantom T1 and T2 relaxation properties were not studied, and likely do not match in vivo tissue characteristics. However, having longer-than-tissue T2 relaxation times could be desirable for intended use of the kurtosis phantoms to increase the range of accessible b-values for DKI protocol optimization. Furthermore, adjustment of relaxation properties for vesicular phase (predominantly water) materials by adding relaxivity agents should be possible without substantial interference with diffusion characteristics.
Overall, the observed sensitivity of apparent phantom diffusion to temperature (2%–3%/°C) was similar to that of free water diffusion (30) and markedly higher than that of apparent kurtosis, consistent with the restricted diffusion origin of the latter. All phantom materials were noted to undergo an initial parameter stabilization period of 3–4 weeks following preparation, coincidental with evident sample degassing. The parameter values for vesicular phase materials remained relatively stable after the stabilization period. The candidate materials based on the more viscous multilamellar vesicle phase exhibited either poor temporal stability (PL161: V6) or notable dependence on site storage and thermal equilibrium conditions (DEC-CTAB: V1), and hence are not recommended for product iDKI phantom manufacturing. The kurtosis parameter values of CA-CTAB (V7) vesicular material had limited precision (9%), likely owing to formation of in-volume gas microbubbles, but this material warranted further evaluation after improved preparation owing to its thermal stability.
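As a rough consistency check on the quoted thermal sensitivity, the Speedy–Angell relation used for temperature calibration can be differentiated with respect to temperature. The sketch below assumes the free-water form $D(T) = D_0 (T/T_S - 1)^{\gamma}$ with $T_S = 215.05$ K and $\gamma = 2.063$; it describes free water only and is not a model of the phantom materials.

```latex
\ln D = \ln D_0 + \gamma \ln\!\left(\frac{T}{T_S} - 1\right)
\quad\Rightarrow\quad
\frac{1}{D}\frac{dD}{dT} = \frac{\gamma}{T - T_S},
\qquad
\left.\frac{1}{D}\frac{dD}{dT}\right|_{T = 293.15\,\mathrm{K}}
= \frac{2.063}{293.15 - 215.05}\,\mathrm{K^{-1}}
\approx 0.026\,\mathrm{K^{-1}}
```

Under these assumptions, the relative change is about 2.6% per °C near 20°C, which falls within the 2%–3%/°C range reported for the phantom materials.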
These observations suggested that sample degassing (eg, by centrifuging) during preparation should be attempted to improve precision and shorten stabilization period, preferably down to <1 week. The future studies should also monitor DKI parameter changes for up to a month for several material batches to evaluate reproducibility of phantom preparation and stabilization time. For use of temperature-sensitive phantoms, temperature monitoring (with DKI parameter calibration) is also recommended for multisite reproducibility studies at ambient temperature. Temperature monitoring could be implemented using an in situ thermometer or a calibrated internal standard, and it would be preferred to temperature control (eg, with ice-water bath) to avoid kurtosis phantom material phase transition (to gel) at lower temperatures.
The present multisite repeatability study has identified the liquid crystal materials based on vesicular phase as the best candidates for quantitative iDKI phantom production. Independent of chemical design, the preparation procedure for iDKI phantoms could be improved by including a degassing step to enhance repeatability and reduce the stabilization period of diffusion kurtosis characteristics. The most promising iDKI phantom design recommended for multisite trials is based on CA-BTAC (V2) vesicular suspension that allowed easy preparation, temporal stability, and independence of storage. Before utilization in multisite studies, this phantom would require temperature calibration and monitoring owing to observed thermal sensitivity of diffusion (similar to other water-based phantoms). Another iDKI phantom design (based on CA-CTAB: V7), with desirable thermal stability, needs to be studied after improved preparation to enhance precision, and with longer thermal equilibration allowed before scanning, to ensure reproducibility for adoption in future longitudinal multicenter clinical trials.