Currently, many clinicians continue to recommend an aggressive initial management approach for all but the lowest-risk patients with papillary thyroid carcinoma (PTC), which usually includes thyroid surgery and adjuvant radioactive iodine therapy (1). However, a more stratified, risk-adapted initial management approach has been strongly recommended in the recent American Thyroid Association thyroid cancer clinical practice guidelines (2). Either a limited thyroid surgery option (thyroid lobectomy without adjuvant therapy) or an active surveillance management approach (serial observation with neck ultrasonography [US], with surgical intervention deferred until documented disease progression) may be a less drastic option for patients with intrathyroidal papillary thyroid cancers thought to be at low risk for disease-specific mortality and recurrence. These treatment options are offered on the basis of an abundance of data showing excellent clinical outcomes following either thyroid lobectomy or active surveillance in properly selected low-risk PTC patients with intrathyroidal disease (1–3). Studies have shown that patients with papillary microcarcinomas (tumor size, <1 cm) that are well confined to the thyroid, without extrathyroidal extension or lymph node metastases, are good candidates for active surveillance (2). The cumulative risks of extrathyroidal extension and lymph node metastases both increase linearly as the primary tumor size increases from 1 to 2 cm (4). It is therefore even more important to ensure that these larger thyroid cancers are confined to the thyroid before considering an observation management approach. US alone is an adequate study for selecting PTCs with smaller tumors (<1 cm) for active surveillance (5–8).
However, for those PTCs with larger tumors (>1 cm up to 2 cm), US has suboptimal sensitivity and specificity in the detection of extrathyroidal extension (9, 10) and cannot reliably detect cervical lymph node metastases deep to the intact thyroid gland or in the infraclavicular, retropharyngeal, and parapharyngeal regions (1). Therefore, some experts have suggested that additional imaging methods be used to verify the absence of disease outside the thyroid when considering a conservative management approach in larger tumors.
Quantitative magnetic resonance imaging (qMRI) is a noninvasive technique that provides images of high spatial resolution with excellent tissue contrast. Quantitative diffusion-weighted imaging (DWI) measures the Brownian motion of water molecules in tumor tissue, which reflects cellular organization and membrane integrity (11). DWI has shown promise in the detection, staging, prognosis, and monitoring of thyroid cancers (12–19). The quantitative apparent diffusion coefficient (ADC) metric, derived from monoexponential modeling of DWI data under the assumption of Gaussian diffusion and using ≥2 b-values (ie, diffusion-weighting factors) (11), reflects tumor cellularity. Recently, ADC has been shown to be clinically relevant for assessing extrathyroidal extension in discernible intrathyroidal papillary microcarcinomas (tumor size, <1 cm); this aggressive tumor feature could previously be identified only at surgery (20).
Le Bihan et al. developed the intravoxel incoherent motion (IVIM) model to describe diffusion in the capillary and tissue compartments separately using multi-b-value DWI data sets (21, 22). IVIM is a biexponential model based on a Gaussian distribution, and it provides estimates of the pseudo-diffusion coefficient (D*), the perfusion fraction (f) within the capillary network, and the true diffusion coefficient of the tissue (D) (22). Recent studies have shown the utility of IVIM-DWI in clinical oncology (23–31).
Diffusion in biological tissue is hindered and complex and is therefore non-Gaussian (NG) in nature, which is readily observable via noninvasive imaging at high b-values (32, 33). Using multi-b-value DWI data, NG models [ie, diffusional kurtosis (34, 35) and an extension of biexponential IVIM with kurtosis, called NG-IVIM (22, 36)] have been developed to account for hindered and restricted diffusion in tumor tissue. The dimensionless imaging metric K characterizes NG diffusion behavior in tissue microstructure. The metric K, obtained from both diffusional kurtosis and NG-IVIM models, has been shown to be feasible for quantifying tissue microstructure in head and neck (HN) cancer (35, 36). Given the known microstructural complexity in PTC, we hypothesized that NG-IVIM may have greater utility than Gaussian models in risk stratification for active surveillance candidates. The purpose of this study was to use NG-IVIM to identify, a priori, aggressive histological features that would preclude an active surveillance management approach in patients with PTC with a tumor diameter of 1–2 cm as measured by US.
Materials and Methods
This clinical study was approved by our institutional review board and was compliant with the Health Insurance Portability and Accountability Act. In total, 24 patients (age, 27–78 years; male/female, 8/16) were enrolled in this prospective clinical trial before surgery. All patients provided written informed consent.
DWI Data Acquisition
MRI examinations were performed on a 3-Tesla GE scanner (General Electric, Milwaukee, WI), with a neurovascular phased-array coil and consisted of standard multiplanar (sagittal, axial, coronal) T1- and T2-weighted imaging scans followed by multi-b-value DWI scans. The T1- and T2-weighted MRI scans covered the whole thyroid gland with a field of view (FOV) of 20–24 cm, slice thickness of 5 mm, and acquisition matrix of 256 × 256. The repetition time (TR)/echo time (TE) for T1-weighted scans were 500 milliseconds (ms)/15 ms; and TR/TE for T2-weighted scans were 4000 ms/80 ms.
Multi-b-value DWI images were acquired using a single-shot spin-echo echo planar imaging (SS-SE-EPI) sequence with TR = 4000 ms, TE = minimum (100–110 ms), number of excitations (NEX) = 4, slice thickness = 5 mm, gap = 0 mm, FOV = 20–24 cm, and an acquisition matrix of 128 × 128, which was zero-filled and reconstructed to 256 × 256 pixels. Ten b-values were used: 0, 20, 50, 80, 200, 300, 500, 800, 1000, and 1500 s/mm2. Images were acquired using 3 orthogonal diffusion gradient directions. The minimum TE varied between patients because of slight differences in the obliquity of the prescription. A calibration scan was performed before multi-b-value acquisition to reduce Nyquist (N/2) ghosting artifacts (20, 37). Fat suppression, shimming, and parallel imaging (acceleration factor = 2) techniques were used to reduce imaging artifacts.
DWI Data Processing
The regions of interest (ROIs) for the PTCs were drawn on the DWI images (b = 0 s/mm2) by an experienced neuroradiologist (>10 years' experience) using ImageJ (38), in conjunction with the radiological, clinical, and pathological information. All ROIs avoided obvious cystic, hemorrhagic, or calcified portions. All data analyses were performed using an in-house-developed software package, MRI-QAMPER (Quantitative Analysis Multi-Parametric Evaluation Routines), implemented in MATLAB (The MathWorks, Natick, MA). Metric values were estimated on a voxel-by-voxel basis to generate parametric maps, and ROI-averaged values for each quantitative imaging metric were calculated.
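The voxel-wise apparent diffusion coefficient (ADC) map within the ROI was calculated from the multi-b-value DWI data, using a monoexponential model given by:

$$S(b) = S_0 \, e^{-b \cdot \mathrm{ADC}} \qquad (1)$$

where S(b) and S_0 are the signal intensities with and without diffusion weighting, respectively, and b is the diffusion-weighting factor (s/mm2).

The metrics D, D*, f, and K were estimated from the same data using the NG-IVIM model (36):

$$S(b) = S_0 \left[ f \, e^{-b D^*} + (1 - f) \, e^{-b D + \frac{1}{6} K b^2 D^2} \right] \qquad (2)$$

When K = 0, equation (2) is equivalent to the IVIM model equation (21).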
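The two signal models can be written out numerically. The following is a minimal sketch, assuming the standard monoexponential form S(b) = S0·exp(−b·ADC) and the NG-IVIM form S(b) = S0[f·exp(−b·D*) + (1 − f)·exp(−b·D + K·b²·D²/6)]. The function names and parameter values are illustrative only; they are not taken from MRI-QAMPER or from patient data. The sketch checks that NG-IVIM with K = 0 reduces to IVIM and recovers ADC from noise-free synthetic data with a log-linear least-squares fit, which is a simplification of the nonlinear, noise-rectified fitting used in the study.

```python
import math

# b-values (s/mm^2) listed in the acquisition protocol
B_VALUES = [0, 20, 50, 80, 200, 300, 500, 800, 1000, 1500]


def mono_signal(b, s0, adc):
    """Monoexponential (Gaussian) diffusion signal."""
    return s0 * math.exp(-b * adc)


def ivim_signal(b, s0, f, d_star, d):
    """Biexponential IVIM signal: perfusion plus tissue compartment."""
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))


def ng_ivim_signal(b, s0, f, d_star, d, k):
    """NG-IVIM signal: IVIM with a kurtosis term in the tissue compartment."""
    perfusion = f * math.exp(-b * d_star)
    tissue = (1.0 - f) * math.exp(-b * d + k * (b * d) ** 2 / 6.0)
    return s0 * (perfusion + tissue)


def fit_adc_loglinear(b_values, signals):
    """Estimate ADC as minus the least-squares slope of ln(S) versus b."""
    ys = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(ys) / n
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, ys))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx


# Hypothetical parameter values, chosen only for illustration
synthetic = [mono_signal(b, 1000.0, 1.3e-3) for b in B_VALUES]
adc_estimate = fit_adc_loglinear(B_VALUES, synthetic)
print(f"recovered ADC = {adc_estimate:.3e} mm^2/s")
```

A positive K raises the modeled signal at high b-values relative to IVIM, which is why the b = 1000 and 1500 s/mm2 acquisitions carry most of the information about K.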
Because multi-b-value DWI images are inherently noisy owing to thermal or physiological factors, a noise-rectified method was used for metric estimation, as detailed elsewhere (36). DWI data were fitted with a nonlinear least-squares method in MRI-QAMPER (24, 36).
Surgical papillary thyroid tumor specimens after radical thyroidectomy or lobectomy were collected under the supervision of an experienced (>10 years) pathologist. Paraffin-embedded tissue blocks were obtained for each surgically resected tumor specimen and stained with hematoxylin and eosin. The hematoxylin and eosin section of each papillary thyroid tumor was reviewed by the same excising pathologist, using established criteria for evaluating tumor aggressiveness (39, 40). The histopathological characteristics of tumor aggressiveness were evaluated individually using the following 6 features: tall cell variant, necrosis, vascular and/or tumor capsular invasion, extrathyroidal extension, regional metastases, and distant metastases. A tumor identified with the presence of any 1 of these features was considered to be aggressive.
US examinations were performed according to a standard protocol that includes grayscale and color Doppler US assessment of the thyroid bed and cervical lymph nodes in all neck compartments. US reports include information about size, location, and structure of thyroid nodules and cervical lymph nodes. Size was defined as the largest diameter among the 3 dimensions observed. The US studies were performed with Siemens Acuson S2000 or SEQUOIA (Siemens Medical Solutions, Mountain View, CA), or the GE Logiq 9 (GE Healthcare, Little Chalfont, UK) units, using 8- to 15-MHz linear transducers.
Quantitative imaging metrics ADC, D, f, D*, and K from NG-IVIM analysis and US measurement values were reported as ROI-based mean ± standard deviation (SD). To compare metric value differences among groups of PTCs with and without features of aggressiveness, a nonparametric Wilcoxon rank-sum test was used. A Spearman correlation analysis was performed between quantitative imaging metrics. The significance level was set at P ≤ .05.
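Finally, the relative percentage change (rc, %) in the mean value of each imaging metric was calculated as:

$$rc\,(\%) = \frac{\bar{X}_{\mathrm{aggressive}} - \bar{X}_{\mathrm{nonaggressive}}}{\bar{X}_{\mathrm{nonaggressive}}} \times 100$$

where \bar{X}_{\mathrm{aggressive}} and \bar{X}_{\mathrm{nonaggressive}} are the mean metric values for tumors with and without aggressive features, respectively.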
Receiver operating characteristic (ROC) curve analysis was performed for each metric to assess its capability to discriminate between PTC groups with and without aggressive features, resulting in area under the ROC curve (AUC) evaluation. Youden's index was used to estimate the optimal cutoff values for individual metrics (41, 42). Multivariate logistic regression analysis was performed on relevant metrics using a leave-one-out cross-validation (LOOCV) method for unbiased assessment of the modeling.
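The ROC step can be illustrated with a minimal sketch, assuming that a lower metric value (as for ADC or D) indicates an aggressive tumor. AUC is computed in its rank-based (Mann-Whitney) form, and the cutoff is the one maximizing Youden's index J = sensitivity + specificity − 1. The function names and the six data points are hypothetical and are not the study's software or patient values.

```python
def sens_spec(values, labels, cutoff, positive_if_low=True):
    """Sensitivity and specificity when calling a case positive at the cutoff."""
    tp = fn = tn = fp = 0
    for value, is_aggressive in zip(values, labels):
        predicted = (value <= cutoff) if positive_if_low else (value >= cutoff)
        if is_aggressive and predicted:
            tp += 1
        elif is_aggressive:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(values, labels, positive_if_low=True):
    """Return (J, cutoff, sensitivity, specificity) for the cutoff maximizing J."""
    best = None
    for cutoff in sorted(set(values)):
        se, sp = sens_spec(values, labels, cutoff, positive_if_low)
        j = se + sp - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, se, sp)
    return best


def auc(values, labels, positive_if_low=True):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [v for v, y in zip(values, labels) if y]
    negatives = [v for v, y in zip(values, labels) if not y]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p == n:
                wins += 0.5  # ties count as half
            elif (p < n) == positive_if_low:
                wins += 1.0
    return wins / (len(positives) * len(negatives))


# Hypothetical D values (x 10^-3 mm^2/s); True marks an aggressive tumor
d_values = [1.0, 1.2, 1.3, 1.6, 1.5, 2.0, 2.2]
aggressive = [True, True, True, True, False, False, False]

j, cutoff, se, sp = youden_cutoff(d_values, aggressive)
print(f"AUC = {auc(d_values, aggressive):.3f}, cutoff = {cutoff}, "
      f"sensitivity = {se:.2f}, specificity = {sp:.2f}")
```

In a leave-one-out setting, the cutoff or regression model would be re-estimated with each patient held out in turn, which is why cross-validated AUC values are typically lower than the univariate ones.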
Results
Patient characteristics are summarized in Table 1. Of the 24 patients, 13 patients were found to have locoregional metastases by preoperative US imaging. Based on surgical pathology analysis, all 24 patients had PTCs, including 6 patients with the tall cell variant, 1 patient with vascular and/or capsular invasion, 9 patients with extrathyroidal extension, and 16 patients with locoregional metastases. The mean size of the lesion based on US was 16 ± 6 mm (range, 6–26 mm).
Figure 1 shows a representative plot of the signal intensity decay curve as a function of the b-values (s/mm2) obtained from a patient with the aggressive feature of extrathyroidal extension (ETE) confirmed at surgical pathology.
Figures 2 and 3 show NG-IVIM metric maps overlaid on the DWI images from a representative patient with PTC with aggressive tumor features (female; 28 years; US tumor maximum diameter, 2.1 cm) and a representative patient with PTC without aggressive tumor features (female; 48 years; US tumor maximum diameter, 2.1 cm), respectively. It is interesting to note that the maximum tumor diameter on preoperative US was the same for both tumors shown in Figures 2 and 3. However, at surgical pathology, the tumor with aggressive features was found to be in the size range of >2 cm (Figure 2), whereas the tumor without aggressive features was in the size range of 1–2 cm (Figure 3).
Tumors with aggressive features (tall cell variant, necrosis, vascular and/or tumor capsular invasion, ETE, regional metastases, or distant metastases) had significantly lower ADC and D values and higher f values than tumors without aggressive features (P < .05) (Figure 2), whereas K and D* values were not significantly different (P > .05) for the 2 groups (Table 2).
Table 2. Quantitative imaging metrics (mean ± SD) grouped by US tumor size and presence of aggressive features
|US Tumor Size||<1 cm (n = 3)||<1 cm (n = 3)||1–2 cm (n = 14)||1–2 cm (n = 14)||>2 cm (n = 7)||>2 cm (n = 7)|
|Aggressive features on US||Yes (n = 2)||No (n = 1)||Yes (n = 6)||No (n = 8)||Yes (n = 5)||No (n = 2)|
|Aggressive features on pathology||Yes (n = 3)||No (n = 0)||Yes (n = 10)||No (n = 4)||Yes (n = 5)||No (n = 2)|
|ADC (× 10−3 mm2/s)||(1.2 ± 0.7)||–||(1.32 ± 0.27)a||(1.9 ± 0.5)a||(1.7 ± 0.4)||(2.03 ± 0.06)|
|D (× 10−3 mm2/s)||(1.4 ± 0.7)||–||(1.27 ± 0.25)a||(2.1 ± 0.6)a||(1.7 ± 0.6)||(2.20 ± 0.08)|
|D* (× 10−3 mm2/s)||(2.61 ± 0.62)||–||(2.84 ± 0.06)||(2.95 ± 0.06)||(2.7 ± 0.3)||(2.98 ± 0.02)|
|f||(0.17 ± 0.05)||–||(0.21 ± 0.06)||(0.16 ± 0.05)||(0.18 ± 0.05)||(0.10 ± 0.02)|
|K||(0.7 ± 0.6)||–||(0.70 ± 0.26)||(0.48 ± 0.29)||(0.71 ± 0.28)||(0.64 ± 0.15)|
a Significant difference between tumors with and without aggressive features on pathology (P < .05).
Out of the 24 patients, 14 patients were in the critical size range (1–2 cm), and ADC and D were significantly different (Table 2), differentiating between tumors with (n = 10) versus without (n = 4) aggressive features (P < .05). The ADC values were 1.3 ± 0.3 × 10−3 mm2/s vs 1.9 ± 0.5 × 10−3 mm2/s for tumors with and without aggressive features, respectively. The D values were 1.3 ± 0.3 × 10−3 mm2/s vs 2.1 ± 0.6 × 10−3 mm2/s for tumors with and without aggressive features, respectively. The K, D*, and f metrics were not significantly different in this cohort (P > .05).
Figure 4 shows boxplots comparing the mean values of ADC, D, and US-measured tumor size (mm) between tumors with and without aggressive features. The absolute relative percentage changes (rc, %) in ADC, D, K, D*, and f for tumors with aggressive features, relative to tumors without aggressive features, were 31%, 40%, 46%, 7%, and 31%, respectively.
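The rc arithmetic can be checked against the rounded group means reported for the 1–2 cm tumors in Table 2. This is a minimal sketch that assumes rc is the absolute difference between the two group means divided by the mean of the group without aggressive features; that denominator is an assumption, and the function name is hypothetical. Under this assumption the rounded means reproduce the reported values for ADC, D, K, and f. D* is omitted because the rounded table means give a smaller value than the reported 7%.

```python
def relative_change_percent(mean_aggressive, mean_nonaggressive):
    """Absolute relative change (%), using the nonaggressive mean as reference (assumed)."""
    return abs(mean_aggressive - mean_nonaggressive) / mean_nonaggressive * 100.0


# (aggressive mean, nonaggressive mean) for 1-2 cm tumors, as rounded in Table 2
TABLE2_MEANS_1_TO_2_CM = {
    "ADC": (1.32, 1.9),
    "D": (1.27, 2.1),
    "K": (0.70, 0.48),
    "f": (0.21, 0.16),
}

rc_percent = {
    name: round(relative_change_percent(aggressive, nonaggressive))
    for name, (aggressive, nonaggressive) in TABLE2_MEANS_1_TO_2_CM.items()
}
print(rc_percent)
```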
Figure 5 displays the scatter plot of the NG-IVIM estimates of the quantitative imaging metrics D and K. The Spearman rank-order correlation coefficient (ρ) was −0.46 (P < .05), indicating a significant negative correlation between D and K.
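For reference, the Spearman coefficient is the Pearson correlation of the rank-transformed values, with tied values given their average rank. The sketch below is a minimal pure-Python illustration; the function names and the five D and K pairs are hypothetical and are not the study data.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for position in range(i, j + 1):
            ranks[order[position]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank-order correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired ROI means: D (x 10^-3 mm^2/s) and K (dimensionless)
d_values = [1.0, 1.3, 1.5, 2.0, 2.2]
k_values = [0.9, 0.5, 0.8, 0.6, 0.3]
print(f"rho = {spearman_rho(d_values, k_values):.2f}")
```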
Figure 6A shows the estimated ROC curves for the quantitative imaging metrics ADC, D, and K. Using ROC analysis, the best cutoff values of ADC, D, and K that discriminate between PTCs with and without aggressive features were determined as follows: ADC = 1.79 × 10−3 mm2/s, D = 1.35 × 10−3 mm2/s, and K = 0.68. The sensitivity, specificity, and AUC obtained from the ROC curve were 100%, 75%, and 0.875, respectively, for ADC; 80%, 100%, and 0.95, respectively, for D; and 70%, 75%, and 0.725, respectively, for K. The AUC was highest for D, followed by ADC and K. Figure 6B shows the results of logistic regression combining 2 metrics (ADC and D) and 3 metrics (ADC, D, and K), based on the LOOCV method. The sensitivity, specificity, and AUC obtained from the LOOCV analysis were 90%, 75%, and 0.70, respectively, for the 2-metric model and 80%, 75%, and 0.65, respectively, for the 3-metric model.
Discussion
To the best of our knowledge, there are no published studies that have leveraged biexponential NG-IVIM modeling of multi-b-value DWI data sets to stratify PTCs into tumor groups with and without aggressive features. The quantitative imaging metrics ADC, D, and K exhibit promise as surrogate biomarkers for aggressiveness in patients with PTC, following appropriate validation.
Previously, Lu et al., using monoexponential modeling of quantitative DWI data, stratified PTCs into tumor groups with and without ETE, one of the multiple aggressive features, and obtained significantly lower mean ADC values for tumors with ETE than for those without [(1.53 ± 0.25) × 10−3 mm2/s vs (2.4 ± 0.7) × 10−3 mm2/s] (20). Previously, ETE was identified by surgery only (8, 45). Hao et al., also using DWI, stratified PTCs with and without ETE and showed significantly lower median ADC values for tumors with ETE [(1.41 ± 0.29) × 10−3 mm2/s vs (1.53 ± 0.29) × 10−3 mm2/s] (46). In the present study, the cutoff value of ADC to discriminate PTCs with and without aggressive features was 1.79 × 10−3 mm2/s, which is consistent with the cutoff values reported by Lu et al. and Hao et al. (1.85 × 10−3 mm2/s and 1.89 × 10−3 mm2/s, respectively) (20, 46).
Recently, biexponential modeling (IVIM) analysis using multi-b-value DWI data has shown clinical utility in several cancers, including prostate and head and neck cancers (30, 47–49). Valerio et al. have shown that ADC and D values are significantly lower in prostate cancer tissue than in healthy tissue (0.76 ± 0.27 × 10−3 [mm2/s] vs 0.99 ± 0.38 × 10−3 [mm2/s]) (47). In addition, Barbieri et al. found that ADC and D differ significantly between high- and low-grade prostate cancer lesions (0.76 ± 0.27 × 10−3 [mm2/s] vs 0.99 ± 0.38 × 10−3 [mm2/s]) (48). The clinical utility of multi-b-value DWI is being tested in cancers of the head and neck region, including the thyroid gland (30, 31, 49). Shen et al. investigated the feasibility of using IVIM to detect radiation-induced changes in normal-appearing parotid glands in patients with differentiated thyroid cancer after radioiodine therapy (49). In a small study of 8 healthy volunteers, Becker et al. used an IVIM-derived imaging metric to establish a comprehensive description of the tissue properties of healthy thyroid tissue (50). In patients with head and neck squamous cell carcinoma treated with radiotherapy, the IVIM imaging metric D was shown to differ significantly between complete responders (from 0.67 ± 0.17 × 10−3 mm2/s before treatment to 0.98 ± 0.28 × 10−3 mm2/s at intratreatment week 3) and noncomplete responders (from 0.59 ± 0.10 × 10−3 mm2/s to 0.72 ± 0.03 × 10−3 mm2/s over the same interval) (31). For tumors with hindered and restricted diffusion, NG-IVIM modeling of multi-b-value DWI data, as developed by Lu et al., has been shown to be a better-fitting model in the head and neck region (36). This model is used in the present study for the first time in the thyroid region.
The findings from the 14 patients with PTC with a tumor diameter of 1–2 cm (as measured by US) emphasize the role of NG-IVIM DWI in differentiating this subgroup. Preoperative US identified aggressive features in 6 of the 14 patients, whereas NG-IVIM DWI indicated aggressive features in 11 patients. By pathology, our reference standard, 10 patients had aggressive tumor features (Table 2). Therefore, NG-IVIM correctly identified all 10 patients with aggressive tumor features confirmed by pathology, whereas US correctly identified only 6 patients. US is the imaging modality most commonly used to identify and monitor locoregional disease progression and recurrence in thyroid cancer. However, US is unable to preoperatively identify features such as tall cell variant, necrosis, vascular and/or tumor capsular invasion, or distant metastases (5). NG-IVIM DWI correctly identified the absence of aggressive features in 3 of the 4 patients without them, whereas US classified 8 patients as nonaggressive and thus overestimated nonaggressiveness. These data suggest that US and MRI are complementary and should be used in combination for patients with tumor size in the range of 1–2 cm. This finding is of key clinical importance for treating physicians who are considering active surveillance for this patient population.
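The counts in this paragraph translate into sensitivity and specificity with simple arithmetic. The sketch below is a minimal illustration with a hypothetical helper name. The 2 × 2 counts are a reading of the text for the 14 tumors of 1–2 cm: NG-IVIM flagged 11 patients, including all 10 pathology-positive cases, and US flagged 6 patients, all assumed here to be pathology-positive.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity and specificity from 2 x 2 counts against pathology."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# NG-IVIM: 11 flagged, 10 true positives, so 1 false positive and 3 true negatives
ng_ivim = diagnostic_summary(tp=10, fp=1, fn=0, tn=3)

# US: 6 flagged (assumed all true positives), so 4 of 10 aggressive tumors missed
ultrasound = diagnostic_summary(tp=6, fp=0, fn=4, tn=4)

print("NG-IVIM:", ng_ivim)
print("US:", ultrasound)
```

Under this reading, NG-IVIM trades one false positive for complete detection of aggressive tumors, whereas US misses 4 of the 10, which is the complementarity the paragraph describes.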
Quantitative NG-IVIM DWI and its derived diffusion and perfusion imaging biomarkers have shown promise in this study of patients with PTC when grouped on the basis of tumor size from preoperative US measurements. In addition, for US-measured tumors in the size range of 1–2 cm, substantial differences in rc (%) were observed in the NG-IVIM-derived metrics between the 2 groups. D, the true diffusion coefficient and a surrogate biomarker of tumor cellularity, changed by 40%, whereas K, considered an index of tissue microstructure related to hindered and restricted diffusion, changed by 46% for tumors with aggressive features compared with tumors without aggressive features. The rc (%) values for f and D* were 31% and 7%, respectively; these imaging metrics remain exploratory in nature because their biological meaning has yet to be fully understood. In the present study, K was not necessarily independent of D for all tumors, but the weak correlation between these 2 quantitative imaging metrics suggests that K might provide additional information related to tissue microstructure. Similar results have been reported previously using NG analysis of DWI in head and neck squamous cell carcinoma (51).
In the present study, for all US-based tumor sizes, the univariate analysis showed the most favorable predictive power with D (AUC = 0.95). However, the AUC was lower for both combinations of the metrics in the cross-validated multivariate analysis, implying that cross-validation is necessary when building the predictive model to obtain a more realistic and unbiased assessment. The decrease in AUC between the 2- and 3-metric models is due to the metric K, which may have discriminatory power only for US-based tumor sizes in the range of 1–2 cm. For differentiation between the 2 PTC groups, the 2-metric model appears to be the model of choice.
These findings indicate that the quantitative imaging metrics derived from NG-IVIM modeling can provide important risk stratification information and additional insights into potential tumor behavior that cannot be gained from US evaluation alone. As consideration is being given to extend active surveillance to tumors larger than 1 cm, it is increasingly important to develop additional noninvasive tools to help clinicians risk-stratify these slightly larger tumors.
There are several known limitations in this study. First, further investigation is needed in cohorts with tumor diameters >2 cm and <1 cm. Although no active surveillance is needed for tumors that are >2 cm, it is important to identify aggressive features in tumors that are <1 cm, as has been shown by Lu et al. for papillary microcarcinomas with ETE features (20). Second, a validation study with a larger cohort of patients with PTC is necessary to confirm our initial findings for use in clinical trials. Finally, DWI acquisition using single-shot EPI suffers from susceptibility artifacts owing to voluntary and involuntary bulk motion in the thyroid region (52). Modified sequences, such as reduced field of view, can help obtain images with fewer distortions (53).
In conclusion, quantitative imaging biomarkers (ADC, D, and K) derived from NG-IVIM DWI could be used to noninvasively identify tumors with aggressive histological features to preclude an active surveillance management approach in patients with PTC with primary tumor diameters ranging from 1 to 2 cm.