By Joëlle Taieb, Bruno Mathian, Françoise Millot, Marie-Claude Patricot, Elisabeth Mathieu, Nicole Queyrel, Isabelle Lacroix, Claude Somma-Delpero, Philippe Boudou


Background: Commercially available testosterone immunoassays give divergent results, especially at the low concentrations seen in women. We compared immunoassays and a nonimmunochemical method that could quantify low testosterone concentrations.


Methods: We measured serum testosterone in 50 men, 55 women, and 11 children with use of eight nonisotopic immunoassays, two isotopic immunoassays, and isotope-dilution gas chromatography–mass spectrometry (ID/GC-MS).


Results: Compared with ID/GC-MS, 7 of the 10 immunoassays tested overestimated testosterone concentrations in samples from women; mean immunoassay results were 46% above those obtained by ID/GC-MS. The immunoassays underestimated testosterone concentrations in samples from men, giving mean results 12% below those obtained by ID/GC-MS. In women, at concentrations of 0.6–7.2 nmol/L, 3 of the 10 immunoassays gave positive mean differences >2.0 nmol/L (range, −0.7 to 3.3 nmol/L) compared with ID/GC-MS; in men at concentrations of 8.2–58 nmol/L, 3 of the 10 immunoassays tested gave mean differences >4.0 nmol/L (range, −4.8 to 2.6 nmol/L).


Conclusion: None of the immunoassays tested was sufficiently reliable for the investigation of sera from children and women, in whom very low (0.17 nmol/L) and low (<1.7 nmol/L) testosterone concentrations are expected.


The measurement of circulating testosterone is clinically relevant in the investigation of androgen disorders in humans (1). In men, testosterone analysis is used to evaluate the endocrine activity of the testis. In association with gonadotropin determination, the circulating testosterone concentration provides information concerning the origin of testicular dysfunction (2)(3). Measurement of testosterone is also recommended in the monitoring of patients with metastatic prostate cancer treated with gonadotropin-releasing hormone analogs and/or antiandrogen therapy, as a means of checking that the testosterone concentration has decreased to the castration range (4)(5)(6)(7). In women, testosterone is frequently measured as part of the investigation of alopecia, acne, and/or hirsutism (8)(9), although testosterone concentrations remain within the reference interval in 50% of cases (10). Simple testosterone measurements have been shown to have predictive value for the detection of androgen-secreting tumors of ovarian origin (11) and have also been used to define the minimum drug dose required to abolish androgen secretion in hyperandrogenic women (12). This steroid has therefore been closely monitored in the follow-up of patients with congenital adrenal hyperplasia resulting from 21-hydroxylase deficiency (13)(14).


In children, the circulating testosterone concentration is determined principally for the diagnosis, treatment, and gender assignment of newborns or young infants with ambiguous genitalia (15). The testosterone concentration can be used to determine pubertal stage in association with physical examination and gonadotropin determination and in the follow-up of children with precocious (16)(17)(18)(19) or delayed puberty (20)(21).


In all of these situations, a highly sensitive and very accurate testosterone assay is required. Traditional testosterone immunoassays with tritiated testosterone involve extraction and chromatography before testing to eliminate proteins (e.g., binding globulins) and structurally related molecules, which may interfere with the results of the immunoassay (22). These procedures are mostly carried out in referral laboratories. Most of the conventional testosterone assays developed to date are based on iodinated, labeled testosterone and on manual or partially automated procedures, and they are performed with or without serum extraction. In the last 10 years, automated analyzers have become available in the field of endocrinology. Immunoassays based on competition are performed on nonextracted samples with use of nonisotopic labeled testosterone. These analyzers are easy to use, with short cycle times, and are widely used in clinical laboratories. We evaluated the accuracy of these testosterone assays by screening 116 serum samples from men, women, and children with eight direct nonisotopic and two direct isotopic immunoassays (RIA). The testosterone concentrations obtained in each immunoassay were compared with a reference method based on isotope-dilution gas chromatography–mass spectrometry (ID/GC-MS) (23)(24).


Materials and Methods

Blood samples were taken from patients of the Endocrinology Department of Saint-Louis Hospital (Paris, France). Consecutive untreated individuals were studied during 1 year. Individuals were excluded if they had undergone a previous endocrine investigation; were taking any treatment that could interfere with hormonal investigation; or had hepatic, renal, or cardiac injuries. The main reasons for consulting were alopecia, impaired libido, and sexual dysfunction in the men; acne and/or alopecia and/or hirsutism with or without menstrual cycle abnormalities in the women; and delayed puberty or isolated precocious puberty in the children. A testosterone measurement [extraction-partition celite chromatography/RIA (25)], which was part of the clinical investigation, had been included in the endocrine assessment and served as a basis to group individuals into equal distributions between men and women with low, medium, and high testosterone concentrations that covered the wide range of concentrations found in clinical practice. Patients with sufficient serum volumes to be tested by all immunoassays and ID/GC-MS were included in the study. We constructed a bank of 116 serum samples comprising 50 sera from men, 55 sera from women, and 11 sera from children. Individuals were allocated to three groups: adult men [mean (SD) age, 44.7 (15.2) years; range, 19–71 years], adult women [mean (SD) age, 27.2 (6.6) years; range, 15–45 years], and children [six boys 6–12 years of age and five girls 2–10 years of age; mean (SD) age, 8.45 (2.90) years]. Blood samples were collected in glass Vacutainer® tubes (Becton Dickinson) without anticoagulant or preservative, and serum was obtained after centrifugation. All samples were dispensed into 300-μL (or 2-mL for ID/GC-MS) aliquots in screw-cap Eppendorf® tubes (Becton Dickinson) and stored at −80 °C. Samples were transported on dry ice to the various participating laboratories. Serum samples were stored at −20 °C in the laboratories, thawed at room temperature before use, and tested within 8 h of thawing. None of the samples was frozen and thawed repeatedly.

All investigations conformed to the ethical standards of the Helsinki Declaration.


immunologic methods
The study began simultaneously in all laboratories in 2001 (our data reflect the performance of the methods at the time of the exercise). Ten manufacturers agreed to participate in the evaluation, providing assay reagents and, when necessary, installing analyzers. The study included eight automated multianalyte analyzers: Architect i2000 (Abbott Laboratories), ACS-180, Immuno-1 (Bayer Diagnostics), Vidas (Bio-Mérieux), Immulite 2000 (Diagnostics Products Corp.), Vitros ECi (Ortho Clinical Diagnostics), AutoDelfia (Perkin-Elmer), and Elecsys 2010 (Roche-Boehringer-Mannheim). We also tested two direct isotopic immunoassays (RIA): Immunotech (Beckman-Coulter), the most commonly used in France (according to data assessed by the French National Interlaboratory Quality Standard Control), and Coat-A-Count DPC (Dade Behring), one of the most frequently referenced assays in the literature [see Refs. (26)(27)(28)(29)(30)(31)(32)]. The main analytical characteristics of these immunoassays are summarized in Table 1.


Table 1.
Main analytical characteristics, as defined by manufacturers, of the eight nonisotopic and two isotopic testosterone immunoassays tested in the present study.


Samples were assayed either singly for nonisotopic methods according to the manufacturer’s instructions or in duplicate for isotopic and ID/GC-MS methods. The mean testosterone concentration was considered for comparison when isotopic and ID/GC-MS methods were compared. All samples were assayed in each of the laboratories, using assay reagents from the same lot.


ID/GC-MS method

Internal standard


Labeled testosterone ([3,4-13C2]testosterone; Commissariat à l’Energie Atomique, Gif sur Yvette, France) was used as the internal standard. The purity and 13C enrichment of this standard were confirmed by ID/GC-MS and nuclear magnetic resonance spectroscopy (33).


Sample preparation


We added 1–10 ng of labeled testosterone ([3,4-13C2]testosterone) to 1.0 mL of serum from men or to 2.0 mL of serum from women and children. The samples were mixed and incubated for 15 min at 37 °C. The sample volume and the amount of internal standard were adjusted according to the testosterone concentration measured previously by extraction-partition celite chromatography/RIA (25) to obtain a ratio of unlabeled/labeled testosterone between 0.5 and 1.0. Samples were mixed, incubated for 15 min at 37 °C, and extracted with 5 mL of diethyl ether by rapid vortex-mixing for 2 min. After centrifugation (15 min at 3000g) and freezing of the aqueous layer, the organic layer was decanted, evaporated to dryness under nitrogen at 45 °C, and extracted. The extract was dissolved in 200 μL of 50 mL/L ethyl acetate in isooctane and purified on a column (6 mm internal diameter) containing 0.45 g of ethylene-glycol-impregnated celite (Prolabo). After the reconstituted extract was loaded on the column, the column was washed with 8 mL of isooctane, and the eluent was discarded. Testosterone was then eluted with 3 mL of 150 mL/L ethyl acetate in isooctane. The solvent was evaporated under nitrogen (45 °C), and the dry residue was dissolved in 50 μL of heptafluorobutyric anhydride and 200 μL of anhydrous acetonitrile. The resulting solution was incubated for 30 min at 37 °C (derivatization). The solvent was evaporated under nitrogen, the residue was dissolved in 20 μL of hexane, and 2–4 μL of the solution was injected into the ID/GC-MS instrument.
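The adjustment of the internal standard to reach an unlabeled/labeled ratio between 0.5 and 1.0 can be illustrated with a short calculation. The following is a minimal sketch, not the laboratory's procedure: it assumes the ratio is expressed as a mass ratio (ng/ng), uses the molar mass of unlabeled testosterone (288.42 g/mol) for the conversion from nmol/L, and all function names are hypothetical.

```python
# Molar mass of unlabeled testosterone (C19H28O2), in g/mol.
TESTOSTERONE_G_PER_MOL = 288.42


def unlabeled_ng(conc_nmol_per_l, serum_ml):
    """Mass (ng) of endogenous testosterone in a serum aliquot.

    nmol/L multiplied by g/mol gives ng/L; dividing by 1000 gives ng/mL.
    """
    return conc_nmol_per_l * TESTOSTERONE_G_PER_MOL / 1000.0 * serum_ml


def spike_ratio(conc_nmol_per_l, serum_ml, labeled_ng):
    """Unlabeled/labeled mass ratio for a given internal-standard amount."""
    return unlabeled_ng(conc_nmol_per_l, serum_ml) / labeled_ng


def labeled_ng_bounds(conc_nmol_per_l, serum_ml, low=0.5, high=1.0):
    """Smallest and largest internal-standard mass (ng) that keep the
    unlabeled/labeled ratio inside [low, high]."""
    mass = unlabeled_ng(conc_nmol_per_l, serum_ml)
    return mass / high, mass / low


if __name__ == "__main__":
    # Hypothetical man: 20 nmol/L by the preliminary RIA, 1.0 mL of serum.
    print(labeled_ng_bounds(20.0, 1.0))
    # Hypothetical woman: 1.7 nmol/L, 2.0 mL of serum, 1 ng of standard.
    print(spike_ratio(1.7, 2.0, 1.0))
```

Under these assumptions, a serum from a man at 20 nmol/L contains about 5.8 ng of testosterone per mL, so roughly 6 to 11.5 ng of labeled testosterone would keep the ratio in the target window, which is in line with the 1–10 ng range quoted above.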

Testosterone was quantified by monitoring the molecular ions of the derivatives of unlabeled testosterone (m/z = 680) and of the [3,4-13C2]testosterone internal standard (m/z = 682). The calibration curve was obtained from the peak-area ratios of the m/z 680/682 ions from three solutions containing 0.0, 2.5, and 5.0 ng of unlabeled testosterone and 5.0 ng of labeled testosterone, corresponding to unlabeled/labeled testosterone ratios of 0.0, 0.5, and 1.0. We corrected for the overlap between the unlabeled testosterone peak area at m/z 682 and the labeled testosterone peak area at m/z 680 on calibration curves and for samples, as described previously (34)(35). Testosterone calibrators and patients’ samples were analyzed in duplicate, and the mean value was used for data analysis.
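The quantification step can be sketched as a three-point calibration of the m/z 680/682 peak-area ratio against the unlabeled/labeled amount ratio, followed by back-calculation of the sample concentration. This is an illustrative sketch only: the peak-area ratios are invented, an ordinary least-squares line is assumed, the isotopic overlap correction of Refs. (34)(35) is not reproduced, and the function names are hypothetical.

```python
# Molar mass of unlabeled testosterone, in g/mol.
TESTOSTERONE_G_PER_MOL = 288.42


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_nmol_per_l(area_ratio, slope, intercept, labeled_ng, serum_ml):
    """Convert a sample's 680/682 peak-area ratio to nmol/L.

    The area ratio is read back through the calibration line to an
    unlabeled/labeled amount ratio, scaled by the internal-standard mass,
    and converted from ng/mL to nmol/L.
    """
    amount_ratio = (area_ratio - intercept) / slope
    unlabeled_ng = amount_ratio * labeled_ng
    return unlabeled_ng / serum_ml * 1000.0 / TESTOSTERONE_G_PER_MOL


if __name__ == "__main__":
    # Calibrators: 0.0, 2.5, and 5.0 ng unlabeled with 5.0 ng labeled.
    amount_ratios = [0.0, 0.5, 1.0]
    area_ratios = [0.01, 0.51, 1.01]  # invented values, for illustration only
    slope, intercept = fit_line(amount_ratios, area_ratios)
    print(concentration_nmol_per_l(0.76, slope, intercept, 5.0, 1.0))
```

Because the internal standard is added before extraction, losses during extraction, chromatography, and derivatization affect both isotopologues equally, which is why only the ratio enters the calculation.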

A plot from the weighted Deming regression analysis, performed in samples from women, between extraction-partition celite chromatography/RIA and ID/GC-MS is presented in Fig. 1. Five samples were discrepant between the two methods. The results obtained with extraction-partition celite chromatography/RIA vs those obtained with ID/GC-MS for these five samples were 3.12 vs 2.54, 2.77 vs 2.11, 3.29 vs 2.52, 3.12 vs 2.49, and 2.94 vs 2.05 nmol/L.



Figure 1.
Plot from the results of weighted Deming regression analysis of the testosterone results for comparison between ID/GC-MS and extraction-chromatography/RIA for samples from women (n = 55).
Equation for the regression line: y = 1.162 (0.081)x − 0.134 (0.092) nmol/L; r = 0.89. The SE of the slope and intercept are in parentheses. If the highest value is omitted, the equation becomes: y = 1.17x − 0.15 nmol/L (r = 0.75).


ID/GC-MS procedure.


ID/GC-MS analysis was performed with a Carlo Erba chromatograph coupled to a quadrupole mass spectrometer (QMD 1000; Thermofinnigan France) equipped with a 30-m fused-silica capillary column (DB5; 0.32 mm i.d.; 0.25 μm film thickness; J&W Scientific) with helium as the carrier gas (flow rate, 1.2 mL/min). We injected 2–4 μL of sample into the apparatus at 275 °C, using a mobile needle injector. Column temperature was maintained at 220 °C for 13 min and was then increased to 265 °C at a rate of 10 °C/min. The ion source was operated in the electron impact mode with 70 eV electron energy, and the photomultiplier was set to 700 V. Molecular ions of the derivatives of testosterone (m/z 680) and [13C]testosterone (m/z 682) were monitored.


Analytical characteristics of ID/GC-MS.


The detection limit was <0.15 nmol/L (calculated on the basis of a signal-to-noise ratio >6). Chromatograms of samples from a child (testosterone concentration, 0.17 nmol/L) and a woman (testosterone concentration, 0.80 nmol/L) are depicted in Fig. 2. The interrun CVs were 15% at a mean concentration of 0.20 nmol/L (n = 8), 3.9% at a mean concentration of 0.74 nmol/L (n = 7), 3.6% at a mean concentration of 2.67 nmol/L (n = 6), 2.8% at a mean concentration of 17.59 nmol/L (n = 14), and 2.1% at a mean concentration of 36.65 nmol/L (n = 19). The accuracy was determined by a dilution test and by calculating analytical recoveries of testosterone added to serum aliquots from two pools. All of these specimens were analyzed in triplicate. The dilution test (1.9–34 nmol/L) yielded the following linear regression equation: y (experimental values) = 0.997x (expected values) − 0.113 nmol/L (r² = 0.9997). The recoveries (%) of testosterone added to aliquots from two serum pools (0.346 and 1.250 nmol/L, respectively) are shown in Table 2.
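For readers who wish to reproduce this kind of summary, the interrun CV and the analytical recovery can be computed as sketched below. The formulas are the usual definitions (CV = 100 × SD/mean; recovery = 100 × (measured − baseline)/added) and are assumed here rather than taken from the text; the replicate values are invented and the function names are hypothetical.

```python
import statistics


def cv_percent(values):
    """Coefficient of variation (%), using the sample SD (n - 1)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def recovery_percent(measured, baseline, added):
    """Analytical recovery (%) of testosterone added to a serum pool.

    measured: concentration found in the spiked aliquot (nmol/L)
    baseline: concentration of the unspiked pool (nmol/L)
    added:    concentration of testosterone added (nmol/L)
    """
    return 100.0 * (measured - baseline) / added


if __name__ == "__main__":
    # Invented interrun replicates around 0.74 nmol/L.
    print(cv_percent([0.70, 0.74, 0.78]))
    # Invented spike of 1.0 nmol/L into the 0.346 nmol/L pool.
    print(recovery_percent(1.316, 0.346, 1.0))
```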


Figure 2.
Testosterone chromatograms obtained by ID/GC-MS.
(A), sample from a child (0.17 nmol/L testosterone); (B), sample from a woman (0.80 nmol/L testosterone). The graphs represent the m/z ions corresponding to the derivative molecular ions of unlabeled (m/z = 680) and 13C-labeled testosterone (m/z = 682) internal standard. The testosterone peak is around 8.50 min.


Table 2.
Recoveries of testosterone added to aliquots from two serum pools tested at 0.346 and 1.250 nmol/L, respectively.


statistical analysis


The testosterone concentrations measured by each assay and ID/GC-MS for samples from men and women were compared with use of the nonparametric Wilcoxon matched-pairs signed-ranks test. The weighted Deming regression analysis (reference method as the x values, and the assay tested as the y values) was performed using samples taken from men (n = 50) and women (n = 55) separately (36). Bland–Altman analysis (37), based on evaluation of the difference between the methods (testosterone concentration measured by assay tested − testosterone concentration measured by ID/GC-MS) against their mean [(testosterone concentration measured by ID/GC-MS + testosterone concentration measured by assay tested)/2] was used to assess the between-assay difference. The limits of agreement (estimated by the mean difference ± 2 SD of the differences) and the percentages of outliers were also calculated. The ratio method (result from assay tested/result from ID/GC-MS vs result from ID/GC-MS) (38) was performed to assess the distribution of the values obtained by each assay compared with ID/GC-MS and the magnitude of the discrepancy.
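The regression step can be illustrated with a simplified sketch. Note that this is simple (unweighted) Deming regression with a fixed ratio of error variances, not the weighted procedure of Ref. (36), which additionally reweights the points iteratively. The error-variance ratio of 1.0, the data, and the function name are assumptions made for illustration.

```python
import math


def deming_fit(x, y, error_variance_ratio=1.0):
    """Simple (unweighted) Deming regression of y on x.

    error_variance_ratio is var(measurement error in y) divided by
    var(measurement error in x); 1.0 is an assumption for illustration.
    The fit is undefined when x and y have zero covariance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    d = error_variance_ratio
    slope = (syy - d * sxx
             + math.sqrt((syy - d * sxx) ** 2 + 4.0 * d * sxy ** 2)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Invented paired results (nmol/L): reference method as x, assay as y.
    reference = [0.8, 1.5, 2.2, 3.1, 4.0]
    assay = [1.4, 2.0, 3.1, 3.9, 5.2]
    print(deming_fit(reference, assay))
```

Unlike ordinary least squares, this approach allows for measurement error in both methods, which is why it is preferred for method-comparison data.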


Results


We analyzed 116 serum samples, including 55 from women, 50 from men, and 11 from children. Samples with testosterone concentrations below the detection limit of the immunoassay tested (Architect i2000, n = 1; Vidas, n = 2; Immulite 2000, n = 2; Elecsys 2010, n = 2; Immunotech, n = 1; Coat-A-Count DPC, n = 4) and samples with insufficient serum volume when assayed (Architect i2000, n = 5; Vidas, n = 2) were not included in the data analysis. Testosterone data from the samples from children are individually shown for each immunoassay and ID/GC-MS (Table 3).


Table 3.
Individual testosterone concentrations, in samples from male (samples 1–6) and female (samples 7–11) children (age range, 2–12 years), as measured by ID/GC-MS method and 10 testosterone immunoassays.


The testosterone values [median, mean, SD, and concentration range (nmol/L)] obtained by ID/GC-MS and by the immunoassays tested for samples from women and men, together with the results of the statistical analysis of the data, are shown in Table 4. The range, median, mean, and SD were highly dispersed among immunoassays in both adult groups, and differences compared with ID/GC-MS were large. All of the methods but Vidas and Vitros ECi in women and Architect i2000, Immuno-1, and Coat-A-Count DPC in men showed significantly different median testosterone values compared with ID/GC-MS. It should be noted that this statistical analysis compared the differences between medians and did not address the scatter of the results between each immunoassay and ID/GC-MS. Equations from the weighted Deming regression analysis, performed in women and men separately, are presented in Table 5, and the plots are presented in Fig. 1 of the Data Supplement accompanying the online version of this article at http://www.clinchem.org/content/vol49/issue8/.


Table 4.
Testosterone values obtained by 10 testosterone immunoassays and ID/GC-MS for samples from men and women.1


Table 5.
Comparison between testosterone results obtained by ID/GC-MS and 10 immunoassays for samples from men (n = 50) and women (n = 55), using the weighted Deming regression analysis (36): equation for the regression line, SD of slope and intercept, and correlation coefficient.1


For samples from men, the slopes and intercepts ranged from 0.79 (Immulite 2000) to 1.10 (Immuno-1) and from −3.02 nmol/L (Elecsys 2010) to 1.98 nmol/L (AutoDelfia), respectively. All of the immunoassays but AutoDelfia (r = 0.86) were well correlated with ID/GC-MS (r = 0.92–0.97), although the agreement between testosterone concentrations measured by each immunoassay and ID/GC-MS was not as good, as shown by the line of best fit (y = x; Fig. 1 in the online Data Supplement). Some immunoassays (Vidas, Immulite 2000, Vitros ECi, Elecsys 2010, and RIA Immunotech) clearly underestimated testosterone concentrations over the concentration range. In the samples from women, all of the immunoassays were less well correlated with ID/GC-MS [r values ranged from 0.57 (Immulite 2000) to 0.89 (Coat-A-Count DPC)]. The slopes and intercepts ranged from 0.85 (Elecsys 2010) to 3.28 (AutoDelfia) and from −2.01 nmol/L (AutoDelfia) to 0.53 nmol/L (ACS-180), respectively. We observed a marked discrepancy and a wide dispersion of the results [according to the line of best fit (y = x); Fig. 1 in the online Data Supplement] for testosterone concentrations measured by the Immulite 2000 and AutoDelfia.

Using Bland–Altman analysis, we calculated mean differences and 2 SD of the differences for samples from men and women; the results are shown in Table 4. In samples from men, mean differences ranged from −4.79 nmol/L (Elecsys 2010) to 2.55 nmol/L (ACS-180). Compared with ID/GC-MS, 70% of the immunoassays tested underestimated testosterone concentrations. The highest magnitudes of disagreement were observed for the Vidas, Elecsys 2010, and RIA Immunotech (negative mean differences >4 nmol/L). In samples from women, the mean differences ranged from −0.69 nmol/L (Elecsys 2010) to 3.29 nmol/L (Immulite 2000). Compared with ID/GC-MS, 70% of the immunoassays tested overestimated testosterone concentrations. The highest magnitudes of disagreement were for the Immulite 2000, AutoDelfia, and ACS-180 (positive mean differences >2 nmol/L). When the limits of agreement calculated by the Bland–Altman analysis (mean difference ± 2 SD of the differences) were applied, the percentage of values falling outside these limits exceeded the 5% usually expected for 8 of 10 immunoassays in samples from men and for 5 of 10 immunoassays in samples from women (Table 4).
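The Bland–Altman quantities reported here (mean difference, limits of agreement at ± 2 SD, and percentage of values outside those limits) can be computed as in the following minimal sketch. The paired values are invented and the function name is hypothetical; the sample SD (n − 1) is assumed.

```python
import statistics


def bland_altman(assay, reference):
    """Bland-Altman summary for paired results (assay minus reference)."""
    diffs = [a - r for a, r in zip(assay, reference)]
    means = [(a + r) / 2.0 for a, r in zip(assay, reference)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    lower = mean_diff - 2.0 * sd_diff
    upper = mean_diff + 2.0 * sd_diff
    # Count the differences that fall outside the limits of agreement.
    n_outside = sum(1 for d in diffs if d < lower or d > upper)
    return {
        "means": means,
        "differences": diffs,
        "mean_difference": mean_diff,
        "limits_of_agreement": (lower, upper),
        "percent_outside": 100.0 * n_outside / len(diffs),
    }


if __name__ == "__main__":
    # Invented paired results (nmol/L).
    id_gc_ms = [1.0, 2.0, 3.0, 4.0]
    immunoassay = [1.2, 2.1, 3.4, 3.9]
    summary = bland_altman(immunoassay, id_gc_ms)
    print(summary["mean_difference"], summary["limits_of_agreement"])
```

A positive mean difference indicates that the immunoassay reads higher than ID/GC-MS on average, as was observed for most assays in samples from women.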

The results for samples from men and women analyzed by the ratio method were clearly separated and are shown in Fig. 3. The ACS-180 and AutoDelfia displayed a positive bias over the female and male testosterone concentration ranges tested. Conversely, Elecsys 2010 was the sole method that clearly underestimated testosterone concentrations over the female and male concentration ranges tested. The Immulite 2000 and RIA Immunotech overestimated results in the female concentration range tested and displayed a negative bias over the male concentration range tested, whereas the Architect i2000, Immuno-1, and Coat-A-Count DPC overestimated testosterone in the female concentration range and displayed results evenly distributed close to a ratio of 1.0 for male testosterone concentrations. The results for the female testosterone concentration range tested were widely dispersed, particularly for AutoDelfia and Immulite 2000.



Figure 3.
Testosterone concentrations in women (n = 55; ○) and men (n = 50; •) obtained by 10 immunoassays and ID/GC-MS method as evaluated by the ratio method (38).
The x axis represents the testosterone concentration measured by ID/GC-MS (nmol/L); the y axis represents the ratio testosterone concentration by immunoassay/testosterone concentration by ID/GC-MS. Each point represents an individual value. The vertical dotted line separates the testosterone concentrations obtained for individual women and men. Samples in which testosterone was below the detection limit defined by the manufacturer were excluded: Architect i2000, n = 1; Vidas, n = 2; Immulite 2000, n = 2; Elecsys 2010, n = 2; Immunotech, n = 1; Coat-A-Count DPC, n = 4. Five samples for the Architect i2000 and two for the Vidas were excluded because the volume of the serum sample was too small.


We investigated the ability of the immunoassays tested to adequately distinguish between hypo- and normogonadal men, using 9 of the 50 samples from men with testosterone concentrations of 8.33–11.97 nmol/L as measured by ID/GC-MS (the lower limit of the reference interval for this method was 10.40 nmol/L). When measured by ACS-180 (8.58–18.59 nmol/L) and AutoDelfia (7.63–15.43 nmol/L), 78% and 56%, respectively, of these samples were above the lower limit of the reference interval for ID/GC-MS. When the lower limits defined for the ACS-180 and AutoDelfia (8.40 and 8.70 nmol/L, respectively; Table 1) were used, 100% and 89%, respectively, of the hypogonadal men were classified as normogonadal. In addition, we evaluated whether the immunoassays tested could clearly distinguish women with testosterone concentrations within the reference interval for women from women with abnormal testosterone concentrations. The results for the 55 samples from women measured by ID/GC-MS were as follows (testosterone concentrations >2.55 nmol/L, the upper limit of the reference interval for our ID/GC-MS method, were considered abnormal): testosterone <2.06 nmol/L, n = 25; testosterone between 2.06 and 2.55 nmol/L, n = 17; and testosterone >2.55 nmol/L, n = 13. On the basis of these criteria, for the 42 samples with testosterone concentrations below the upper limit of the reference interval by ID/GC-MS, the immunoassays yielded abnormal results for 85.7% (Immulite 2000), 81.0% (AutoDelfia), 73.8% (ACS-180), 64.3% (Architect i2000), 40.5% (RIA Immunotech), 37.5% (Coat-A-Count DPC), 33.0% (Immuno-1), 19.0% (Vitros ECi), 14.0% (Vidas), and 7.0% (Elecsys 2010) of the samples. When we analyzed the data using the upper limit of the reference interval for each immunoassay, the percentage of samples with abnormal testosterone concentrations remained comparable to that observed when we applied the ID/GC-MS upper limit. This was also true for the Immulite 2000 (48%), AutoDelfia (69%), and Vidas (12.5%) assays, whose upper limits were higher than that of the ID/GC-MS method. Of the 13 samples in which ID/GC-MS measured abnormal testosterone concentrations, the immunoassays yielded results within the ID/GC-MS reference interval for 85.0% (Elecsys 2010), 54.0% (Vitros ECi), 38.5% (Vidas), 23.0% (RIA Immunotech), 23.0% (Immuno-1), 8.0% (Coat-A-Count DPC), 0.0% (ACS-180), 0.0% (Architect i2000), 0.0% (AutoDelfia), and 0.0% (Immulite 2000) of the samples. In addition, when the upper limits of the Vidas assay were used to analyze these 13 samples, the percentage of women with abnormal testosterone concentrations increased from 38.5% to 69.0%.
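The classification comparison for samples from women can be sketched as follows. The 2.55 nmol/L upper limit is the ID/GC-MS value given above; the paired results are invented, the function names are hypothetical, and the optional assay-specific limit mirrors the second analysis in which each immunoassay's own upper limit was applied.

```python
def percent(count, total):
    """Percentage of total, or NaN when the group is empty."""
    return 100.0 * count / total if total else float("nan")


def misclassification(reference, assay, reference_limit=2.55, assay_limit=None):
    """Compare assay classification with the reference classification.

    Returns two percentages:
      1. samples within the reference interval by the reference method
         that the assay reports as abnormal (above assay_limit);
      2. samples abnormal by the reference method that the assay reports
         as within the interval (at or below assay_limit).
    If assay_limit is None, the reference limit is applied to the assay.
    """
    if assay_limit is None:
        assay_limit = reference_limit
    pairs = list(zip(reference, assay))
    within = [a for r, a in pairs if r <= reference_limit]
    above = [a for r, a in pairs if r > reference_limit]
    flagged_abnormal = sum(1 for a in within if a > assay_limit)
    reported_within = sum(1 for a in above if a <= assay_limit)
    return percent(flagged_abnormal, len(within)), percent(reported_within, len(above))


if __name__ == "__main__":
    # Invented paired results (nmol/L).
    id_gc_ms = [1.0, 2.0, 2.4, 3.0, 3.5]
    immunoassay = [1.5, 2.8, 3.0, 2.0, 4.0]
    print(misclassification(id_gc_ms, immunoassay))
```

An assay with a strong positive bias will score high on the first percentage and near zero on the second, which is the pattern reported for the Immulite 2000, AutoDelfia, and ACS-180; a negatively biased assay such as the Elecsys 2010 shows the opposite pattern.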


Discussion

The methods used to measure testosterone differ considerably, and high interassay variability has been observed for assays that directly (i.e., without sample extraction) measure testosterone in samples from men and women (31). Commercially available direct testosterone assays generally overestimate the steroid concentration (27). This is true for samples from children (particularly newborns and young infants) and for samples from women (22)(27)(39). Differences among RIAs using iodinated testosterone have also been described, to a lesser extent, for samples from men (31). In contrast, RIA procedures that use tritiated testosterone combined with extraction and chromatographic purification steps to eliminate binding globulins and structurally related molecules give lower values for samples from children (22), and testosterone concentrations correspond to those measured by ID/GC-MS (27)(40)(41). In samples from women, testosterone values were not the same between extraction-chromatography/RIA and ID/GC-MS, but the differences were comparable [Ref. (42) and our data]. Nonetheless, these data diverge from the results reported by Fitzgerald et al. (43), who in a reply to a letter to the editor (30) showed a lack of correlation between negative chemical ionization/GC-MS and extraction-chromatography RIA in samples from women. This lack of correlation is of concern, as noted by these authors. ID/GC-MS does not suffer from cross-reactivity or matrix effects and is used as the reference method for the evaluation of steroid immunoassays (23)(24). ID/GC-MS and tritiated testosterone RIAs are the principal methods used in referral laboratories.

Procedures based on nonisotopically labeled analytes adapted for use in automated analyzers are now available and are widely used in routine clinical laboratories (44)(45). Some of these methods were tested recently in studies evaluating the performance of automated nonisotopic immunoassays against that of a direct RIA (29)(30)(32)(46) or against the performance of other nonisotopic automated immunoassays (47)(48). Most of these studies found acceptable concordance between nonisotopic immunoassays, although the results were poorer for women (29). These initial analyses were encouraging, although, as shown by Ognibene et al. (47), there are still unanswered questions concerning the extent of agreement between the various tests. Indeed, two of the three nonisotopic immunoassays tested by Ognibene et al. on samples from the entire adult population gave highly scattered results in the low concentration end of the dynamic range (ACS-180) or revealed discrepancies attributable to marked underestimation (Immulite 2000) with respect to ID/GC-MS. The Architect i2000 seems to be the only analyzer showing concordance with ID/GC-MS in the lower end of the range (47). Similar observations were made previously for the ACS-180 by Fitzgerald and Herold (28), who clearly demonstrated that this method gave concordant results primarily with samples from men and not with samples from women. This sex-specific effect was particularly marked at low testosterone concentrations, for which ACS-180 gave values higher than those obtained by ID/GC-MS, as reported by Jockenhövel et al. (49). Wheeler et al. (29), using serum pools with testosterone concentrations established by ID/GC-MS, demonstrated that the ACS-180 significantly overestimated testosterone concentrations in both female and male ranges.

Our study provides a broad view of the potential of most of the available automated nonisotopic and isotopic testosterone immunoassays to measure testosterone concentrations with accuracy in various groups of individuals reflecting the wide range of testosterone concentrations encountered in clinical practice. In samples from men, the median testosterone concentrations obtained by all immunoassays but the Architect i2000, Immuno-1, and Coat-A-Count DPC differed significantly from those obtained by ID/GC-MS (Table 4). This preliminary analysis did not provide information about the scatter of the results between each immunoassay and ID/GC-MS. The equation of the weighted Deming regression analysis showed that all immunoassays but the AutoDelfia were quite well correlated with ID/GC-MS. The slopes were evenly distributed close to 1.0, with 7–10% variation (except for the Immulite 2000), and intercepts were highly dispersed.

For some immunoassays, the agreement with ID/GC-MS was not so good, as shown by the line of best fit for these assays (Fig. 1 in the online Data Supplement). Plots clearly showed that the Vitros ECi, Vidas, Elecsys 2010, and RIA Immunotech underestimated testosterone concentrations over the concentration ranges tested. In contrast, the ACS-180 overestimated testosterone concentrations in the same range (Fig. 1 in the online Data Supplement). The Bland–Altman analysis showed that the highest magnitudes of disagreement (mean difference >4 nmol/L) were observed with the Vidas, Elecsys 2010, and RIA Immunotech. The ratio method confirmed that these three assays and the Vitros ECi clearly underestimated testosterone concentrations over the male concentration range (Fig. 3). We analyzed the results obtained for samples from men with testosterone concentrations around the cutoff for hypogonadal status. Two methods, ACS-180 and AutoDelfia, overestimated testosterone values in this part of the concentration range and did not clearly separate normo- from hypogonadal individuals. This overestimation could impair the capacity of these immunoassays to adequately distinguish men with low-normal testosterone concentrations from hypogonadal and/or androgen-blocked men. This could be directly addressed by investigating such individuals and studying their follow-up under treatment.

In samples from women, the median testosterone concentrations measured by all immunoassays but the Vidas and Vitros ECi differed significantly from those measured by ID/GC-MS (Table 4). In addition, the equation obtained by weighted Deming regression analysis showed that all methods were less well correlated with ID/GC-MS (Table 5). Slopes were clearly >1.0 (except for Elecsys 2010) and were widely dispersed. Intercepts were also dispersed. The results obtained by all immunoassays did not clearly agree with ID/GC-MS, as shown by the lines of best fit (Fig. 1 in the online Data Supplement). The Immulite 2000, Architect i2000, AutoDelfia, and ACS-180 markedly overestimated testosterone concentrations. The Immulite 2000 gave the worst results in terms of bias, with dispersion of the results over the entire concentration range, and in terms of the correlation coefficient obtained (Fig. 1 of the online Data Supplement and Table 5). The Bland–Altman analysis showed that the highest magnitudes of disagreement (mean difference >2 nmol/L) were for the Immulite 2000, AutoDelfia, and ACS-180 (Table 4). The plots from the ratio method showed a marked overestimation of testosterone concentrations with a high degree of scattering for the Immulite 2000, AutoDelfia, and to a lesser extent, Immuno-1 (Fig. 3). These observations could be reinforced by the imprecision of these methods (particularly the Immulite 2000; Table 1). In a clinical situation, most if not all of the immunoassays tested were unable to adequately distinguish females with testosterone concentrations within the reference interval for women from females with abnormal testosterone concentrations. The Elecsys 2010 may adversely affect clinical decision-making by giving testosterone results within the reference interval for women diagnosed with abnormal testosterone values and hirsutism.

There are several possible reasons for the lack of agreement between the results obtained with the immunoassays tested and ID/GC-MS for human samples. The first is the so-called matrix effect: immunoassays were performed, as recommended by the manufacturers, with no extraction steps. Thus, certain compounds present in serum, such as lipids and proteins, specifically binding globulins [e.g., sex-hormone-binding globulin (SHBG)], may interfere with the immunoassay. This effect has been observed in isotopic immunoassays (26)(50)(51), although the authors of another study obtained conflicting results (52). In a more recent study, Boots et al. (31) also reported such an effect. A similar effect has also been observed in nonisotopic assays (29). The interference involved in the matrix effect may contribute to the divergence of the results obtained by ID/GC-MS and immunoassays and may account for the marked overestimation of testosterone concentrations observed with the ACS-180 and, to a lesser extent, with the Immuno-1 over the entire range of testosterone concentrations (Fig. 3). This discrepancy is consistent with high blank values. It may also account for the marked underestimation of testosterone concentrations observed with the Elecsys 2010 and, to a lesser extent, with the Vidas and Vitros Eci, probably resulting from excessive correction of the matrix effect. The addition of steroid-releasing agents is unlikely to increase the accuracy of these three immunoassays (ACS-180, AutoDelfia, and Elecsys 2010).

Another possible reason for the lack of agreement between the results obtained with the immunoassays tested and ID/GC-MS for human samples is the cross-reactivity of the antibody. The immunoassay protocols included the use of specific polyclonal (ACS-180, Immuno-1, Vidas, Immulite 2000, AutoDelfia, RIA Immunotech, and Coat-A-Count DPC) or monoclonal (Architect i2000, Vitros Eci, and Elecsys 2010) antibodies (Table 1), but the antibodies did not seem sufficient to increase the specificity of the immunoassay. Indeed, direct immunoassays display cross-reactivity with structurally related steroids (53) or therapeutic agents that also interact with SHBG in the serum (54). Interactions have also been observed between steroid tracers, the corresponding antibodies, and SHBG. Thus, the binding of other nontritiated tracers to serum proteins may be a source of interference in some direct steroid immunoassays, depending on their affinity for SHBG and the antibody used (50)(55). Together, these factors led to overestimation of steroid concentrations.

Other possible reasons for the lack of agreement between the results obtained with the immunoassays tested and ID/GC-MS for human samples include the limit of detection and functional sensitivity, which are particularly important for the determination of low (<1.7 nmol/L) and very low (0.17 nmol/L) testosterone concentrations (1)(22). Testosterone concentrations were determined in male and female infants (age range, 2–12 years; n = 11); in all but two samples, the testosterone concentration was <0.15 nmol/L, the limit of detection of our ID/GC-MS. We observed a high degree of variability among immunoassays, as shown in Table 3. The failure of the Elecsys 2010 to detect low testosterone concentrations contradicts the analytical characteristics provided by the manufacturer in the package insert (Tables 1 and 3). When we considered the expected ranges of values in female adults and in female and male infants (Table 3), we observed discrepancies between these values and the capacity of the assays to detect testosterone at low or very low concentrations. This was true for both the Architect i2000 and Elecsys 2010, for which the expected values in infants were unrelated to the analytical characteristics of these assays, including functional sensitivity (0.48 nmol/L for Architect and 0.42 nmol/L for Elecsys 2010; Tables 1 and 3). It was also true for the lower limit of the range of values for samples from women, especially for the Immulite 2000, ACS-180, Elecsys 2010, and Coat-A-Count DPC. Unfortunately, this question remains open for the Immuno-1, Vidas, Vitros Eci, AutoDelfia, and RIA Immunotech because of the absence of defined values. Given their relatively high imprecision at the low end of the testosterone range (Table 1), it is unlikely that these immunoassays could achieve these very low expected values.

Otherwise, in females, the upper limits of the testosterone concentrations were comparable for all assays except the Immulite 2000 (4.16 nmol/L), AutoDelfia (3.70 nmol/L), and Vidas (3.12 nmol/L). However, when the main analytical characteristics and performance of each assay in the female range were considered, two groups of immunoassays could be defined relative to ID/GC-MS. One group of methods, including the Immulite 2000, ACS-180, AutoDelfia, Architect i2000, RIA Immunotech, Coat-A-Count DPC, and Immuno-1, was characterized mainly (to different degrees) by results that differed significantly from those obtained by ID/GC-MS, a high magnitude of disagreement, wide scatter of the data, and a considerable percentage of individuals with abnormal testosterone concentrations when compared with the upper limit of the ID/GC-MS method. The second group of methods, including the Vitros Eci and Vidas, was characterized by results that were not significantly different from those obtained by ID/GC-MS, lower dispersion of the data, and lower percentages of individuals with abnormal testosterone concentrations when compared with the upper limit of the ID/GC-MS method. However, for high or very high percentages of individuals with abnormal testosterone concentrations as measured by ID/GC-MS, the results obtained by the Vitros Eci, Vidas, and Elecsys 2010 were within the reference interval. This tendency was not improved when corrected by the upper limit of each respective testosterone assay. Overall, our results did not allow us to define a strategy or algorithm for each assay to assess how automated nonisotopic testosterone immunoassay methods can and should be used to evaluate female patients in clinical practice.
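
The classification comparison described above can be sketched as follows. The sketch counts, among samples that exceed an upper reference limit by the reference method, the fraction that an immunoassay reports at or below its own upper limit. The function name, both limits, and the data are hypothetical and chosen only for illustration.

```python
def misclassified_as_normal(immunoassay, reference, assay_limit, reference_limit):
    """Fraction of reference-abnormal samples reported within the interval by the assay."""
    # Keep only the pairs that the reference method classifies as abnormal.
    abnormal = [(a, r) for a, r in zip(immunoassay, reference) if r > reference_limit]
    if not abnormal:
        raise ValueError("no sample exceeds the reference-method upper limit")
    missed = sum(1 for a, _ in abnormal if a <= assay_limit)
    return missed / len(abnormal)


# Hypothetical paired results and upper limits in nmol/L (not study data).
reference = [0.8, 1.2, 2.5, 3.0, 4.1]
immunoassay = [0.9, 1.0, 1.8, 2.9, 3.5]

fraction = misclassified_as_normal(
    immunoassay, reference, assay_limit=2.8, reference_limit=2.0
)
print(f"{fraction:.0%} of reference-abnormal samples read as within the interval")
```

Passing the reference-method limit as `assay_limit` gives the uncorrected comparison, and passing the assay's own upper limit gives the corrected one.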

In conclusion, none of the immunoassays tested was reliable enough for the investigation of the very low and low testosterone concentrations (0.17–1.7 nmol/L) expected in sera from children and women. These assays are therefore unlikely to be useful for diagnosis, follow-up of sexual differentiation, or general use in pediatric surveys. In addition, we consider two of these assays (ACS-180 and AutoDelfia) questionable at the low limit of testosterone concentrations in men. This observation should be confirmed.


We thank the following manufacturers for supplying assay reagents and systems free of charge: Abbott Laboratories, Bayer Diagnostics, Beckman-Coulter, Bio-Mérieux, Dade Behring, Diagnostic Products Corporation, Ortho-Clinical Diagnostics, Perkin-Elmer, and Roche-Boehringer-Mannheim.


References

1. Migeon CJ, Berkovitz GD, Brown TR. Sexual differentiation and ambiguity. Kappy MS, Blizzard RM, Migeon CJ, eds. Endocrine disorders in childhood and adolescence 1994:573-716 CC Thomas Springfield, IL.

2. Wang C, Swerloff RS. Evaluation of testicular function. Baillieres Clin Endocrinol Metab 1992;6:405-434.

3. Wu FCW, Edmond P, Raab G, Hunte WM. Endocrine assessment of the subfertile male. Clin Endocrinol 1981;14:493-507.

4. Gleave ME, Goldenberg SL, Jones EC, Bruchovski N, Sullivan LD. Biochemical and pathological effects of eight months of neoadjuvant androgen withdrawal therapy before prostatectomy in patients with clinically confined prostate cancer. J Urol 1996;155:213-219.

5. Jocham D. Leuprorelin three-month depot in the treatment of advanced and metastatic prostate cancer: long-term follow-up results. Urol Int 1998;60:18-24.

6. Kuhn JM, Billebaud T, Navratil H, Moulonguet A, Fiet J, Grise P, et al. Prevention of the transient adverse effects of a gonadotropin-releasing hormone analogue (buserelin) in metastatic prostatic carcinoma by administration of an antiandrogen (nilutamide). N Engl J Med 1989;321:413-418.

7. Abbou CC, Lucas C, Leblanc V. Tolerance and clinical and biological responses during the first 6 months of treatment with 1-month sustained release LHRH agonist leuprolerin and triptolerin in patients with metastatic prostate cancer. Prog Urol 1997;7:984-995.

8. Bergfeld WF. Hirsutism in women. Effective therapy that is safe for long-term use. Postgrad Med 2000;93:99-104.

9. Rosenfield RL. Pilosebaceous physiology in relation to hirsutism and acne [Review]. Clin Endocrinol Metab 1986;15:341-362.

10. Mauvais-Jarvis P, Kuttenn F, Mowszowicz I. Hirsutism 1981 Springer Verlag Berlin.

11. Waggoner W, Boots LR, Azziz R. Total testosterone and DHEAS levels as predictors of androgen-secreting neoplasms: a populational study. Gynecol Endocrinol 1999;16:394-400.

12. Rittmaster RS, Arab DM, Lehman L. Dose-response effect of depot leuprolide acetate on serum androgens in hirsute women. Fertil Steril 1996;65:912-915.

13. Escobar-Morreale HF, San Millan JL, Smith RR, Sancho J, Witchel SF. The presence of the 21-hydroxylase deficiency carrier status in hirsute women: phenotype-genotype correlations. Fertil Steril 1999;72:629-638.

14. New MI, White PC, Pang S, Dupont B, Speiser PW. The adrenal hyperplasias. Scriver CR, Beaudet AL, Sly WS, Valle D, eds. The metabolic bases of inherited disease 1989:1881-1917 McGraw-Hill New York.

15. Migeon CJ, Berkovitz GD. Congenital defects of the external genitalia in the newborn and prepubertal child. Carpenter SE, Rock JA, eds. Pediatric and adolescent gynecology 1992:77-94 Raven Press New York.

16. Brook CGD. Precocious puberty. Clin Endocrinol 1995;42:647-650.

17. Lee PA. Laboratory monitoring of children with precocious puberty. Arch Pediatr Adolesc Med 1994;148:369-376.

18. Bertelloni S, Baroncelli GI, Ferdeghini M, Menchini-Fabris F, Saggese G. Final height, gonadal function and bone mineral density of adolescent males with central precocious puberty after therapy with gonadotropin-releasing hormone analogues. Eur J Pediatr 2000;159:369-374.

19. Muller J, Juul A, Anderson AM, Sehested A, Skakkebaek NE. Hormonal changes during GnRH analogue therapy in children with central precocious puberty. J Pediatr Endocrinol Metab 2000;13:739-746.

20. Kaplowitz P. Delayed puberty in obese boys: comparison with constitutional delayed puberty and response to testosterone therapy. J Pediatr 1998;13:745-749.

21. Kulin HE, Finkelstein JW, D’Arcangelo MR, Susman EJ, Chinchilli V, Kunselman S, et al. Diversity of pubertal testosterone changes in boys: constitutional delay in growth and/or adolescence. J Pediatr Endocrinol Metab 1997;10:395-400.

22. Fuqua JS, Sher ES, Migeon CJ, Berkovitz GD. Assay of plasma testosterone during the first six months of life: importance of chromatographic purification of steroids. Clin Chem 1995;41:1146-1149.

23. Siekmann L. Determination of steroid hormones by the use of isotope dilution-mass spectrometry: a definitive method in clinical chemistry. J Steroid Biochem 1979;11:117-123.

24. Lawson AM, Gaskell SJ, Hjelm M. International Federation of Clinical Chemistry (IFCC), Office for Reference Methods and Materials (ORMM). Methodological aspects on quantitative mass spectrometry used for accuracy control in clinical chemistry. J Clin Chem Clin Biochem 1985;23:433-441.

25. Fiet J, Gosling JP, Soliman M, Galon H, Boudou P, Aubin P, et al. Coordinated radioimmunoassays for eight plasma steroids relevant to the investigation of hirsutism and acne in women. Clin Chem 1994;40:2296-2305.

26. Luppa P, Neumeier D. Effect of sex-hormone globulin on no-extraction immunoassays for testosterone. Clin Chem 1990;36:172-173.

27. Wudy SA, Wachter UA, Homoki J, Teller WM. 17α-Hydroxyprogesterone, 4-androstenedione, and testosterone profiled by routine stable isotope dilution/gas chromatography-mass spectrometry in plasma of children. Pediatr Res 1995;38:76-80.

28. Fitzgerald R, Herold DA. Serum total testosterone: immunoassay compared with negative chemical ionization gas chromatography–mass spectrometry. Clin Chem 1996;42:749-755.

29. Wheeler MJ, D’Souza A, Matadeen J, Croos P. Ciba Corning ACS:180 testosterone assay evaluated. Clin Chem 1996;42:1445-1449.

30. Wians FH, Stuart J. Ciba Corning ACS:180 direct total testosterone assay can be used on female sera [Letter]. Clin Chem 1997;43:1466-1467.

31. Boots LR, Potter S, Potter D, Azziz R. Measurement of total serum testosterone levels using commercially available kits: high degree of between-kit variability. Fertil Steril 1998;69:286-292.

32. Levesque A, Letellier M, Swirski C, Lee C, Grant A. Analytical evaluation of the testosterone assay on the Bayer Immuno 1 system. Clin Biochem 1998;31:23-28.

33. Sabot JF, Deruaz D, Dechaud H, Bemard P, Pinatel H. Determination of plasma testosterone by mass fragmentography using [3,4-13C] testosterone as internal standard. J Chromatogr 1985;339:233-242.

34. Reiffsteck A, Dehennin L, Scholler R. Estrogens in seminal plasma of human and animal species: identification and quantitative estimation by GC-MS associated with stable isotope dilution. J Steroid Biochem 1982;17:567-572.

35. Boudou P, Taieb J, Mathian B, Badonnel Y, Lacroix I, Mathieu E, et al. Comparison of progesterone concentration determination by 12 non-isotopic immunoassays and gas chromatography/mass spectrometry in 99 human serum samples. J Steroid Biochem Mol Biol 2001;78:97-104.

36. Linnet K. Necessary sample size for method comparison studies based on regression analysis. Clin Chem 1999;45:882-894.

37. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-310.

38. Nahoul K, Dehennin L, Scholler R. Radioimmunoassay of plasma progesterone after oral administration of micronized progesterone. J Steroid Biochem 1987;26:241-249.

39. Wheeler MJ, Lowry C. Warning on serum testosterone measurement. Lancet 1987;2:514-515.

40. Lashansky G, Saenger P, Fishman K, Gautier T, Mayes D, Berg G. Normative data for adrenal steroidogenesis in a healthy pediatric population: age- and sex-related changes after adrenocorticotropin stimulation. J Clin Endocrinol Metab 1991;73:674-686.

41. Forest MG. Physiological changes in circulating androgens. Forest MG eds. Androgens in childhood 1989:104-129 Karger Basel.

42. Dorgan JF, Fears TR, McMahon RP, Aronson Friedman L, Patterson BH, Greenhut SF. Measurement of steroid sex hormones in serum: a comparison of radioimmunoassay and mass spectrometry. Steroids 2002;67:151-158.

43. Fitzgerald RL, Herold DA. Reply: Ciba Corning ACS:180 direct total testosterone assay can be used on female sera. Clin Chem 1997;43:1467-1468.

44. Gosling JP. A decade of development in immunoassay methodology. Clin Chem 1990;36:1408-1427.

45. Wheeler MJ. Automated immunoassay analysers. Ann Clin Biochem 2001;38:217-229.

46. Sanchez-Carbayo M, Mauri M, Alfayate R, Miralles C, Soria F. Elecsys testosterone assay evaluated. Clin Chem 1998;44:1744-1746.

47. Ognibene A, Drake CJ, Jeng KY, Pascucci TE, Hsu S, Luceri F, et al. A new modular chemiluminescence immunoassay analyser evaluated. Clin Chem Lab Med 2000;38:251-260.

48. Gonzales-Sagrado M, Martin-Gil FJ, Lopez-Hernandez S, Fernandez-Garcia N, Olmos-Linares A, Arranz-Pena ML. Reference values and methods comparison of a new testosterone assay on the AxSYM system. Clin Biochem 2000;33:175-179.

49. Jockenhövel F, Haase S, Hoermann R, Mann K. New automated direct chemiluminescent immunoassay for the determination of serum testosterone. J Clin Ligand Assay 1996;19:138-144.

50. Masters AM, Hahnel R. Investigation of sex-hormone binding globulin interference in direct radioimmunoassays for testosterone and estradiol. Clin Chem 1989;35:979-984.

51. Slaats EH, Kennedy JC, Kruijswijk H. Interference of sex-hormone binding globulin in the “Coat-A-Count” testosterone no-extraction radioimmunoassay. Clin Chem 1987;33:300-302.

52. Bodlaender P. No SHBG interference with the “Coat-A-Count Total Testosterone” direct RIA kit. Clin Chem 1990;36:173.

53. Chattoraj S. Endocrine function. Tietz NW eds. Fundamentals of clinical chemistry 1976:699-817 WB Saunders Philadelphia.

54. Pugeat MM, Dunn JF, Nisula BC. Transport of steroid hormones: interaction of 70 drugs with testosterone-binding globulin and corticosteroid-binding globulin in human plasma. J Clin Endocrinol Metab 1981;53:69-75.

55. Micallef JV, Hayes MM, Latif A, Ahsan R, Sufi SB. Serum binding of steroid tracers and its possible effects on direct steroid immunoassay. Ann Clin Biochem 1995;32:566-574.