1. Accuracy of the GastroPanel test in the detection of atrophic gastritis
- Author
Semi Korpela, Kari Syrjänen, Penti Sipponen, Matti Härkönen, and Ants Peetsalu
- Subjects
Gastritis, Atrophic; Male; medicine.medical_specialty; Hepatology; Receiver operating characteristic; business.industry; Atrophic gastritis; Intraclass correlation; Concordance; Gastroenterology; Area under the curve; Context (language use); medicine.disease; 3. Good health; Surgery; Verification bias; Internal medicine; Medicine; Cutoff; Humans; Female; business; Letters to the Editor; Biomarkers
- Abstract
We read with concern the paper of McNicholl et al. 1 on the accuracy of GastroPanel (GP) in the diagnosis of atrophic gastritis (AG). The major conclusions of the paper are in sharp contrast with the earlier literature 2–6 and the most recent international consensus statements. The skewed GP results reported in the study 1 could be due to any or all of three main reasons: (a) poor laboratory techniques; (b) misclassification bias of the study endpoint (AG); and (c) inadequate statistical power (n=85, 10 with AG). First of all, it should be noted that the ‘Biohit-Deltaclon GastroPanel’s Lab’ in Spain, where the authors reported their analyses had been carried out 1, has no contract in force with Biohit Oyj (Helsinki, Finland), and accordingly no rights to use either the name Biohit or GastroPanel in this context. It is emphasized that the GP test (Biohit Oyj) is an enzyme-linked immunosorbent assay and has not been optimized (or even tested) by the manufacturer for use as a chemiluminescent enzyme immunoassay 1. This type of technical modification inevitably means that the manufacturer-recommended cutoff values are not valid in the new application. The chemiluminescent assay should have been validated against the reference GP test to confirm the appropriateness of the cutoff values that the authors used (Figure 1) 1. Even minor deviations from the appropriately validated cutoff values in a study with a limited number of cases (only 10 with AG) would lead to markedly distorted results. Plasma pepsinogen I (PGI) levels and severity of AG show a practically linear relationship 2–5. PGI levels less than 25 µg/l and a PGI/PGII ratio less than 3.0 are consistent with moderate or severe AG of the corpus 3. Therefore, it is surprising that the mean serum level of PGI for patients with AG of the corpus reported in this study is 101 µg/l (Table 1) 1.
On the basis of extensive clinical series, such PGI values are impossible in patients with biopsy-confirmed moderate or severe AG of the corpus 2–6. The GP test is optimized to be used in context with the Updated Sydney System (USS) for classification of gastritis 2,6. The five diagnostic categories – (a) normal mucosa, (b) superficial gastritis, (c) atrophic antrum gastritis, (d) atrophic corpus gastritis, and (e) atrophic pangastritis – are common to both the GP test and the USS, which enables direct assessment of their concordance using, for example, weighted κ (intraclass correlation coefficient) testing. When this is done in an adequately powered study based on validated USS classification, the interassay (GP-to-USS) agreement is usually in the range of 0.7 to greater than 0.8 (substantial to almost perfect) 2,6. This information on the overall test agreement was missing in the present report 1. The lack of this key information prevents correct interpretation of the GP results and also precludes any meaningful calculations on GP performance as an indicator of AG (the study endpoint). Mild AG of the corpus should never be used as the study endpoint in calculating the performance indicators of PGI and the PGI/PGII ratio, as repeatedly emphasized 2–6. This fact has been neglected in the present study, in which the GP cutoff values presented in the Figure 1 algorithm are indicated for AG in general and not stratified according to the grade of AG 1. The only appropriate way of calculating the predictive indicators of PGI and the PGI/PGII ratio for AG of the corpus is to use the combined moderate/severe AG as the study endpoint. This approach in an adequately powered study with validated USS classification gives receiver operating characteristic (area under the curve) values above 0.970 for PGI and greater than 0.950 for the PGI/PGII ratio 2–6.
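As an illustration of the concordance assessment mentioned above, the following is a minimal sketch of a linearly weighted κ between GP and USS classifications. It assumes, purely for illustration, that the five shared categories are coded 0 to 4 and treated as an ordered scale; the function name `weighted_kappa`, the coding, and the linear weighting scheme are assumptions of this sketch and are not taken from the cited studies.

```python
from collections import Counter

# Hypothetical integer coding of the five categories shared by GP and USS.
CATEGORIES = [
    "normal mucosa",
    "superficial gastritis",
    "atrophic antrum gastritis",
    "atrophic corpus gastritis",
    "atrophic pangastritis",
]


def weighted_kappa(labels_a, labels_b, n_categories):
    """Linearly weighted Cohen's kappa for two equal-length lists of codes."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("at least two categories are required")

    n = len(labels_a)
    max_dist = n_categories - 1

    # Observed weighted disagreement: average distance between paired codes.
    pairs = Counter(zip(labels_a, labels_b))
    observed = sum(abs(i - j) / max_dist * c / n for (i, j), c in pairs.items())

    # Expected weighted disagreement under independence of the two marginals.
    marg_a = Counter(labels_a)
    marg_b = Counter(labels_b)
    expected = sum(
        abs(i - j) / max_dist * marg_a[i] * marg_b[j] / n**2
        for i in marg_a
        for j in marg_b
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")

    return 1.0 - observed / expected


# Made-up example codes (not study data): one patient is off by one category.
gp_codes = [0, 1, 2, 3, 4]
uss_codes = [0, 1, 2, 3, 3]
print(round(weighted_kappa(gp_codes, uss_codes, len(CATEGORIES)), 3))
```

With linear weights, a disagreement between adjacent categories is penalized less than one between distant categories, which is why a weighted rather than an unweighted κ is used in this sketch.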
Another unique feature of the GP test is the interpretation of the results by specific software (GastroSoft Biohit Oyj, Helsinki, Finland), which is almost mandatory for their correct interpretation. The authors did not report using GastroSoft in their study 1. The role of the G-17 biomarker is more complex. Low levels of G-17 are not exclusively inherent to antral AG, but may also reflect high gastric acid output, whereas high levels may result from the use of proton pump inhibitors 2,7,8. In fact, the use of G-17 is not recommended by the GP manufacturer for the diagnosis of antral AG. Finally, in a study including only 10 patients with (unclassified) AG in a clinical setting, one cannot draw any conclusions whatsoever on the use of the GP test in a screening setting. Such a setting would necessitate an adequately powered cohort of population-derived (asymptomatic) individuals, all being tested by GP, with all test positives (and a random 5% of test negatives) confirmed by gastroscopy and validated biopsy classification, and, importantly, all performance indicators corrected for verification bias by special statistical treatment.
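The correction for verification bias described above can be sketched as follows. This is one commonly used approach (the Begg and Greenes style of correction), chosen here for illustration; the letter does not specify which statistical treatment is meant. It assumes that the decision to verify depends only on the GP test result, and all counts in the example are made up, not study data. The function names are hypothetical.

```python
def naive_accuracy(pos_diseased, pos_healthy, neg_diseased, neg_healthy):
    """Sensitivity and specificity computed from verified subjects only."""
    sensitivity = pos_diseased / (pos_diseased + neg_diseased)
    specificity = neg_healthy / (neg_healthy + pos_healthy)
    return sensitivity, specificity


def corrected_accuracy(n_pos, n_neg, pos_diseased, pos_healthy,
                       neg_diseased, neg_healthy):
    """Verification-bias-corrected sensitivity and specificity.

    n_pos, n_neg: all test positives and test negatives in the cohort.
    pos_*, neg_*: biopsy results among the verified subjects in each group.
    Assumes verification depends only on the test result.
    """
    verified_pos = pos_diseased + pos_healthy
    verified_neg = neg_diseased + neg_healthy
    if verified_pos == 0 or verified_neg == 0:
        raise ValueError("each test group needs at least one verified subject")

    # Scale the verified counts up to the full size of each test group.
    est_tp = n_pos * pos_diseased / verified_pos
    est_fp = n_pos * pos_healthy / verified_pos
    est_fn = n_neg * neg_diseased / verified_neg
    est_tn = n_neg * neg_healthy / verified_neg

    sensitivity = est_tp / (est_tp + est_fn)
    specificity = est_tn / (est_tn + est_fp)
    return sensitivity, specificity


# Made-up cohort: 1000 screened, 100 test positives (all verified),
# 900 test negatives of whom a random 5% (45) are verified.
print(naive_accuracy(80, 20, 1, 44))
print(corrected_accuracy(100, 900, 80, 20, 1, 44))
```

In this made-up cohort the uncorrected figures overstate sensitivity and understate specificity, because verified test negatives are heavily under-represented relative to test positives.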
- Published
- 2015