Dear Sir,

We read with great interest the paper by Afolabi et al. [1]. It is true that 13C-liver function breath tests (13C-LFBTs) appear attractive to both patients and physicians because they are non-invasive, in contrast to diagnostic procedures that breach the integrity of the body, as a liver biopsy inherently does. Unfortunately, although 13C-LFBTs have been available to clinicians for almost three decades, they have still not become a routine diagnostic tool. The breakthrough made by the quoted paper lies in its clear delineation of targets for future research which, if attained, should yield an objective view of the clinical usefulness of those tests. Accordingly, the first phase of this validation process should evaluate reproducibility, the second should assess prognostic utility, and the third should investigate the effect of 13C-LFBTs on patient outcome [1].

One should be aware that the result of a breath test, like that of any other quantitative diagnostic method in medicine, carries a certain degree of inexactitude, because every measurement is inherently prone to random as well as systematic errors. It is therefore necessary to identify the possible sources of measurement error, to estimate their contribution to the overall error of a diagnostic method and, as the ultimate step, to take measures to minimize it as far as possible. In the case of 13C breath tests, the total measurement error is accounted for by the precision and accuracy of the apparatus used to determine the 13CO2 content of expired air samples, the degree of conformity with the recommended test protocol, and the inherent biological variability of the living organism undergoing the diagnostic procedure.
The error introduced by the measurement equipment is relatively easy to estimate, because it is characterized by the sensitivity, the linearity range, and the within- and between-series consistency of measurement results. Those items are basically addressed by manufacturers, and a daily calibration routine maintains optimum performance of the equipment. Knowledge of the performance of the measuring system is, of course, necessary to select an optimum dose of the 13C-labeled substrate for a given breath test [2].

The error associated with the implementation of a breath test is minimized by standardizing the composition and method of preparation of the test meal, the time allowed for its consumption, the number, timing and method of expiratory air sampling, as well as the surroundings provided to the subjects during the examination. The measures usually taken in this respect comprise advice and recommendations given to subjects on restrictions to be observed before an examination (such as fasting, abstaining from smoking, and withdrawal of medication) and on behavior during the test (avoiding physical activity, maintaining a recommended body position, and refraining from smoking and from taking meals or drinks other than those provided by the laboratory staff) [3–6].

Measures taken to control the error introduced by biological variability may consist of strictly keeping to a constant time of day at which the test is performed or, in the case of women of reproductive age, taking their menstrual cycle status into account [7, 8].

We fully agree with Afolabi et al. [1] that determining the reproducibility of the results is a valuable tool for assessing the performance of a quantitative measurement in medicine.
An estimate of the gross error of the diagnostic method, accounted for by the factors and circumstances described above, can thus be obtained. Consequently, poor reproducibility renders a test unsuitable for clinical application, since it means that the results of tests performed in the same person under identical conditions may differ widely from one another [9].

It is our pleasure to provide herein additional data on the reproducibility of the 13C-LFBTs, not reported in the paper by Afolabi et al. [1]. In our laboratory we prospectively evaluated the reproducibility of the 13C-methacetin breath test (13C-Meth-BT) [10], the 13C-alpha-ketoisocaproic acid breath test (13C-KICA-BT) [11], and the 13C-phenylalanine breath test (13C-PhenAla-BT) [12]. Insight has thus been gained into the precision of representatives of the three main groups of 13C-LFBTs, since the 13C-Meth-BT evaluates microsomal liver metabolism, whereas the 13C-KICA-BT and the 13C-PhenAla-BT assess the mitochondrial and the cytosolic metabolic efficiency of the liver, respectively.

The short-term reproducibility (repeat examinations performed 1–3 days apart) and the medium-term reproducibility (repeat measurements separated by a 2–3-week break) of the three 13C-LFBTs are summarized in Table 1. The common denominator of those data is that Tmax, the time to reach the peak 13CO2 concentration in expiratory air, is considerably less reproducible than the two other quantitative parameters of the 13C-LFBTs: the maximum momentary elimination (Dmax), which shows fair reproducibility, and the cumulative elimination of 13C in breath air, expressed as the area under the 13C elimination curve (AUC), which is the most reproducible. In no instance did the medium-term reproducibility prove any worse than the short-term one (Table 1).
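For readers who wish to see how such parameters and a paired-repeat coefficient of variation might be derived, a minimal Python sketch follows. It is an illustration under stated assumptions, not the procedure used in [10–12]: the function names are hypothetical, the AUC is approximated by the trapezoidal rule, and the paired coefficient of variation uses a common duplicate-measurement formula (within-subject standard deviation, sqrt(sum of squared differences / 2n), divided by the grand mean), which may differ from the definition applied in the cited studies.

```python
import math


def curve_parameters(times_min, rates):
    """Return (t_max, d_max, auc) for one 13C elimination curve.

    times_min: sampling times in minutes, in ascending order
    rates:     momentary 13C elimination measured at each sampling time
    The AUC is a trapezoidal approximation, in (rate unit x minutes).
    """
    times_min = list(times_min)
    rates = list(rates)
    if len(times_min) != len(rates) or len(times_min) < 2:
        raise ValueError("need at least two matched time/rate samples")

    d_max = max(rates)                      # maximum momentary elimination
    t_max = times_min[rates.index(d_max)]   # time of the first peak sample

    # Sum the trapezoid areas between consecutive sampling points.
    auc = sum(
        (t1 - t0) * (r0 + r1) / 2
        for t0, t1, r0, r1 in zip(times_min, times_min[1:], rates, rates[1:])
    )
    return t_max, d_max, auc


def paired_cv(first, second):
    """Coefficient of variation (%) for paired repeat examinations.

    ASSUMED formula: 100 * sqrt(sum(d_i^2) / (2n)) / grand mean,
    where d_i is the difference between the two results of subject i.
    """
    first = list(first)
    second = list(second)
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("need the same, non-zero number of paired results")

    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(first, second))
    within_subject_sd = math.sqrt(sum_sq_diff / (2 * n))
    grand_mean = (sum(first) + sum(second)) / (2 * n)
    return 100 * within_subject_sd / grand_mean
```

With this sketch, each subject's two curves would first be reduced to Tmax, Dmax and AUC by `curve_parameters`, and `paired_cv` would then be applied separately to each parameter across all subjects; a lower value indicates better reproducibility.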
Quite strikingly, judging by the magnitude of the pertinent coefficients of variation for paired examinations (CVp), the reproducibility of the 13C-PhenAla-BT appears to be markedly worse than that of either the 13C-Meth-BT or the 13C-KICA-BT. This finding raises the concern of whether the precision of the 13C-PhenAla-BT is sufficient to yield clinically sound conclusions.

Table 1 Reproducibility of three liver breath tests

Detailed analyses of the reproducibility data referred to in this correspondence have been published elsewhere [10, 11]. In summary, we would like to recall some important observations. First, in the case of the 13C-Meth-BT, it was found that on repeat examination the accuracy of the AUC may be modestly affected by a persistent stimulation of CYP1A2, which is responsible for a fixed bias amounting to 8 % [10]. Second, achieving the necessary level of reproducibility of the 13C-KICA-BT requires calculating the AUC over the 0–90 min time span or, even better, over 0–120 min [11].

We do hope that the data and remarks contained herein supplement and support the idea of systematic validation of 13C-LFBTs outlined in the paper by Afolabi et al. [1].