Korovessis, Panagiotis; Dimas, Anastasios; Koureas, Georgios; Zacharatos, Spyridon; Petsinis, Georgios; and Baikousis, Andreas
The aims of this study were to determine inter- and intraobserver agreement between spine surgeons and orthopedic radiologists in recognizing distinct degenerative pathology on plain lumbosacral roentgenograms; to estimate the validity (sensitivity and specificity) of a surgical decision by correlating Short Form-36 Health Survey (SF-36) scores with roentgenographic degenerative pathology; and to determine the intra- and interobserver agreement between radiologists, surgeons, and authors in making a surgical decision on the basis of distinct roentgenographic pathology, SF-36 scores, clinical findings derived from physical examination, or a combination of these. The authors followed three routes to objectively assess the reliability and validity of the surgical decision in patients with chronic low back pain. First, 100 consecutive male patients with low back pain were evaluated by the authors by physical examination, imaging (including plain roentgenograms, CT scan, and/or MRI), and the SF-36 survey. Two senior orthopedic radiologists and two senior spine surgeons were asked to read, blinded, a set of 100 roentgenograms of the lumbar spine in two sessions. Second, surgeons and radiologists were asked to make a surgical decision for each patient using SF-36 scores alone, plain roentgenograms alone, or matched SF-36 data and roentgenograms, and these decisions were compared with each other as well as with the authors’ decision, which was based on combined imaging (roentgenograms, CT, MRI), SF-36 scores, and surgical findings.
The intra- and interobserver reliability in recognizing distinct pathological findings on plain roentgenograms was expressed in terms of kappa. The validity of the authors’ approach, which calculates the proportion of normal and abnormal roentgenograms that are likewise “diagnosed” by the level of SF-36 scores, was expressed in terms of sensitivity, specificity, and positive and negative predictive value. The impact of subjectivity on the surgical decision was tested by randomly asking half of the patients to magnify and the other half to undermagnify their SF-36 data. The prevalence of roentgenographic pathology detected by radiologists, surgeons, and authors was 0.51, 0.53, and 0.54, respectively, whereas the authors posed the indication for surgery in only 28% of the patients. Interobserver agreement in detecting roentgenographic pathology in the first and second sessions was 0.79 and 0.88 for radiologists and 0.81 and 0.91 for surgeons, respectively. The average intraobserver agreement in detecting pathology on plain lumbosacral roentgenograms was 0.92 for radiologists and 0.91 for surgeons. The sensitivity and specificity of using the level of SF-36 scores to “diagnose” normal and abnormal roentgenograms, as read by the surgeons, were 0.49 and 0.36, respectively. When this calculation was made with the radiologists’ readings, the sensitivity and specificity were 0.47 and 0.35, respectively. The positive predictive value was 0.38 for both radiologists and surgeons. The consensus between independent radiologists making a surgical decision using only roentgenographic data was 0.71, while between spine surgeons deciding on the basis of SF-36 data it was 0.83. The consensus between spine surgeons (decision based on SF-36 data) and orthopedic radiologists (decision based on radiology) was 0.41.
When spine surgeons made surgical decisions using matched SF-36, roentgenographic, and surgical examination data, the consensus rose to 0.67. The consensus between spine surgeons making a surgical decision using matched roentgenographic, SF-36, and surgical examination data was 0.58. The impact of subjectivity, tested by randomly asking half of the patients to magnify and the other half to undermagnify their SF-36 data, erroneously increased the sensitivity to 0.54 and 0.59, respectively. This investigation showed that distinct degenerative lumbar spinal pathology can be identified on plain roentgenograms with similarly high accuracy by orthopedic radiologists and spine surgeons. The sensitivity and specificity of recognizing abnormal and normal roentgenograms from abnormal and normal SF-36 data were low because of the subjective nature of the SF-36 survey. The study additionally concluded that no surgical decision should be based on roentgenographic pathology alone or on the patient’s SF-36 responses alone; it should instead rest on matched SF-36 scores, roentgenographic and imaging evaluation, and physical examination data. [ABSTRACT FROM AUTHOR]
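The agreement and validity measures reported above (kappa, sensitivity, specificity, and positive and negative predictive value) can be illustrated with a minimal sketch. The abstract does not say which kappa variant was used, so Cohen's kappa for two raters is assumed here. The function names `cohen_kappa` and `diagnostic_validity` and all sample readings are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import nan


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters scoring the same items.

    Assumption: Cohen's kappa; the abstract only says "kappa".
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1.0 - expected)


def diagnostic_validity(index_positive, reference_positive):
    """Sensitivity, specificity, PPV and NPV of an index test vs. a reference."""
    if len(index_positive) != len(reference_positive):
        raise ValueError("index and reference must cover the same patients")
    pairs = list(zip(index_positive, reference_positive))
    tp = sum(1 for i, r in pairs if i and r)          # true positives
    fp = sum(1 for i, r in pairs if i and not r)      # false positives
    fn = sum(1 for i, r in pairs if not i and r)      # false negatives
    tn = sum(1 for i, r in pairs if not i and not r)  # true negatives

    def ratio(num, den):
        # Undefined when the denominator is empty (e.g. no reference positives).
        return num / den if den else nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical readings: 1 = pathology present, 0 = normal.
    reader_1 = [1, 1, 0, 0, 1, 0, 1, 0]
    reader_2 = [1, 0, 0, 0, 1, 0, 1, 1]
    print("kappa:", cohen_kappa(reader_1, reader_2))

    # Hypothetical: abnormal SF-36 score as index test, roentgenogram as reference.
    sf36_abnormal = [1, 1, 0, 0, 1, 1, 0, 0]
    xray_abnormal = [1, 0, 0, 1, 1, 0, 1, 0]
    print(diagnostic_validity(sf36_abnormal, xray_abnormal))
```

In this sketch the roentgenographic reading plays the role of the reference and the SF-36 level plays the role of the index test, which is one reading of the validity analysis described in the abstract.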