Abstract

Vertebral fractures are a major public health problem and account for the largest number of fragility fractures.1 They can be asymptomatic, or their symptoms may be nonspecific; hence, they may not come to clinical attention. Other diseases of the spine may also induce similar symptoms. Thus, the coexistence of a backache and a radiological vertebral fracture is no proof that the backache is due to that fracture.2 Vertebral fractures may be associated with poor quality of life, impaired bending and rising, difficulties in the activities of daily living, frailty, and leg disability.3-6 Individuals with vertebral fractures have a high risk of subsequent fractures, including hip fracture, which shows that vertebral fractures are indeed a major sign of bone fragility.7, 8 The higher the number and the severity of vertebral fractures, the higher the risk of a new fracture.9, 10 Vertebral fractures are associated with higher risk of institutionalization, hospitalization, and higher mortality (irrespective of those related to fractures).11-13 The diagnosis of vertebral fractures requires spine imaging by lateral radiography or lateral DXA scan. Vertebral fractures may be diagnosed by chance on an X-ray requested for other reasons.14 Diagnostic tests may then be performed (bone mineral density [BMD] measured by DXA, biological assays) and treatment may be started. However, vertebral fractures are underreported in radiological reports.15 The term “fracture” is often avoided and replaced by ambiguous and misleading terms such as “deformity,” “wedging,” “collapse,” or “reduced vertebral height.” A vertebral fracture provides indisputable evidence of reduced bone strength. It is a sign of the disease associated with higher fracture risk and higher mortality. The term “fracture” has more direct therapeutic consequences than other terms and should be used in radiological reports.
The most probable reason underlying these obfuscations is the lack of clear diagnostic criteria for vertebral fractures. Over recent years, two approaches to the diagnosis of vertebral fractures have been developed: clinical algorithms based on visual assessment of the vertebrae, and vertebral morphometry.16, 17 In the semiquantitative score, vertebrae are graded on the basis of visual inspection, without direct measurement, into three types: wedge (reduction in anterior height), biconcave (reduction in central height), and crush (reduction in posterior height). Each type is subdivided into three grades: mild (20% to 25% reduction in height), moderate (25% to 40% reduction in height), and severe (more than 40% reduction in height).16 The semiquantitative method does not use any external reference values with which the investigated height may be compared. The observer needs to “compare” the studied height to the theoretical height that this vertebral body should have; eg, a value approximated as the mean of the respective heights of adjacent vertebral bodies. Therefore, this method requires adequate radiological training. The Algorithm-Based Qualitative (ABQ) tool assumes that the endplate is always deformed in vertebral fractures, so that endplate depression is 100% sensitive for vertebral fracture.17 Because the ABQ tool does not consider endplate depression to be specific for vertebral fracture, it provides indications for the comprehensive assessment of the vertebra (eg, scoliosis, osteoarthritis, Schmorl's nodes).17 However, some pitfalls should be pointed out. In rare cases, a vertebral fracture may occur only by cortical buckling without endplate depression. Lumbar vertebrae may have a normal concavity of the endplate, resulting in false positives diagnosed by the ABQ tool. In both of these cases, comparing the vertebral body with the adjacent vertebrae may provide the correct answer.
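The grading thresholds described above can be made explicit in a short sketch. The original semiquantitative reading is visual and involves no measurement, so the code below is only a numerical analogue, assuming that measured heights are available. The function names are hypothetical, the heights in the usage example are illustrative values rather than data from any study, and the handling of the exact 25% boundary (assigned here to the moderate grade) is an assumption, because the published ranges share that endpoint.

```python
def expected_height(above: float, below: float) -> float:
    """Approximate the height a vertebral body 'should' have as the mean of
    the corresponding heights of the two adjacent vertebral bodies."""
    return (above + below) / 2.0


def height_reduction_pct(observed: float, expected: float) -> float:
    """Percentage reduction of the observed height relative to the expected one.

    Heights larger than expected are reported as 0% reduction.
    """
    if expected <= 0:
        raise ValueError("expected height must be positive")
    return max(0.0, 100.0 * (expected - observed) / expected)


def semiquantitative_grade(reduction_pct: float) -> int:
    """Map a height reduction to a grade using the thresholds quoted above.

    0 = below 20% (not graded as a fracture), 1 = mild (20% to 25%),
    2 = moderate (25% to 40%), 3 = severe (more than 40%).
    Assumption: exactly 25% is assigned to grade 2.
    """
    if reduction_pct < 20.0:
        return 0
    if reduction_pct < 25.0:
        return 1
    if reduction_pct <= 40.0:
        return 2
    return 3


if __name__ == "__main__":
    # Illustrative heights in millimetres; not taken from any dataset.
    reference = expected_height(above=24.0, below=26.0)
    for observed in (24.0, 19.5, 18.0, 14.0):
        pct = height_reduction_pct(observed, reference)
        print(observed, round(pct, 1), semiquantitative_grade(pct))
```

The sketch grades a single height; in practice the anterior, central, and posterior heights would each be assessed to distinguish wedge, biconcave, and crush types.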
Finally, in the case of old vertebral fracture, the endplate may remodel over time and its discontinuity or depression may no longer be visible. Vertebral morphometry was developed in the 1990s. The morphometric algorithms are based on measurements of vertebral heights and calculations of their ratios.18-22 The ratios are calculated for each vertebra and for each type of fracture (anterior/posterior height ratio for wedge fractures, central/posterior height ratio for biconcave fractures, posterior/adjacent posterior height for crush fractures). Thresholds of normality are defined for each ratio using various approaches, such as the number of standard deviations (SDs) below the mean or percentage of the average height ratio. The arithmetical mean and SD are established for each ratio in a group of young healthy adults of the same sex from the same population or in the investigated cohort after exclusion of extreme values using various algorithms. A vertebral fracture is diagnosed when the vertebral height ratio is below the vertebra-specific, sex-specific, and ratio-specific thresholds. Vertebral morphometry is quantitative and uses the level-specific, ratio-specific, and sex-specific reference values. Because it takes into account natural variations in the shape of vertebral bodies related to the spine curvatures, it seems more objective than the qualitative algorithms. However, vertebral morphometry is laborious and depends on the X-ray quality. Rotation of the spine, blurred edges of a vertebra, or oblique ray due to parallax render the measurements of vertebral heights subjective and, consequently, decrease their accuracy and reproducibility. The problem related to parallax may be reduced by carefully centering the X-ray beam over Th8 for the thoracic spine and over L3 for the lumbar spine. It can be avoided by using lateral spine scans obtained on fan-beam DXA machines. 
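The morphometric approach described above (height ratios compared with level-specific, sex-specific, and ratio-specific thresholds) can be sketched as follows. This is a minimal illustration under stated assumptions, not any published algorithm: the class and function names are hypothetical, and the reference means and SDs in the usage example are made-up placeholders rather than values from a reference population.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class VertebralHeights:
    """Measured heights of one vertebral body (any consistent unit)."""
    anterior: float
    central: float
    posterior: float


def height_ratios(v: VertebralHeights, adjacent_posterior: float) -> Dict[str, float]:
    """Compute the three ratios named in the text, one per fracture type."""
    return {
        "wedge": v.anterior / v.posterior,            # anterior/posterior
        "biconcave": v.central / v.posterior,         # central/posterior
        "crush": v.posterior / adjacent_posterior,    # posterior/adjacent posterior
    }


def flag_deformities(
    ratios: Dict[str, float],
    reference: Dict[str, Tuple[float, float]],
    n_sd: float = 3.0,
) -> Dict[str, bool]:
    """Flag each ratio that falls below (mean - n_sd * SD).

    `reference` maps each ratio name to a (mean, SD) pair that is assumed to
    be specific to the vertebral level and sex being assessed.
    """
    flags = {}
    for kind, ratio in ratios.items():
        mean, sd = reference[kind]
        flags[kind] = ratio < mean - n_sd * sd
    return flags


if __name__ == "__main__":
    # Placeholder measurements and reference values, for illustration only.
    vertebra = VertebralHeights(anterior=18.0, central=20.0, posterior=24.0)
    ratios = height_ratios(vertebra, adjacent_posterior=24.0)
    reference = {
        "wedge": (0.90, 0.04),
        "biconcave": (0.88, 0.04),
        "crush": (1.00, 0.05),
    }
    print(flag_deformities(ratios, reference, n_sd=3.0))
    print(flag_deformities(ratios, reference, n_sd=4.0))
```

Changing `n_sd` shows how the choice of threshold alone can move the same vertebra between "deformed" and "normal", which is one reason why prevalence estimates depend on the criterion used.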
DXA scans deliver less radiation than radiographs and show excellent agreement with radiography for grade 2 and 3 fractures, although slightly poorer agreement for grade 1.23 A similar comparison (DXA versus X-ray) was performed for vertebral morphometry, with excellent agreement for more severe vertebral fractures (less than mean – 4SD) and poorer agreement for milder deformities.24 The morphometric algorithms are based on the vertebral heights and their ratios. As most of them were built on the same conceptual basis, they show good agreement. However, vertebral morphometry accounts neither for the shape of the entire vertebra nor for the general aspect of the whole spine, which makes it possible to differentiate fractures from other diseases. Vertebral morphometry does not differentiate wedge fractures from similar wedge deformities observed in osteoarthritis, senile kyphosis, or Scheuermann's disease. Therefore, the agreement of morphometry with the clinical tools for vertebral fractures (Genant's semiquantitative score, Jiang's ABQ) was poorer.16, 17 This indicates that agreement between two morphometric algorithms does not necessarily mean that the fracture diagnosed by both is a genuine vertebral fracture. It may simply reflect the fact that they are built on the same assumptions. Vertebral morphometry is performed by a trained technician and, at the level of a single X-ray, is less expensive than having radiographs evaluated by a radiologist. This may be preferable in large epidemiological or pharmaceutical studies. However, vertebral morphometry may provide a higher number of false positives and, to a lesser extent, a higher number of false negatives. Thus, its use may require the recruitment of a larger cohort, which is far costlier. Both the vertebral fractures diagnosed using the semiquantitative score and those diagnosed using vertebral morphometry are associated with higher risk of vertebral or nonvertebral fracture.
The more severe the vertebral fracture at baseline (using Genant's score), the higher is the risk of a new fracture.10, 25 Vertebral fractures diagnosed using more stringent morphometric criteria (eg, mean – 4SD) are also associated with higher point estimates of the risk of fracture versus the less stringent criteria (eg, mean – 3SD).26 The two articles published in this issue of the JBMR present new endeavors to improve the diagnostics of vertebral fractures. Oei and colleagues27 compared prevalence and location of vertebral fractures diagnosed using the ABQ tool and SpineAnalyzer software-assisted quantitative morphometry in 7582 men and women aged 45 to 95 years from the population-based Rotterdam cohort. SpineAnalyzer software-assisted quantitative morphometry (QM SA) automatically identifies vertebral shape to calculate the exact vertebral heights.28 The software automatically diagnoses vertebral fractures using the thresholds based on Genant's semiquantitative score. Unlike the original approach described by Genant (visual inspection without measurement), the software measures vertebral heights and calculates their ratios. The agreement between the methods was assessed using the kappa score and the Prevalence Adjusted Bias Adjusted Kappa (PABAK; described in the article). Vertebral fracture prevalence found using the ABQ method is very low, even after the age of 80 years. The prevalence obtained by QM SA is nearly fourfold higher. Vertebral fracture prevalence is twofold higher in women using the ABQ method, but slightly higher in men using the QM SA. The agreement between ABQ and QM SA calculated using the kappa score is poor, especially in men (versus women), in younger groups (versus the elderly), and in the thoracic spine (versus lumbar spine). The exclusion of mild QM SA fractures improved the agreement. The use of the PABAK approach also improved the agreement compared to the kappa score. 
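To illustrate why the two agreement statistics mentioned above can diverge, the following sketch computes Cohen's kappa and PABAK from a 2×2 table of two methods classifying the same subjects as fractured or not. For two categories, PABAK reduces to twice the observed agreement minus one. The function name is hypothetical and the counts in the usage example are invented for illustration; they are not the Rotterdam results.

```python
import math
from typing import Tuple


def agreement_stats(a: int, b: int, c: int, d: int) -> Tuple[float, float, float]:
    """Return (observed agreement, Cohen's kappa, PABAK) for a 2x2 table.

    a = fracture by both methods, b = fracture by method 1 only,
    c = fracture by method 2 only, d = no fracture by either method.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")
    p_observed = (a + d) / n
    # Agreement expected by chance from the marginal totals of each method.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    if p_expected == 1.0:
        kappa = math.nan  # kappa is undefined when chance agreement is complete
    else:
        kappa = (p_observed - p_expected) / (1.0 - p_expected)
    pabak = 2.0 * p_observed - 1.0
    return p_observed, kappa, pabak


if __name__ == "__main__":
    # Hypothetical counts: fractures are rare and method 1 flags many more
    # subjects than method 2.
    p_obs, kappa, pabak = agreement_stats(a=20, b=80, c=5, d=895)
    print(round(p_obs, 3), round(kappa, 3), round(pabak, 3))
```

When most subjects are negative by both methods, the chance-expected agreement is high, so kappa stays low even though the raw agreement is high; PABAK removes that prevalence and bias effect, which is consistent with the better agreement reported with PABAK.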
The difference in the prevalence and the poor agreement between the methods show that the fractures identified by each method are qualitatively different deformities. The prerequisite for diagnosis is a vertebral height reduction for QM SA and an endplate depression for ABQ. This difference in fracture prevalence may indicate that a substantial number of fractures have no detectable endplate depression. However, it may also suggest that in many cases, reduced vertebral height does not reflect a fracture. Morphometric studies showed that men have more wedged vertebral bodies than women, especially on the thoracic kyphosis.29 Thus, there are more men than women who have the anterior height >20% lower than the posterior height. This difference may explain the higher vertebral fracture prevalence found by QM SA in men than in women. It also shows that QM SA included more atypical normal vertebrae in the fracture group in men than women. Thus, the better agreement between these methods in women as opposed to men may be due to fewer false positives in women. Because the elderly have more fractures, fewer deformities may be false positives in older than in younger subjects. This may explain the better agreement between the methods in older versus younger participants. The false positives identified by QM SA are more frequent on the thoracic kyphosis where vertebrae are physiologically slightly wedged. This may explain the low level of agreement between methods in this region. Because mild fractures may be false positives, the endplate depression is less frequent in mild than in moderate or severe fractures. This may explain why the exclusion of mild vertebral fractures improved the agreement between the methods. Lentle and colleagues30 also compared two diagnostic methods in 5319 men and women aged 50 years and over from the multicenter Canadian Multicentre Osteoporosis Study (CaMos) cohort. 
Genant's semiquantitative score (GSQ) defines grades of severity expressed by the magnitude of vertebral height reduction: mild (20% to 25%, GSQ1), moderate (25% to 40%, GSQ2), and severe (>40%, GSQ3).16 The modified ABQ (mABQ) tool is based on two traits.17 The prerequisite for the diagnosis of a vertebral fracture included the endplate depression and cortical buckling or breaks (<1% of fractures). Then, the fractures were graded (mABQ1 to mABQ3) using Genant's thresholds. The interrater agreement (radiologist versus technologist) was poor for GSQ1 fractures, but good for GSQ2 to GSQ3 fractures. Agreement among physician readers for vertebral fractures (classified as present versus absent) was better for the mABQ tool than for GSQ score. The agreement for GSQ fractures improved when individuals with GSQ1 fractures were pooled with those without fracture. A large proportion of vertebrae identified as fractured by GSQ are not considered to be fractured by mABQ criteria. Agreement between the scores was low, mainly for mild and moderate fractures. By contrast, the concordance was very high for GSQ-normal vertebrae (versus mABQ-normal) and for GSQ3 versus mABQ-3 fractures. After adjustment for age, BMI, and height, participants with fractures diagnosed by mABQ had lower BMD compared both with those without vertebral fracture (diagnosed by GSQ or mABQ) and with those having fractures diagnosed by GSQ only. In a similar multivariable model, subjects having only GSQ1 had significantly lower BMD than those without vertebral fracture, but significantly higher BMD than those who had mABQ1 fractures. Both GSQ1 to GSQ3 and mABQ1 to mABQ3 fractures were associated with higher risk of vertebral fracture. The subjects with mABQ1 to mABQ3 fractures had higher risk of vertebral fracture not only compared to those having GSQ1 to GSQ3 fractures, but also compared to those having GSQ2 to GSQ3 ones. 
In individuals with GSQ1 fracture, the risk of vertebral fracture was higher compared with subjects without vertebral fracture, but lower compared to those with mABQ1 fractures. Similarly, GSQ1 to GSQ3 and mABQ1 to mABQ3 fractures were associated with higher risk of nonvertebral major osteoporotic fracture compared to no vertebral fracture. The individuals with mABQ1 to mABQ3 fractures had higher fracture risk compared to those having GSQ1 to GSQ3 fractures, but not compared to those with GSQ2 to GSQ3 fractures. Individuals with mABQ1 fractures had higher risk of nonvertebral major osteoporotic fracture compared both with those who had no vertebral fracture and those who only had GSQ1 fractures. By contrast, GSQ1 fractures were not associated with risk of nonvertebral major osteoporotic fracture. Lentle and colleagues30 show that the GSQ1 fractures remain the main problem in the diagnostics of the vertebral fracture. Diagnostic agreement for GSQ1 is poorer than for more severe fractures as regards interrater and intrarater analyses, even for an experienced reader.17, 18 GSQ1 fractures are characterized by mild height reduction, which is all the more difficult to assess because the thoracic and the lumbar vertebral bodies differ in shape. Their assessment is subjective, especially, if we do not assess other suggestive traits of fracture; eg, endplate depression. A large proportion of GSQ fractures is not considered to be fractures according to the mABQ criteria. Because GSQ1 constitute 50% of all GSQ fractures, the difference in such prevalence may largely depend on GSQ1 fractures. This discrepancy suggests that a part of GSQ fractures (mainly GSQ1) may not be fractures, but atypical normal vertebrae or deformities related to other diseases. The authors do not mention which type of fractures (wedge, biconcave, or crush) caused most difficulties. 
Most often, the problem concerns wedge fractures because vertebral bodies on the thoracic kyphosis are physiologically wedged and because several diseases of the spine are associated with the wedging of the vertebral bodies; eg, senile spondylosis. Lentle and colleagues30 discuss the discordance between GSQ2 and mABQ2 and its potential impact on their relations with BMD. Morphometry shows that, in the case of physiologically normal thoracic vertebral bodies, the difference of 25% between the anterior and the posterior heights may correspond to 2.5SD to 3.5SD below the average height ratio in the healthy controls.19, 29, 31 Therefore, the adequate scoring of the moderate wedge fracture on the thoracic kyphosis is difficult. GSQ2 without endplate depression on the thoracic kyphosis may capture mild deformities (corresponding to GSQ1) or even extreme cases of normal vertebrae. The discrepancy between GSQ and mABQ fractures is determined by moderate and mild fractures. mABQ fractures are associated with higher risk of a new fracture than the GSQ ones. Thus, mABQ fractures may reflect poorer bone strength. However, a GSQ2 fracture without endplate depression may be an old mABQ2 fracture with remodeled endplate. This is in line with the data showing that the risk of a subsequent fracture is highest immediately after an osteoporotic fracture and then falls with time.32 These articles shed new light on the diagnostics of the vertebral fractures in the context of the previous data. The morphological sign pathognomonic for a vertebral fracture (reduced height, endplate depression) is not identified. Thus, it is difficult to choose the approach which should be used for practical purposes. We cannot define the variable to be measured for the diagnosis of the vertebral fracture (such as BMD for osteoporosis) and we cannot define the threshold value for the diagnosis of the vertebral fracture (such as T-score < –2.5 for osteoporosis). 
The vertebral fracture prevalence varied depending on the tool, and the agreement between the tools was poor. Thus, “fractures” identified by each tool are qualitatively different. Because there is no gold standard, it is not possible to decide whether the differences in fracture prevalence and the lack of agreement reflect poor sensitivity of one method or poor specificity of the other. Indeed, the GSQ1 fracture is associated with slightly lower BMD and slightly higher risk of vertebral fracture, but not with the risk of clinical nonvertebral major osteoporotic fracture. Basically, all the algorithms above pose two problems. First, they are focused on vertebral fractures, whereas the assessment of vertebrae should account for all pathologies, including malignancy. Second, they assess the vertebrae on a per-vertebra basis and omit the assessment of the overall aspect of the spine. Most diseases of the spine are multilevel (Scheuermann's disease, osteoarthritis, spondylosis, scoliosis). Thus, the initial assessment of the entire spine helps in the differential diagnosis of vertebral deformities. A patient with multilevel spine disease may have wedged vertebral bodies that correspond to wedge fractures according to various diagnostic criteria (Fig. 1A). By contrast, an isolated anterior wedging of a vertebral body between vertebrae of normal shape is suggestive of vertebral fracture (Fig. 1B). Therefore, the general view of the spine helps to establish the diagnosis. Obviously, the presence of other diseases does not preclude the coexistence of vertebral fracture, but understanding their presence helps to interpret various vertebral deformities. Vertebral fracture prevalence is highest on the thoracic kyphosis and at the thoracolumbar junction.19, 20, 22 This may be due to the higher mechanical strain in these regions.
This trend was confirmed by morphometry, which compares the height ratios with level-specific and ratio-specific reference values.18, 22 However, anterior wedging is typical in vertebral fracture. Because vertebrae on the thoracic kyphosis are slightly wedged with lower anterior height, their visual assessment is associated with a higher risk of a false-positive diagnosis in this region. The vertebral bodies that are far from the center of the X-ray are visualized by an oblique ray, which may provide a deformed image of the vertebral body. First, double contour of the superior and inferior endplates gives the biconcave aspect. Second, double contour of the uncinate processes gives a false impression of a very long posterior height. This is the case for the thoracolumbar junction, which is on the upper edge of the X-ray of the lumbar spine and on the lower edge of the X-ray of the thoracic spine. Again, the risk of diagnosing a false-positive wedge deformity in this region is high (even by vertebral morphometry). Lumbar vertebral bodies are trapezoid-like, with the anterior height longer than the posterior one. In this case, even a real wedge fracture may not display the typical shape and will not be seen as a fracture. Thus, in this region, the risk of a false negative is higher. Therefore, the highest prevalence of vertebral fractures on the thoracic kyphosis and at the thoracolumbar junction may reflect the different distribution of false positives and false negatives. From the conceptual point of view, there is no descriptive definition of the vertebral fracture. We cannot define the variable that would provide the main criterion for the diagnosis of a vertebral fracture. The vertebral fractures diagnosed using different approaches may reflect qualitatively different entities, but we have no convincing and unequivocal external criterion (ie, vertebral fracture diagnosed using another “gold standard” method) that would allow us to judge their sensitivity and specificity.
However, this search for a common denominator of all vertebral fractures may lead us astray. It is hardly conceivable that one criterion is sensitive enough to capture all vertebral fractures. It is also hardly conceivable that this criterion may be specific enough to differentiate vertebral fractures from other diseases of the spine. In the CaMos study, nearly 25% of GSQ2 and GSQ3 fractures were scored “normal” by the mABQ score, when in fact, most of them were actual fractures. Thus, the endplate depression is not an indispensable sign of a vertebral fracture (Fig. 2A). In prospective studies, an evident incident endplate fracture may be found despite a height reduction that does not reach 20% (Fig. 2B). Thus, the height reduction is not an indispensable sign of vertebral fracture either. Mild vertebral fractures pose the biggest problem. In epidemiological studies they are associated with lower BMD, poor bone microarchitecture, lower estimated bone strength, and higher fracture risk.33, 34 Thus, they are indications for antiosteoporotic treatment. The situation is more difficult at an individual level. The intrarater and interrater agreement of their diagnosis is lower compared to more severe fractures, even for an experienced reader. In an epidemiological study, the results may depend on random sampling variation and statistical power. However, at an individual level, mild fractures are associated only with a slight increase in fracture risk. These patients should be followed and treated if necessary. However, attention should be focused on the more severe fractures associated with a highly elevated risk of fracture (including hip fracture), disability, and mortality. They are quite evident on X-rays and must be reported so that the patient is duly treated.
The analysis of these two articles in the context of the existing studies provides a few suggestions that may be clinically useful. Endplate depression is highly specific for vertebral fracture provided that other diseases and technical problems have been excluded. Severe height reduction (GSQ3) seems to be a specific radiological sign of vertebral fracture. Moderate height reduction (GSQ2) covers a large spectrum of deformities, which should be interpreted differently according to the vertebral level. Wedge deformities on the thoracic kyphosis require attention. If they are associated with an endplate depression, they are likely to be fractures; if not, the question is more difficult (see below). In other cases (ie, other vertebral levels, other types of fracture), GSQ2 deformities seem to reflect vertebral fractures. Mild fractures with endplate depression (mABQ1) should be recognized as vertebral fractures. Overall, the above vertebral deformities (after exclusion of other diseases) should be regarded as real vertebral fractures, hence requiring further diagnosis and treatment. By contrast, GSQ1 and GSQ2 fractures without endplate depression (especially on the thoracic kyphosis) remain a challenge and should be analyzed in their global context. A GSQ2 fracture of Th7 with a 35% height reduction is more likely to be a real fracture (and should be handled as such) than a GSQ2 deformity with a milder height reduction; eg, <30%. The deformed vertebra should be carefully compared to its neighbors. A GSQ1 or GSQ2 fracture without clear endplate depression that is distinctly different from its neighbors may be more indicative of higher fracture risk than wedged or biconcave vertebrae that are evident throughout much of the thoracic or lumbar spine.
If the results of these patients’ other tests (eg, BMD) do not warrant antiresorptive therapy, it would be advisable to consider them to be at high risk of fracture and to repeat the tests in 1 to 2 years. The associations of the GSQ3 fractures and of the mABQ fractures with low BMD and high risk of fracture are sufficiently strong to recommend antiosteoporotic treatment in these patients. By contrast, additional research on the GSQ1 and GSQ2 fractures without endplate depression, particularly as regards their association with the risk of fracture, would be useful. Obviously, these suggestions remain suggestions. Moreover, they concern only vertebral fractures. The clinical assessment of spine radiographs must account for various pathologies of the vertebrae, especially for possible malignancy.
I have no conflict of interest as concerns this paper. I thank Professor John Schousboe and Professor Eric Orwoll for permission to reproduce the radiographs from the Study of Osteoporotic Fractures in Men (MrOS). Authors’ roles: I take responsibility for the content of this paper. No other author has been involved in the writing of the manuscript.
