1. Lost in the woods: The value of tree ensemble modelling for adult age-at-death estimation from skeletal degeneration
- Author
David Navega, Ernesto Costa, and Eugénia Cunha
- Subjects
Estimation, Tree (data structure), medicine.anatomical_structure, Appendicular skeleton, Computer science, Eburnation, Statistics, medicine, Decision tree, Statistical model, Bone age, Degeneration (medical), Pathology and Forensic Medicine
- Abstract
Accurate and precise age estimation is of paramount importance in the forensic analysis of human remains. Despite several research efforts, the most commonly applied techniques provide poor age estimates. The low predictive power of macroscopic skeletal age estimation methods can be attributed to several factors, such as inappropriate statistical modelling, poorly sampled reference material, and an overreliance on an indicator-specific approach. The objective of this pilot study is to illustrate a general approach to age-at-death estimation from skeletal degeneration of the major articulations of the appendicular skeleton and of the cervical and lumbar spine. A sample of 256 adult male individuals with known age-at-death was constructed from the identified skeletal collections held at the University of Coimbra. Articular degeneration of the appendicular skeleton was mapped by analysing marginal lipping, bone formation, porosity and eburnation of bone surfaces. Vertebral degeneration was analysed by scoring marginal lipping and porosity of body surfaces. To tackle the complexity of age-related skeletal degeneration, a machine learning approach built around decision tree forests is proposed. Decision trees offer a flexible tool for age-at-death estimation, as they can map regression functions without assuming a structural form for the predictive surface. Individual tree ensembles were created for each articulation and spinal segment, forming what are called here skeletal aging woods. To solve the issue of combining age predictions obtained from different aging woods, a final, larger forest is trained on the cross-validation predictions of the woods. In individual aging woods, older individuals tend to be accurately identified, but age predictions for young individuals tend to be very broad and imprecise. However, prediction error from the final forest can be as low as 7.59 years, with the major errors more broadly concentrated on middle-aged individuals.
Tools for visualizing age-related subspaces spanned by the proximity matrix of tree ensembles are also presented.
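The two-stage design described in the abstract (one tree ensemble per articulation, then a final forest trained on their cross-validation predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the use of scikit-learn, the articulation names, the number of traits, the fold count, and the synthetic scores are all assumptions made for illustration. The proximity function uses the common definition of tree-ensemble proximity (the fraction of trees in which two individuals fall in the same leaf), which may differ from the one used in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict


def fit_aging_woods(blocks, age, n_trees=50, folds=5, seed=0):
    """Stack per-articulation forests ("woods") into one final forest.

    blocks: dict mapping a (hypothetical) articulation name to an
            (n_individuals, n_traits) array of degeneration scores.
    age:    known ages-at-death, shape (n_individuals,).
    """
    names = sorted(blocks)
    # Out-of-fold age predictions from each wood become the features
    # of the final forest, so it is never fed in-sample predictions.
    cv_preds = np.column_stack([
        cross_val_predict(
            RandomForestRegressor(n_estimators=n_trees, random_state=seed),
            blocks[name], age, cv=folds)
        for name in names
    ])
    final = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    final.fit(cv_preds, age)
    return names, cv_preds, final


def proximity_matrix(forest, X):
    """Fraction of trees in which two individuals share a leaf."""
    leaves = forest.apply(X)  # leaf index per tree, shape (n_samples, n_trees)
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


# Synthetic stand-in data (assumption): ordinal scores 0-3 that drift
# upward with age. This is NOT the Coimbra reference sample.
rng = np.random.default_rng(0)
n = 120
age = rng.uniform(20, 90, size=n)
blocks = {
    name: np.clip(np.round(age[:, None] / 30 + rng.normal(0, 1, (n, 4))), 0, 3)
    for name in ("shoulder", "hip", "knee", "lumbar")
}

names, cv_preds, final = fit_aging_woods(blocks, age)
prox = proximity_matrix(final, cv_preds)
```

Here `cv_preds` holds one column of out-of-fold age predictions per wood, and `prox` is a symmetric matrix that could be passed to an embedding method such as multidimensional scaling to visualize age-related structure. Any error computed by scoring `final` on `cv_preds` itself would be optimistic; an honest estimate needs an outer validation loop.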
- Published
- 2017