Fossil plant remains are commonly found as fragments in the sediment, which complicates their reconstruction and their assignment to higher taxonomic groups. In stem anatomy in particular, some described features recur among the proposed lineages because environmental pressures induce anatomical convergence. Other characteristics, such as the bark and the arrangement of axes and leaves, often cannot be observed because of the fossil's state of preservation. Given these difficulties, we developed PaleoWood, an unprecedented affinity classifier for Paleozoic gymnosperm woods based on 16 variables collected from 42 consistent genera for which the central core, primary xylem, and secondary xylem have been described. Similarities among samples were analyzed by principal coordinates analysis, and models were trained with logistic regression, linear discriminant analysis, and k-nearest neighbors algorithms. Model performance was estimated by cross-validation and by predicting the affinity of 20 previously known samples. The results agreed with some hypotheses previously discussed in the literature, such as the linkage of Eristophyton, Megaloxylon, and Tetrastichia with Lyginopteridales. Other predictions, especially those for samples of simple-protostele or pycnoxylic pteridosperms, were interpreted as the result of convergent evolution or of the models' limitations. The models are not definitive and may be improved as new data are collected; nevertheless, they could assist future comparisons and discussions about the taxonomy, evolution, and paleobotanical affinities of the basal seed plants, especially for the woods from Gondwana, where the affinities of several genera remain obscure.
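
The general idea of training a k-nearest neighbors model and estimating its performance by cross-validation can be illustrated with a minimal sketch. It is not the PaleoWood implementation: the synthetic two-group data, the Euclidean distance, the choice of k = 3, the leave-one-out scheme, and every function name below are assumptions made only for illustration, since the abstract does not specify them.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(X, y, k=3):
    """Hold out each sample once, predict it from the rest, and return the hit rate."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], k) == y[i]:
            hits += 1
    return hits / len(X)


def make_synthetic_samples(seed=0, per_group=10, n_features=4):
    """Build two well-separated hypothetical groups (NOT real fossil wood data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in (("group_A", 0.0), ("group_B", 5.0)):
        for _ in range(per_group):
            X.append([center + rng.gauss(0, 0.5) for _ in range(n_features)])
            y.append(label)
    return X, y


# Hypothetical usage: estimate accuracy, then classify one new sample.
X, y = make_synthetic_samples()
accuracy = leave_one_out_accuracy(X, y, k=3)
prediction = knn_predict(X, y, [5.0, 5.0, 5.0, 5.0], k=3)
print(f"leave-one-out accuracy: {accuracy:.2f}; predicted group: {prediction}")
```

With real anatomical variables, which are often categorical or mixed, a different distance measure and a cross-validation scheme suited to the small number of genera would probably be needed.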