Abstract
Algorithmic feature learners provide high-dimensional vector representations for non-matrix structured data, such as image or text collections. Low-dimensional projections derived from these representations, called embeddings, are often used to explore variation in these data. However, it is not clear how to assess embedding uncertainty. We adapt methods developed for bootstrapping principal components analysis to the setting where features are algorithmically derived from non-matrix data. Through simulations, we empirically compare the resulting confidence areas while varying factors that influence feature learning and the bootstrap, such as feature-learning algorithm complexity and bootstrap sample size. We illustrate the proposed approaches on a spatial proteomics dataset, where we observe that embedding precision is not uniform across tissue types. Code, data, and pretrained models are available online as an R compendium in the supplementary materials.
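The general idea of bootstrapping an embedding can be sketched as follows: resample observations, recompute the principal-components projection on each resample, align each bootstrap embedding to a reference via orthogonal Procrustes rotation, and summarize per-point spread. This is a minimal Python/numpy sketch of that generic recipe, not the paper's R implementation; all function names and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_scores(X, k=2):
    # Center and project onto the top-k right singular vectors.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k]

def procrustes_align(scores, ref):
    # Orthogonal rotation R minimizing ||scores @ R - ref||_F
    # (standard solution R = U V^T from the SVD of scores^T ref).
    U, _, Vt = np.linalg.svd(scores.T @ ref)
    return scores @ (U @ Vt)

# Toy stand-in for learned features: n samples, p-dimensional representations.
n, p, B = 100, 20, 200
X = rng.normal(size=(n, p))

ref, _ = pca_scores(X)
boot = []
for _ in range(B):
    idx = rng.integers(0, n, n)              # bootstrap resample of rows
    _, V = pca_scores(X[idx])
    # Project all points onto the bootstrap loadings, then align to reference
    # so that sign flips / rotations of the PCs do not inflate the spread.
    all_scores = (X - X[idx].mean(axis=0)) @ V.T
    boot.append(procrustes_align(all_scores, ref))

boot = np.stack(boot)                        # shape (B, n, 2)
sd = boot.std(axis=0)                        # per-point embedding spread
```

The per-point standard deviations (or 2D quantile contours of `boot[:, i, :]`) give confidence areas around each embedded point; in the feature-learning setting, the paper's question is how such areas behave when `X` itself comes from an algorithmic feature extractor rather than a fixed data matrix.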