Abstract

The building sector plays an important role in achieving the climate objectives of the EU Green Deal. Prioritizing measures that reduce operational energy use and greenhouse gas (GHG) emissions has proven beneficial, but it has shifted burdens by increasing embodied emissions. Quantifying and regulating emissions throughout the entire building life cycle is therefore crucial. An ongoing DG GROW project is investigating strategies to reduce life cycle GHG emissions within the EU. The project proceeds in three steps: identification of data needs and sources, baseline analysis of the existing whole-life carbon emissions of the EU building stock, and modelling of future scenarios. This paper elaborates on the building stock characterization, whose innovation lies in its level of granularity. Firstly, key data sources are chosen to provide the desired granularity. Secondly, archetypes are defined based on these data sources. Thirdly, attributes are chosen to describe the building stock in terms of geometry, building element composition, energy use, and other characteristics. The paper concludes by discussing the challenges of collecting attribute information and managing data gaps. The resulting insights offer recommendations for establishing a future data repository dedicated to environmental life cycle assessment (LCA) of the EU building stock.