Abstract

Climate change is intensifying natural hazards and putting critical infrastructure systems at risk. Its effects on critical infrastructure can be significant, and communities need to account for these risks when planning and designing infrastructure systems for the future. To that end, natural language processing (NLP) is a promising approach for analyzing large volumes of scientific literature on climate change and infrastructure. Training a supervised NLP model, however, requires that a significant subset of the corpus first be labeled into categories based on user-defined criteria, which is a time-consuming process. To expedite this process, we developed a weak supervision-based approach that leverages the semantic similarity between categories and documents to generate category labels for the domain-specific corpus. Whereas labeling by subject-matter experts takes months, our combination of weak supervision and supervised learning assigns category labels to the whole corpus in 13 hours.
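The labeling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how semantic similarity is computed, so this example assumes a simple bag-of-words cosine similarity as a stand-in, and the category names, category descriptions, threshold value, and function names are all hypothetical. Each document receives the category whose description it most resembles, or no label when nothing is similar enough.

```python
import math
import re
from collections import Counter

# Hypothetical categories and keyword descriptions (not from the paper).
CATEGORIES = {
    "flooding": "flood flooding storm surge sea level rise inundation",
    "heat": "heat wave extreme temperature heatwave thermal",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def weak_label(document, categories, threshold=0.1):
    """Return the most similar category name, or None to abstain.

    Bag-of-words cosine similarity is an assumption standing in for
    whatever semantic similarity measure the paper actually uses.
    """
    doc_vec = Counter(tokenize(document))
    scores = {
        name: cosine(doc_vec, Counter(tokenize(description)))
        for name, description in categories.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


documents = [
    "Storm surge and sea level rise threaten coastal bridges with flooding",
    "Extreme heat buckles rail tracks as temperature records fall",
    "The committee met on Tuesday",
]
weak_labels = [weak_label(doc, CATEGORIES) for doc in documents]
```

In a pipeline like the one the abstract outlines, the documents that receive a weak label would then serve as training data for a supervised classifier, which in turn labels the rest of the corpus. Abstaining below a threshold is a design choice of this sketch, intended to keep low-confidence labels out of the training set.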

