Predicting and understanding law-making with word vectors and an ensemble model.

John J Nay

doi:10.1371/journal.pone.0176999

Abstract

Out of nearly 70,000 bills introduced in the U.S. Congress from 2001 to 2015, only 2,513 were enacted. We developed a machine learning approach to forecasting the probability that any bill will become law. Starting in 2001 with the 107th Congress, we trained models on data from previous Congresses, predicted all bills in the current Congress, and repeated until the 113th Congress served as the test. For prediction we scored each sentence of a bill with a language model that embeds legislative vocabulary into a high-dimensional, semantic-laden vector space. This language representation enables our investigation into which words increase the probability of enactment for any topic. To test the relative importance of text and context, we compared the text model to a context-only model that uses variables such as whether the bill’s sponsor is in the majority party. To test the effect of changes to bills after their introduction on our ability to predict their final outcome, we compared using the bill text and meta-data available at the time of introduction with using the most recent data. At the time of introduction context-only predictions outperform text-only, and with the newest data text-only outperforms context-only. Combining text and context always performs best. We conducted a global sensitivity analysis on the combined model to determine important variables predicting enactment.
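The rolling evaluation described above (train on earlier Congresses, predict the current one, move forward) can be sketched as follows. This is a minimal illustration, not the author's code: the function name `rolling_congress_splits` and the `first_available` parameter are hypothetical, and the earliest Congress used for training is an assumption because the abstract does not state it.

```python
from typing import Iterator, List, Tuple


def rolling_congress_splits(
    first_available: int, first_test: int = 107, last_test: int = 113
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training Congresses, test Congress) pairs, oldest test first.

    Each test Congress is predicted using only Congresses that precede it,
    so no information from the future leaks into training.
    """
    if first_available >= first_test:
        raise ValueError("need at least one Congress before the first test Congress")
    for test in range(first_test, last_test + 1):
        yield list(range(first_available, test)), test


# ASSUMPTION: 103 is an illustrative starting point, not taken from the abstract.
splits = list(rolling_congress_splits(first_available=103))
for train, test in splits:
    print(f"train on Congresses {train[0]}-{train[-1]} -> predict Congress {test}")

# Base rate implied by the abstract's approximate counts (2,513 of ~70,000 bills).
base_rate = 2513 / 70000
print(f"approximate enactment rate: {base_rate:.1%}")
```

The base rate matters for reading any accuracy figure: with so few bills enacted, a forecast that always predicts "not enacted" is right most of the time, so a model has to be judged against that naive baseline.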

Highlights

The U.S. legislative branch creates laws that impact the lives of hundreds of millions of citizens.
Five models are compared across the two time conditions. w2v scores the full bill text with an inversion of word2vec-learned language representations [11].
GLM is a regularized, non-negative generalized linear model (GLM) meta-learner over an ensemble of a regularized GLM, a gradient boosted machine, and a random forest, each of which uses only the contextual variables. w2vGLM is the same as GLM except that the w2v and w2vTitle predictions are added as two more predictor variables for the three base learners.
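The meta-learner idea in the highlights can be illustrated with a small sketch: a logistic model with non-negative, L2-penalized weights that combines the predicted probabilities of three base learners. Everything concrete here is an assumption for illustration: the base-learner probabilities are made-up toy numbers, the names `fit_nonneg_meta` and `stack_predict` are hypothetical, and projected gradient descent stands in for whatever solver and penalty the paper actually used.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_nonneg_meta(
    base_preds: Sequence[Sequence[float]],
    y: Sequence[int],
    l2: float = 0.01,
    lr: float = 0.5,
    steps: int = 2000,
) -> Tuple[List[float], float]:
    """Fit logistic weights w >= 0 over base-learner probabilities.

    Minimizes mean log-loss + (l2 / 2) * ||w||^2 by gradient descent,
    clipping each weight at zero after every step (projection).
    """
    n, k = len(base_preds), len(base_preds[0])
    w, b = [0.0] * k, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * k, 0.0
        for row, target in zip(base_preds, y):
            err = sigmoid(b + sum(wj * pj for wj, pj in zip(w, row))) - target
            grad_b += err
            for j in range(k):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n  # the intercept is unconstrained
        w = [max(0.0, wj - lr * (grad_w[j] / n + l2 * wj)) for j, wj in enumerate(w)]
    return w, b


def stack_predict(
    base_preds: Sequence[Sequence[float]], w: Sequence[float], b: float
) -> List[float]:
    """Combine base-learner probabilities into one stacked probability per bill."""
    return [sigmoid(b + sum(wj * pj for wj, pj in zip(w, row))) for row in base_preds]


# TOY DATA (invented): one row per bill, columns are the predicted enactment
# probabilities from a regularized GLM, a gradient boosted machine, and a
# random forest. In practice these would be held-out (cross-validated) predictions.
base_preds = [
    [0.05, 0.03, 0.10], [0.10, 0.05, 0.00], [0.02, 0.04, 0.05], [0.20, 0.30, 0.25],
    [0.08, 0.06, 0.10], [0.15, 0.10, 0.05], [0.60, 0.70, 0.50], [0.40, 0.55, 0.45],
]
y = [0, 0, 0, 0, 0, 0, 1, 1]  # 1 = enacted

weights, intercept = fit_nonneg_meta(base_preds, y)
stacked = stack_predict(base_preds, weights, intercept)
print("weights:", [round(wj, 3) for wj in weights], "intercept:", round(intercept, 3))
```

The non-negativity constraint means a base learner can be ignored (weight zero) but never inverted, which keeps the combination interpretable as a weighted vote. Under this reading, the w2vGLM variant would leave the meta-learner unchanged and instead give each base learner two extra input columns, the w2v and w2vTitle predictions.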

Summary

Introduction

The U.S. legislative branch creates laws that impact the lives of hundreds of millions of citizens. The Patient Protection and Affordable Care Act (ACA), for example, significantly affected the health care industry and individuals’ health insurance coverage. Bills often consist of hundreds of pages of dense legal language; the ACA is more than 900 pages long. There are thousands of bills under consideration at any given time, and only about 4% will become law. Given the length and vast quantity of bills, a machine learning approach that leverages bill text is well-suited to forecast bill success and identify the important predictive variables. Despite rapid advancement of machine learning methods, it’s difficult to outperform naive forecasts of rare events because of inherent


Journal: PLOS ONE
Publication Date: May 10, 2017
Citations: 33
License type: CC BY 4.0
