Abstract

The Global Data on Events, Location, and Tone (GDELT) is a real-time, large-scale database of global human society for open research. It monitors the world's broadcast, print, and web news, creating a free, open platform for computing on the entire world's media. In this work, we first describe a data crawler that collects metadata from the GDELT database in real time and stores it in a big data management system based on Elasticsearch, a popular and efficient search engine relying on the Lucene library. Then, by exploiting and engineering the detailed information encoded in GDELT for each news item, we build indicators capturing investors' emotions, which are useful for analysing the sovereign bond market in Italy. Using regression analysis and Gradient Boosting models from machine learning, we find that the features extracted from GDELT improve the forecast of the country's government yield spread relative to that of a baseline regression in which only conventional regressors are included. The improvement in the fit is particularly relevant during the period of government crisis in May-December 2018.

Highlights

  • Creditworthiness, sovereign bond liquidity risk, and global risk aversion have been identified as the main factors affecting government yield spreads [2,20].

  • Among popular machine learning approaches, Gradient Boosting machines have been shown to be successful in various forecasting problems in Economics and Finance.

Summary

Introduction

The explosion in computation and information technology over the past decade has made vast amounts of data available in various domains, a phenomenon referred to as Big Data. In Economics and Finance in particular, tapping into these data brings research and business closer together, as data generated in ordinary economic activity can be used to build rapid-learning economic systems that continuously improve and personalize models. In this context, the recent use of Data Science technologies in Economics and Finance benefits both scientists and professionals, improving forecasting and nowcasting for several types of applications. The recent surge in government yield spreads in countries within the Euro area has sparked an intense debate about the determinants and sources of risk of sovereign spreads.

Related Work
About GDELT
Yield Spread
Big Data Management
Feature Engineering
Big Data Analytics
Experimental Analysis
Findings
Conclusions