Abstract

Binary, count, and duration data all code discrete events occurring at points in time. Although a single data generation process can produce all three of these data types, the statistical literature is not very helpful in providing methods to estimate parameters of the same process from each. In fact, only a single theoretical process exists for which known statistical methods can estimate the same parameters, and it is generally used only for count and duration data. The result is that seemingly trivial decisions about which level of data to use can have important consequences for substantive interpretations. We describe the theoretical event process for which results exist, based on time independence. We also derive a set of models for a time-dependent process and compare their predictions to those of a commonly used model. Any hope of understanding and avoiding the more serious problems of aggregation bias in events data is contingent on first deriving a much wider arsenal of statistical models and theoretical processes that are not constrained by the particular forms of data that happen to be available. We discuss these issues and suggest an agenda for political methodologists interested in this very large class of aggregation problems.

Highlights

  • Data in many disciplines are often coded from specific events, and a well-developed methodological literature has emerged to deal with such data

  • Binary data are often the finest-grained, resulting either when count time slices are reduced to such an extent that at most one event occurs in any observation or when count data are “censored” to zero and one

  • We do not want the form in which the data happen to be collected to determine the substantive ideas that we can explore


Summary

Introduction

Data in many disciplines are often coded from specific events, and a well-developed methodological literature has emerged to deal with such data. The statistical literature does not generally provide ways of comparing results across these different models. This should be quite frustrating to scholars, since binary, count, and duration data are all coded from precisely the same underlying events. Our primary goal in this paper is to demonstrate how to compare the results obtained from binary, count, and duration models of the same underlying data generation process. First should come models, such as those provided in this paper, which at least under certain specific assumptions are able to estimate the same parameters no matter what level of analysis or type of aggregation produced the available data. Developing models that can avoid aggregation bias in events process models, and in other areas, will require a second difficult set of developments. These developments can only occur after, or at least concomitant with, the first.
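To make the point about a shared data generation process concrete, here is a minimal sketch in Python. It assumes the time-independent process is a homogeneous Poisson process with a constant rate; that is an assumption made for illustration, since the text above does not name the process. One set of simulated event times is recoded as duration, count, and binary data, and the same rate is then estimated from each. All function names are hypothetical, and the estimators are the standard ones for this simple case rather than the models derived in the paper.

```python
import math
import random


def simulate_event_times(rate, horizon, rng):
    """Event times on [0, horizon) from a constant-rate (Poisson) process.

    ASSUMPTION: time independence is modeled as a homogeneous Poisson process.
    """
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # exponential gap between successive events
        if t >= horizon:
            return times
        times.append(t)


def to_durations(times):
    """Duration data: time between successive events (first spell starts at 0).

    The final, incomplete spell after the last event is dropped.
    """
    return [b - a for a, b in zip([0.0] + times[:-1], times)]


def to_counts(times, horizon, width):
    """Count data: number of events in each interval of length `width`."""
    n_bins = int(horizon / width)
    counts = [0] * n_bins
    for t in times:
        idx = int(t / width)
        if idx < n_bins:  # guard against floating-point edge cases
            counts[idx] += 1
    return counts


def to_binary(counts):
    """Binary data: counts censored to 0 (no event) or 1 (at least one event)."""
    return [1 if c > 0 else 0 for c in counts]


def rate_from_durations(durations):
    """Exponential maximum-likelihood estimate: events per unit of time."""
    return len(durations) / sum(durations)


def rate_from_counts(counts, width):
    """Poisson maximum-likelihood estimate: mean count divided by interval width."""
    return sum(counts) / (len(counts) * width)


def rate_from_binary(binary, width):
    """Invert P(y = 1) = 1 - exp(-rate * width); requires some zeros in the data."""
    p = sum(binary) / len(binary)
    return -math.log(1.0 - p) / width


if __name__ == "__main__":
    rng = random.Random(0)
    horizon, width, true_rate = 20000.0, 1.0, 0.5
    times = simulate_event_times(true_rate, horizon, rng)
    counts = to_counts(times, horizon, width)
    print("durations:", rate_from_durations(to_durations(times)))
    print("counts:   ", rate_from_counts(counts, width))
    print("binary:   ", rate_from_binary(to_binary(counts), width))
```

Under this assumed process the three estimates should land close to one another and to the true rate, because each recoding is tied to the rate by a known formula. The binary estimator in particular rests on the constant-rate assumption; with a time-dependent process that inversion would no longer recover the same parameter.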

Transfers of Governmental Power as a Renewal Process
Analyzing Time-Independent Renewal Processes
Duration Data
Count Data
Disaggregated and “Binary” Data
An Example Using Simulated Time-Independent Data
Analyzing Time-Dependent Renewal Processes
Disaggregated Data
An Example Using Simulated Time-Dependent Data
Concluding Remarks
