Inexpensive Software Research Articles

Numerical Palaeobiology—Computer-based Modelling and Analysis of Fossils and their Distributions, David A.T. Harper (Editor), 1999, John Wiley & Sons, Chichester, 468 p. (Hardcover $99.00) ISBN: 0-471-97405-6.

The 1990s saw the proliferation of inexpensive, fast computers and software packages for manipulating large databases and conducting statistical, phylogenetic, and morphometric analyses. Projects that formerly required formidable programming skills and long (and expensive) amounts of computer time could be finished overnight or even nearly instantaneously by rank technopeasants. This has encouraged quantitative analyses by paleobiologists (indeed, by biologists in general) and, perhaps, has also encouraged more sophisticated statistical testing and inference of hypotheses. However, the most recent volume dedicated to quantitative paleontology was Gilinsky and Signor's (1991) “Analytical Paleobiology.” Given the subsequent advances in hardware, software, and methodology, Harper's volume is a timely release. Like Gilinsky and Signor's edited volume, this is a multi-authored tome touching on numerous subjects pertinent to morphologic evolution, paleoecology, and biostratigraphy. Unlike Gilinsky and Signor's volume, it not only explains methods and details them with numerous examples but also describes software packages (both commercial and privately published).

The first five chapters concern analyses of morphology. Harper and Owen's “Quantitative and morphometric methods in taxonomy” covers a wide range of topics. This is perhaps to the chapter's detriment, as it reads almost like an abbreviated modern version of Sneath and Sokal (1973) tailored for paleontologists. The chapter's focus is twofold: techniques for distinguishing among species given multiple fossil “populations,” and methods for summarizing broad patterns among species (e.g., morphologic disparity). In doing so, the chapter provides overviews of “traditional” morphometrics (e.g., linear measures of a priori characters), geometric morphometrics (i.e., analyses of landmark distributions) and outline analyses (e.g., eigenshape and Fourier analyses), brief summaries of several multivariate methods, as well as synopses on …

Introduction

There is strong evidence (e.g., Laudon, 1986; Morey, 1982; Redman, 1992, 1995, 1996) that data stored in organizational databases have a significant rate of errors. The effect of data errors on the outputs of computer-based models has been investigated by a number of researchers (e.g., Ballou and Pazer, 1985; Ballou et al., 1987; Bansal et al., 1993). This investigation builds on this prior research by examining the effect of data quality on linear regression models. A financial application of a linear regression model is used to examine this question.

Data errors may affect the predictive accuracy of linear regression models in two ways. First, the training data used to build the model may contain errors. Second, even if training data are free of errors, once a linear regression model is used for forecasting a user may input test data containing errors to the model. In general, when claims about the predictive accuracy of linear regression models are made, it is assumed that data used to train the models and data input to make predictions are free of errors. In this study we relax this assumption by asking two questions: (1) What is the effect of errors in test data on predictions made using linear regression models? and (2) What is the effect of errors in training data on predictions made using linear regression models? The first question is focused on the effect of data errors when the model is used for forecasting. The second question is focused on the effect of data errors during model construction.

An understanding of the effect of data errors on linear regression models is particularly important because the availability of inexpensive software packages for personal computers makes the development and use of linear regression models by end-users feasible. Researchers have argued that end-user computing has increased the potential for data errors in computer applications (Boockholdt, 1989). As end users develop applications, it is possible that fewer data validation methods such as logic tests and control totals will be in place and it is likely that less rigorous testing will occur before applications are used in production (Corman, 1988; Davis, 1984; Davis et al., 1983; Panko, 1998).

The remaining sections of this paper present (1) a review of relevant prior research on data quality, (2) a brief explanation of linear regression models, (3) a description of the linear regression models constructed in the study, (4) a discussion of the methodology of two experiments, (5) the results of two experiments and (6) conclusions.

Background

Data quality is generally recognized as a multidimensional concept (Wand and Wang, 1996; Wang and Strong, 1996). While no single definition of data quality has been accepted by researchers working in this area, there is agreement that data accuracy, currency, completeness, and consistency are important areas of concern (Agmon and Ahituv, 1987; Ballou and Pazer, 1985; Davis and Olson, 1985; Fox et al., 1993; Huh et al., 1990; Madnick and Wang, 1992; Wand and Wang, 1996; Wang and Strong, 1996; Zmud, 1978). This investigation adopts the conceptualization of data quality proposed by Ballou and Pazer (1985) that includes four dimensions: accuracy, timeliness, completeness, and consistency. This study is primarily concerned with data accuracy, defined as conformity between a recorded data value and the corresponding actual data value.

Prior research has found that organizational databases are not in general free of errors (e.g., Laudon, 1986; Morey, 1982; Redman, 1992, 1995). Between one and twenty percent of data items in critical organizational databases are estimated to be inaccurate (Laudon, 1986; Madnick and Wang, 1992; Morey, 1982; Redman, 1992). Data quality problems have been found to affect the accuracy and timeliness of economic data published by the United States government (Hershey, 1995; Morgenstern, 1963). …
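The two questions posed in the introduction above (errors in test data versus errors in training data) can be illustrated with a small simulation. The sketch below is not the study's methodology or its financial application: the synthetic relationship y = 2 + 3x, the 10% error rate, the 50% relative error magnitude, and all function names are assumptions chosen only for illustration. It fits a one-predictor least-squares line and compares mean absolute prediction error under three conditions.

```python
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def corrupt(values, error_rate, magnitude, rng):
    """Copy `values`, perturbing a random fraction `error_rate` of them
    by +/- `magnitude` (relative). Both parameters are illustrative."""
    out = []
    for v in values:
        if rng.random() < error_rate:
            v = v * (1 + rng.choice([-1, 1]) * magnitude)
        out.append(v)
    return out


def mean_abs_error(model, xs, ys):
    """Mean absolute difference between predictions and actual outcomes."""
    intercept, slope = model
    return sum(abs(intercept + slope * x - y) for x, y in zip(xs, ys)) / len(xs)


def run_experiment(error_rate=0.10, magnitude=0.5, n=2000, seed=0):
    rng = random.Random(seed)
    # Synthetic "actual" data (an assumption): y = 2 + 3x + Gaussian noise.
    train_x = [rng.uniform(1, 10) for _ in range(n)]
    train_y = [2 + 3 * x + rng.gauss(0, 1) for x in train_x]
    test_x = [rng.uniform(1, 10) for _ in range(n)]
    test_y = [2 + 3 * x + rng.gauss(0, 1) for x in test_x]

    clean_model = fit_line(train_x, train_y)

    # Baseline: clean training data, clean test inputs.
    baseline = mean_abs_error(clean_model, test_x, test_y)

    # Question 1: clean model, but the user supplies erroneous test inputs.
    bad_test_x = corrupt(test_x, error_rate, magnitude, rng)
    errors_in_test = mean_abs_error(clean_model, bad_test_x, test_y)

    # Question 2: model built from erroneous training inputs, clean test inputs.
    bad_train_x = corrupt(train_x, error_rate, magnitude, rng)
    dirty_model = fit_line(bad_train_x, train_y)
    errors_in_training = mean_abs_error(dirty_model, test_x, test_y)

    return {
        "baseline": baseline,
        "errors_in_test": errors_in_test,
        "errors_in_training": errors_in_training,
    }


if __name__ == "__main__":
    for condition, mae in run_experiment().items():
        print(f"{condition}: mean absolute error = {mae:.3f}")
```

Only the predictor values are corrupted here, which matches the definition of accuracy quoted above (a recorded value that differs from the actual value). Varying `error_rate` and `magnitude` gives a rough sense of how sensitive each condition is, but the numbers say nothing about the study's own results.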

Articles published on Inexpensive Software

A PC with sound card as an audio waveform generator, a two-channel digital oscilloscope and a spectrum analyzer

Automating Yapi Kredi Bank archives – a case study

Numerical Palaeobiology--Computer-based Modelling and Analysis of Fossils and their Distributions

The asteroseismology metacomputer

Virtualized architectural heritage: new tools and techniques

Delineating a Study Area Within a Rectangular Data Set: A Class Exercise

Transforming still images

Data Quality in Linear Regression Models: Effect of Errors in Test Data and Errors in Training Data on Predictive Accuracy

A re-engineering process using early decomposition and simple tools

Low Cost Digital Image Photogrammetry

A computer construction project that is both educationally sound and can be completed in one semester

Interactive Dynamic Graphics for Exploratory Survival Analysis

Technology Tips: Entrance Ramps to the Information Superhighway!

Assessment of Diabetes Care by Medical Record Review: The Indian Health Service Model

Integrating a personal-computer local-area network with a radiology information system: value as a tool for clinical research.

The Age of Optimization: Solving Large-Scale Real-World Problems

An inexpensive and effective basis for monitoring rice areas using GIS and remote sensing

Synthesis and Characterization of a Series of 5H-Benzo[a]phenoxazin-5-one Derivatives as Potential Antiviral/Antitumor Agents

Research on elastic buckling of columns, beam and plates: Focussing on formulas and design charts

The Use of Nonmarine Palynomorphs as Correlation Tools in Rapidly Deposited Upper Tertiary Sediments of the Gulf of Mexico
