Abstract

Data frames are integral to R. They provide a standard format for passing data to model-fitting and plotting functions, and this standard makes it easier for experienced users to learn new functions that accept data as a single data frame. Still, many data sets do not easily fit into a single data frame; data sets in ecology with a so-called fourth-corner problem provide important examples. Manipulating such inherently multiple-table data using several data frames can result in long and difficult-to-read workflows. We introduce the R multitable package to provide new data storage objects called data.list objects, which extend the data.frame concept to explicitly multiple-table settings. Like data frames, data lists are lists of variables stored as vectors; what is new is that these vectors have dimension attributes that make accessing and manipulating them easier. As data.list objects can be coerced to data.frame objects, they can be used with all R functions that accept an object that is coercible to a data.frame.

Highlights

  • The standard data management paradigm in R is based on data.frame objects, which are two-dimensional data tables with rows and columns representing replicates and variables (R Development Core Team 2012)

  • It provides access to those details, which are required for effective analyses and to develop new methods of analysis within the framework

  • As new methods are developed, researchers pass their data frames to new functions in much the same way they would pass them to older functions


Summary

Introduction

The standard data management paradigm in R is based on data.frame objects, which are two-dimensional data tables with rows and columns representing replicates (sometimes called objects) and variables (R Development Core Team 2012). Data frames and formulas are combined by passing them to functions that [...]

[Figure: schematic of the fourth-corner problem; labels: species, environmental variables, abundance, sites, traits, fourth corner.]

Several such methods have been developed in ecology, focusing on data with a fourth-corner problem (Doledec, Chessel, ter Braak, and Champely 1996; Legendre et al. 1997; Dray and Legendre 2008; Pillar and Duarte 2010; Leibold, Economo, and Peres-Neto 2010; Ives and Helmus 2011). These methods do not apply to data sets that have other, more complex multiple-table data structures (e.g., the zooplankton communities in Lac Croche, which are described in Figure 2; Cantin, Beisner, Gunn, Prairie, and Winter 2011).
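To make the fourth-corner structure concrete, here is a minimal sketch in base R only. It does not use the multitable API. All object names and values are hypothetical toy data, assumed for illustration: abundance is indexed by sites and species, an environmental variable by sites only, and a trait by species only. The sketch shows the manual work needed to flatten these three tables into a single data frame.

```r
# Hypothetical toy data: 3 sites and 2 species (names and values are made up).
sites   <- c("s1", "s2", "s3")
species <- c("spA", "spB")

# Abundance is replicated along BOTH sites and species.
abundance <- matrix(c(5, 0, 2,    # column spA
                      1, 3, 0),   # column spB
                    nrow = 3, ncol = 2,
                    dimnames = list(sites = sites, species = species))

# An environmental variable is replicated along sites only.
temperature <- c(s1 = 12.5, s2 = 15.0, s3 = 9.8)

# A trait is replicated along species only.
body_length <- c(spA = 30, spB = 110)

# One row per site-species pair.
long <- expand.grid(sites = sites, species = species,
                    stringsAsFactors = FALSE)

# Look up each abundance cell by its (site, species) dimnames.
long$abundance <- abundance[cbind(long$sites, long$species)]

# Repeat site-level and species-level values to match the long layout.
long$temperature <- unname(temperature[long$sites])
long$body_length <- unname(body_length[long$species])

print(long)
```

Each variable has to be aligned by hand according to the dimensions it shares with the others. This is the kind of bookkeeping that the abstract says data lists handle through dimension attributes.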

The structure of data lists
How data lists are made
Multiple-table concepts
Subscripting data lists
Assigning new values to variables in data lists
Creating transformed variables in data lists
Creating variables to identify replicates
Melting and recasting data lists
Coercing data lists to data frames
Faster iterative coercion of data lists to data frames
Stream fish data
Marginal summaries
Data list visualization
Generalized linear model example
Randomization tests
Analyzing data lists with multivariate methods
Conclusion