as.data.cube {data.cube}    R Documentation

Build cube

Description

Build a data.cube object from a matrix, array, fact, list, data.table, or cube object.

Usage

as.data.cube(x, ...)

## Default S3 method:
as.data.cube(x, ...)

## S3 method for class 'matrix'
as.data.cube(x, na.rm = TRUE, ...)

## S3 method for class 'array'
as.data.cube(x, na.rm = TRUE, ...)

## S3 method for class 'fact'
as.data.cube(x, dimensions = list(), ...)

## S3 method for class 'list'
as.data.cube(x, ...)

## S3 method for class 'data.table'
as.data.cube(x, id.vars = key(x), measure.vars,
  fun.aggregate = sum, dims = id.vars, hierarchies = NULL, ...,
  dimensions, measures = NULL)

## S3 method for class 'cube'
as.data.cube(x, hierarchies = NULL, ...)

Arguments

x

R object.

na.rm

logical, default TRUE. When FALSE, the cube stores the cross product of all dimension grain keys, which can be very large.

dimensions

list of dimension class objects.

id.vars

character vector of foreign key columns.

measure.vars

character vector of column names of metrics.

fun.aggregate

function used to aggregate measures, defaults to sum.

dims

character vector of dimension names.

hierarchies

list of hierarchies nested in list of dimensions passed to as.dimension.

measures

list of measure class objects passed to as.fact.

...

arguments passed to methods.
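
To make the effect of na.rm more concrete, the sketch below uses base R only and does not call data.cube. It assumes that na.rm = TRUE corresponds to keeping only the cells that hold a value, and that na.rm = FALSE corresponds to keeping one row per combination of dimension keys. The objects x, long and dense are illustrative names, not part of the package.

```r
# Illustration in base R only; this is not the data.cube implementation.
# A 2 x 2 array with two missing cells.
x = array(c(1, NA, NA, 4),
          dim = c(2L, 2L),
          dimnames = list(color = c("green", "red"),
                          year = c("2011", "2012")))

# Long form: one row per combination of dimension keys (the cross product).
long = as.data.frame(as.table(x), responseName = "value")
nrow(long)   # number of rows when every key combination is kept

# Sparse form: only the cells that actually hold a value.
dense = long[!is.na(long$value), ]
nrow(dense)  # number of rows when missing cells are dropped
```

With many dimensions the cross product grows multiplicatively with the number of keys in each dimension, which is why dropping missing cells is the default.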

Value

data.cube class object.

See Also

data.cube, fact, dimension

Examples

library(data.table)
set.seed(1L)
dt = CJ(color = c("green","yellow","red"),
        year = 2011:2015,
        status = c("active","inactive","archived","removed"))[sample(30)]
dt[, "value" := sample(4:7/2, nrow(dt), TRUE)]

# from data.table
dc = as.data.cube(
    x = dt, id.vars = c("color","year","status"),
    measure.vars = "value",
    hierarchies = sapply(c("color","year","status"),
                         function(x) list(setNames(list(character()), x)),
                         simplify=FALSE)
)
str(dc)

# multidimensional hierarchical data from fact and dimensions
X = populate_star(N = 1e3)
sales = X$fact$sales
time = X$dims$time
geography = X$dims$geography
# define hierarchies
time.hierarchies = list( # 2 hierarchies in time dimension
    "monthly" = list(
        "time_year" = character(),
        "time_quarter" = c("time_quarter_name"),
        "time_month" = c("time_month_name"),
        "time_date" = c("time_month","time_quarter","time_year")
    ),
    "weekly" = list(
        "time_year" = character(),
        "time_week" = character(),
        "time_date" = c("time_week","time_year")
    )
)
geog.hierarchies = list( # 1 hierarchy in geography dimension
    list(
        "geog_region_name" = character(),
        "geog_division_name" = c("geog_region_name"),
        "geog_abb" = c("geog_name","geog_division_name","geog_region_name")
    )
)
# create dimensions
dims = list(
    time = as.dimension(x = time,
                        id.vars = "time_date",
                        hierarchies = time.hierarchies),
    geography = as.dimension(x = geography,
                             id.vars = "geog_abb",
                             hierarchies = geog.hierarchies)
)
# create fact
ff = as.fact(
    x = sales,
    id.vars = c("geog_abb","time_date"),
    measure.vars = c("amount","value"),
    fun.aggregate = sum,
    na.rm = TRUE
)
# create data.cube
dc = as.data.cube(ff, dims)
str(dc)
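
The hierarchies argument in the data.table example above is built with sapply, which hides its shape. The sketch below uses base R only and spells out what that call appears to produce: one entry per dimension, each holding a list of hierarchies, each hierarchy mapping a level to a character vector of its attributes. The names dim.names, hierarchies and explicit are illustrative.

```r
# Base R only; no data.cube functions are called here.
dim.names = c("color", "year", "status")

# Same construction as in the data.table example.
hierarchies = sapply(dim.names,
                     function(x) list(setNames(list(character()), x)),
                     simplify = FALSE)
str(hierarchies)

# Hand-written form with the same nesting:
# dimension -> list of hierarchies -> level = attributes of that level.
explicit = list(
    color  = list(list(color  = character())),
    year   = list(list(year   = character())),
    status = list(list(status = character()))
)

# Compare the two structures.
identical(hierarchies, explicit)
```

Each dimension here has a single unnamed hierarchy with one level and no extra attributes, in contrast to time.hierarchies, which names two hierarchies with several levels each.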

[Package data.cube version 0.4.0 Index]