as.data.cube {data.cube} | R Documentation |
Build cube
as.data.cube(x, ...)

## Default S3 method:
as.data.cube(x, ...)

## S3 method for class 'matrix'
as.data.cube(x, na.rm = TRUE, ...)

## S3 method for class 'array'
as.data.cube(x, na.rm = TRUE, ...)

## S3 method for class 'fact'
as.data.cube(x, dimensions = list(), ...)

## S3 method for class 'list'
as.data.cube(x, ...)

## S3 method for class 'data.table'
as.data.cube(x, id.vars = key(x), measure.vars, fun.aggregate = sum,
             dims = id.vars, hierarchies = NULL, ..., dimensions,
             measures = NULL)

## S3 method for class 'cube'
as.data.cube(x, hierarchies = NULL, ...)
x |
R object. |
na.rm |
logical, default TRUE. When FALSE, the cube stores the cross product of all dimension grain keys, including combinations that hold no data. |
dimensions |
list of dimension class objects. |
id.vars |
character vector of foreign key columns. |
measure.vars |
character vector of column names of metrics. |
fun.aggregate |
function, defaults to sum. |
dims |
character vector of dimension names. |
hierarchies |
list of hierarchies nested in a list of dimensions, passed to as.dimension. |
measures |
|
... |
arguments passed to methods. |
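The effect described for na.rm can be illustrated without the package. The following is a minimal base R sketch, not the package's implementation: it only contrasts a long table holding the full cross product of dimension keys with one that keeps the non-missing cells. The array `ar` and the names `dense` and `sparse` are made up for illustration.

```r
# Toy 2 x 3 array with named dimensions; NA marks combinations with no data.
ar = array(
  c(1.5, NA, NA, 2, 3.5, NA),
  dim = c(2L, 3L),
  dimnames = list(color = c("green", "red"),
                  year = c("2011", "2012", "2013"))
)

# Dense long form: one row per combination of dimension keys (cross product),
# roughly what keeping every cell (na.rm = FALSE) would imply.
dense = as.data.frame(as.table(ar), responseName = "value",
                      stringsAsFactors = FALSE)

# Sparse long form: only the cells that actually hold a value,
# roughly what dropping missing cells (na.rm = TRUE) would imply.
sparse = dense[!is.na(dense$value), ]

nrow(dense)   # 6 rows: 2 colors x 3 years
nrow(sparse)  # 3 rows: the non-NA cells
```

With many dimensions the dense form grows as the product of the dimension sizes, which is presumably why dropping missing cells is the default.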
data.cube class object.
library(data.table)
set.seed(1L)
dt = CJ(color = c("green","yellow","red"),
        year = 2011:2015,
        status = c("active","inactive","archived","removed"))[sample(30)]
dt[, "value" := sample(4:7/2, nrow(dt), TRUE)]

# from data.table
dc = as.data.cube(
  x = dt,
  id.vars = c("color","year","status"),
  measure.vars = "value",
  hierarchies = sapply(c("color","year","status"),
                       function(x) list(setNames(list(character()), x)),
                       simplify=FALSE)
)
str(dc)

# multidimensional hierarchical data from fact and dimensions
X = populate_star(N = 1e3)
sales = X$fact$sales
time = X$dims$time
geography = X$dims$geography

# define hierarchies
time.hierarchies = list( # 2 hierarchies in time dimension
  "monthly" = list(
    "time_year" = character(),
    "time_quarter" = c("time_quarter_name"),
    "time_month" = c("time_month_name"),
    "time_date" = c("time_month","time_quarter","time_year")
  ),
  "weekly" = list(
    "time_year" = character(),
    "time_week" = character(),
    "time_date" = c("time_week","time_year")
  )
)
geog.hierarchies = list( # 1 hierarchy in geography dimension
  list(
    "geog_region_name" = character(),
    "geog_division_name" = c("geog_region_name"),
    "geog_abb" = c("geog_name","geog_division_name","geog_region_name")
  )
)

# create dimensions
dims = list(
  time = as.dimension(x = time,
                      id.vars = "time_date",
                      hierarchies = time.hierarchies),
  geography = as.dimension(x = geography,
                           id.vars = "geog_abb",
                           hierarchies = geog.hierarchies)
)

# create fact
ff = as.fact(
  x = sales,
  id.vars = c("geog_abb","time_date"),
  measure.vars = c("amount","value"),
  fun.aggregate = sum,
  na.rm = TRUE
)

# create data.cube
dc = as.data.cube(ff, dims)
str(dc)
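The `sapply()` call that builds the `hierarchies` argument in the first example is compact but hard to read. The base R sketch below writes the same structure out by hand, so the nesting is visible: a list named by dimension, each holding a list of hierarchies, each hierarchy a named list of levels. The names `h1` and `h2` are introduced here for illustration only, and the reading of each level as "key column mapped to its attribute columns" is inferred from the time and geography examples above rather than stated by the documentation.

```r
dim.names = c("color", "year", "status")

# Construction used in the example: one trivial hierarchy per dimension.
h1 = sapply(dim.names,
            function(x) list(setNames(list(character()), x)),
            simplify = FALSE)

# The same structure written out explicitly.
h2 = list(
  color  = list(list(color  = character())),  # one hierarchy, one level
  year   = list(list(year   = character())),
  status = list(list(status = character()))
)

# Compare the two and inspect the nesting.
print(identical(h1, h2))
str(h1)
```

Each dimension here has a single level with no attached attribute columns, which is why every level maps to `character()`.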