I'm having trouble rearranging the following data frame:
set.seed(45) dat1 <- data.frame( name = rep(c("firstName", "secondName"), each=4), numbers = rep(1:4, 2), value = rnorm(8) ) dat1 name numbers value 1 firstName 1 0.3407997 2 firstName 2 -0.7033403 3 firstName 3 -0.3795377 4 firstName 4 -0.7460474 5 secondName 1 -0.8981073 6 secondName 2 -0.3347941 7 secondName 3 -0.5013782 8 secondName 4 -0.1745357
I want to reshape it so that each unique "name" variable is a rowname, with the "values" as observations along that row and the "numbers" as colnames. Sort of like this:
name 1 2 3 4 1 firstName 0.3407997 -0.7033403 -0.3795377 -0.7460474 5 secondName -0.8981073 -0.3347941 -0.5013782 -0.1745357
I've looked at
cast and a few other things, but none seem to do the job.
There's very powerful new package from genius data scientists at Win-Vector (folks that made
cdata. It implements "coordinated data" principles described in this document and also in this blog post. The idea is that regardless how you organize your data, it should be possible to identify individual data points using a system of "data coordinates". Here's a excerpt from the recent blog post by John Mount:
The whole system is based on two primitives or operators cdata::moveValuesToRowsD() and cdata::moveValuesToColumnsD(). These operators have pivot, un-pivot, one-hot encode, transpose, moving multiple rows and columns, and many other transforms as simple special cases.
It is easy to write many different operations in terms of the cdata primitives. These operators can work-in memory or at big data scale (with databases and Apache Spark; for big data use the cdata::moveValuesToRowsN() and cdata::moveValuesToColumnsN() variants). The transforms are controlled by a control table that itself is a diagram of (or picture of) the transform.
We will first build the control table (see blog post for details) and then perform the move of data from rows to columns.
library(cdata) # first build the control table pivotControlTable <- buildPivotControlTableD(table = dat1, # reference to dataset columnToTakeKeysFrom = 'numbers', # this will become column headers columnToTakeValuesFrom = 'value', # this contains data sep="_") # optional for making column names # perform the move of data to columns dat_wide <- moveValuesToColumnsD(tallTable = dat1, # reference to dataset keyColumns = c('name'), # this(these) column(s) should stay untouched controlTable = pivotControlTable# control table above ) dat_wide #> name numbers_1 numbers_2 numbers_3 numbers_4 #> 1 firstName 0.3407997 -0.7033403 -0.3795377 -0.7460474 #> 2 secondName -0.8981073 -0.3347941 -0.5013782 -0.1745357