Unnest a list-column of data frames into rows and columns

Unnest expands a list-column containing data frames into rows and columns.

# S3 method for class 'tidySpatialExperiment_nested'
unnest(
  data,
  cols,
  ...,
  keep_empty = FALSE,
  ptype = NULL,
  names_sep = NULL,
  names_repair = "check_unique",
  .drop,
  .id,
  .sep,
  .preserve
)

Arguments

data

A data frame.

cols

<tidy-select> List-columns to unnest.

When selecting multiple columns, values from the same row will be recycled to their common size.
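As a rough illustration of the recycling rule, the sketch below uses a plain tibble with tidyr rather than a nested tidySpatialExperiment object. The data and the column names `a`, `b`, `x`, and `y` are made up for the example, and it assumes a recent tidyr release.

```r
library(tidyr)

# Hypothetical toy data: two list-columns of data frames.
# In row "r1", `a` has 2 rows and `b` has 1 row, so `b` is recycled to size 2.
df <- tibble(
  id = c("r1", "r2"),
  a  = list(tibble(x = 1:2), tibble(x = 3L)),
  b  = list(tibble(y = "p"), tibble(y = "q"))
)

# Unnest both list-columns in parallel
out <- unnest(df, c(a, b))
out
```

Row "r1" contributes two output rows, with the single value of `y` repeated to match the two values of `x`.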

...

Deprecated: previously you could write df %>% unnest(x, y, z). Convert to df %>% unnest(c(x, y, z)). If you previously created a new variable in unnest(), you'll now need to do it explicitly with mutate(). Convert df %>% unnest(y = fun(x, y, z)) to df %>% mutate(y = fun(x, y, z)) %>% unnest(y).

keep_empty

By default, you get one row of output for each element of the list that you are unchopping/unnesting. This means that if there's a size-0 element (like NULL or an empty data frame or vector), then that entire row will be dropped from the output. If you want to preserve all rows, use keep_empty = TRUE to replace size-0 elements with a single row of missing values.
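A minimal sketch of the difference, using a plain tibble with tidyr rather than a nested tidySpatialExperiment object (the toy data and names are hypothetical):

```r
library(tidyr)

# Hypothetical toy data: row "b" holds a size-0 element (NULL)
df <- tibble(
  id   = c("a", "b"),
  data = list(tibble(x = 1:2), NULL)
)

# Default: the row with the size-0 element is dropped
dropped <- unnest(df, data)

# keep_empty = TRUE: row "b" is kept, with x filled in as NA
kept <- unnest(df, data, keep_empty = TRUE)
```

Here `dropped` has two rows (both from "a"), while `kept` has a third row for "b" with a missing `x`.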

ptype

Optionally, a named list of column name-prototype pairs to coerce cols to, overriding the default that will be guessed from combining the individual values. Alternatively, a single empty ptype can be supplied, which will be applied to all cols.

names_sep

If NULL, the default, the outer names will come from the inner names. If a string, the outer names will be formed by pasting together the outer and the inner column names, separated by names_sep.
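The effect on column names can be sketched on a plain tibble with tidyr; the data below is hypothetical and stands in for a nested object:

```r
library(tidyr)

# Hypothetical toy data: one list-column named `data`
df <- tibble(
  id   = 1:2,
  data = list(tibble(x = 1, y = "a"), tibble(x = 2, y = "b"))
)

# names_sep = NULL (default): inner names are used as-is
default_names <- names(unnest(df, data))
# "id" "x" "y"

# names_sep = "_": outer and inner names are pasted together
prefixed_names <- names(unnest(df, data, names_sep = "_"))
# "id" "data_x" "data_y"
```

Setting names_sep is one way to avoid clashes when an inner column shares a name with an existing outer column.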

names_repair

Used to check that output data frame has valid names. Must be one of the following options:

"minimal": no name repair or checks, beyond basic existence.
"unique": make sure names are unique and not empty.
"check_unique" (the default): no name repair, but check they are unique.
"universal": make the names unique and syntactic.
A function: apply custom name repair.
"tidyr_legacy": use the name repair from tidyr 0.8.
A formula: a purrr-style anonymous function (see rlang::as_function()).

See vctrs::vec_as_names() for more details on these terms and the strategies used to enforce them.
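To make the difference between "check_unique" and "unique" concrete, here is a small sketch on a plain tibble with tidyr. The data is hypothetical, and the exact repaired names are left to vctrs, so the example does not rely on them.

```r
library(tidyr)

# Hypothetical toy data: the outer column `x` clashes with the inner column `x`
df <- tibble(x = 1:2, data = list(tibble(x = "a"), tibble(x = "b")))

# Default "check_unique": the name clash is reported as an error
res <- try(unnest(df, data), silent = TRUE)
failed <- inherits(res, "try-error")

# "unique": the duplicated names are repaired instead of raising an error
repaired <- unnest(df, data, names_repair = "unique")
names(repaired)
```

With the default setting the call stops, whereas "unique" returns a two-column result whose names have been made distinct.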

.drop, .preserve

Deprecated: all list-columns are now preserved. If there are any that you don't want in the output, use select() to remove them prior to unnesting.

.id

Deprecated: convert df %>% unnest(x, .id = "id") to df %>% mutate(id = names(x)) %>% unnest(x).
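The suggested replacement for .id can be sketched on a plain tibble whose list-column is a named list. The data and the names `a`, `b`, and `v` are hypothetical, and dplyr is assumed to be available for mutate().

```r
library(tidyr)
library(dplyr)

# Hypothetical toy data: a named list-column, so names(x) is c("a", "b")
df <- tibble(x = list(a = tibble(v = 1:2), b = tibble(v = 3L)))

# Copy the element names into an explicit column, then unnest
out <- df |>
  mutate(id = names(x)) |>
  unnest(x)
```

Each output row carries the name of the list element it came from in the `id` column.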

.sep

Deprecated: use names_sep instead.

Value

A tidySpatialExperiment object.

New syntax

tidyr 1.0.0 introduced a new syntax for nest() and unnest() that's designed to be more similar to other functions. Converting to the new syntax should be straightforward (guided by the message you'll receive) but if you just need to run an old analysis, you can easily revert to the previous behaviour using nest_legacy() and unnest_legacy() as follows:

library(tidyr)
nest <- nest_legacy
unnest <- unnest_legacy

Examples

example(read10xVisium)
#> 
#> rd10xV> dir <- system.file(
#> rd10xV+   file.path("extdata", "10xVisium"), 
#> rd10xV+   package = "SpatialExperiment")
#> 
#> rd10xV> sample_ids <- c("section1", "section2")
#> 
#> rd10xV> samples <- file.path(dir, sample_ids, "outs")
#> 
#> rd10xV> list.files(samples[1])
#> [1] "raw_feature_bc_matrix" "spatial"              
#> 
#> rd10xV> list.files(file.path(samples[1], "spatial"))
#> [1] "scalefactors_json.json"    "tissue_lowres_image.png"  
#> [3] "tissue_positions_list.csv"
#> 
#> rd10xV> file.path(samples[1], "raw_feature_bc_matrix")
#> [1] "/home/runner/work/_temp/Library/SpatialExperiment/extdata/10xVisium/section1/outs/raw_feature_bc_matrix"
#> 
#> rd10xV> (spe <- read10xVisium(samples, sample_ids, 
#> rd10xV+   type = "sparse", data = "raw", 
#> rd10xV+   images = "lowres", load = FALSE))
#> Warning: 'read10xVisium' is deprecated.
#> Use 'VisiumIO::TENxVisium(List)' instead.
#> See help("Deprecated")
#> # A SpatialExperiment-tibble abstraction: 99 × 7
#> # Features = 50 | Cells = 99 | Assays = counts
#>    .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
#>    <chr>              <lgl>         <int>     <int> <chr>                  <int>
#>  1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                2312
#>  2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                8230
#>  3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                4170
#>  4 AAACACCAATAACTGC-1 TRUE             59        19 section1                2519
#>  5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                7679
#>  6 AAACAGCTTTCAGAAG-1 FALSE            43         9 section1                1831
#>  7 AAACAGGGTCTATATT-1 FALSE            47        13 section1                2106
#>  8 AAACAGTGTTCCTGGG-1 FALSE            73        43 section1                4170
#>  9 AAACATGGTGAGAGGA-1 FALSE            62         0 section1                1212
#> 10 AAACATTTCCCGGATT-1 FALSE            61        97 section1                7886
#> # ℹ 89 more rows
#> # ℹ 1 more variable: pxl_row_in_fullres <int>
#> 
#> rd10xV> # base directory 'outs/' from Space Ranger can also be omitted
#> rd10xV> samples2 <- file.path(dir, sample_ids)
#> 
#> rd10xV> (spe2 <- read10xVisium(samples2, sample_ids, 
#> rd10xV+   type = "sparse", data = "raw", 
#> rd10xV+   images = "lowres", load = FALSE))
#> Warning: 'read10xVisium' is deprecated.
#> Use 'VisiumIO::TENxVisium(List)' instead.
#> See help("Deprecated")
#> # A SpatialExperiment-tibble abstraction: 99 × 7
#> # Features = 50 | Cells = 99 | Assays = counts
#>    .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
#>    <chr>              <lgl>         <int>     <int> <chr>                  <int>
#>  1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                2312
#>  2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                8230
#>  3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                4170
#>  4 AAACACCAATAACTGC-1 TRUE             59        19 section1                2519
#>  5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                7679
#>  6 AAACAGCTTTCAGAAG-1 FALSE            43         9 section1                1831
#>  7 AAACAGGGTCTATATT-1 FALSE            47        13 section1                2106
#>  8 AAACAGTGTTCCTGGG-1 FALSE            73        43 section1                4170
#>  9 AAACATGGTGAGAGGA-1 FALSE            62         0 section1                1212
#> 10 AAACATTTCCCGGATT-1 FALSE            61        97 section1                7886
#> # ℹ 89 more rows
#> # ℹ 1 more variable: pxl_row_in_fullres <int>
#> 
#> rd10xV> # tabulate number of spots mapped to tissue
#> rd10xV> cd <- colData(spe)
#> 
#> rd10xV> table(
#> rd10xV+   in_tissue = cd$in_tissue, 
#> rd10xV+   sample_id = cd$sample_id)
#>          sample_id
#> in_tissue section1 section2
#>     FALSE       28       27
#>     TRUE        22       22
#> 
#> rd10xV> # view available images
#> rd10xV> imgData(spe)
#> DataFrame with 2 rows and 4 columns
#>     sample_id    image_id   data scaleFactor
#>   <character> <character> <list>   <numeric>
#> 1    section1      lowres   ####   0.0510334
#> 2    section2      lowres   ####   0.0510334
spe |>
    nest(data = -sample_id) |>
    unnest(data)
#> Warning: tidySingleCellExperiment says: you have duplicated cell names; they will be made unique.
#> # A SpatialExperiment-tibble abstraction: 99 × 7
#> # Features = 50 | Cells = 99 | Assays = counts
#>    .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
#>    <chr>              <lgl>         <int>     <int> <chr>                  <int>
#>  1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                2312
#>  2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                8230
#>  3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                4170
#>  4 AAACACCAATAACTGC-1 TRUE             59        19 section1                2519
#>  5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                7679
#>  6 AAACAGCTTTCAGAAG-1 FALSE            43         9 section1                1831
#>  7 AAACAGGGTCTATATT-1 FALSE            47        13 section1                2106
#>  8 AAACAGTGTTCCTGGG-1 FALSE            73        43 section1                4170
#>  9 AAACATGGTGAGAGGA-1 FALSE            62         0 section1                1212
#> 10 AAACATTTCCCGGATT-1 FALSE            61        97 section1                7886
#> # ℹ 89 more rows
#> # ℹ 1 more variable: pxl_row_in_fullres <int>
