summarise() creates a new data frame. It returns one row for each combination of grouping variables; if there are no grouping variables, the output will have a single row summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.

summarise() and summarize() are synonyms.
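
The row-per-group behaviour described above can be sketched with plain dplyr on a small tibble. The `spots` data frame and its values are made up for illustration and only mimic the column names used in the examples on this page.

```r
library(dplyr)

# Toy data standing in for per-spot metadata (values are invented).
spots <- tibble(
  sample_id = c("s1", "s1", "s2", "s2", "s2"),
  in_tissue = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  array_row = c(0, 10, 20, 30, 40)
)

# No grouping variables: a single row summarising all observations.
overall <- spots |>
  summarise(mean_row = mean(array_row), n_spots = n())

# One grouping variable: one row per distinct sample_id, with the
# grouping column first, followed by the summary columns.
per_sample <- spots |>
  group_by(sample_id) |>
  summarise(mean_row = mean(array_row), n_spots = n())
```

Here `overall` has one row and two columns, while `per_sample` has one row for each of the two `sample_id` values.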

Value

An object usually of the same type as .data.

  • The rows come from the underlying group_keys().

  • The columns are a combination of the grouping keys and the summary expressions that you provide.

  • The grouping structure is controlled by the .groups= argument; the output may be another grouped_df, a tibble or a rowwise data frame.

  • Data frame attributes are not preserved, because summarise() fundamentally creates a new data frame.
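
As a rough illustration of the points above, the sketch below runs the same grouped summary with each `.groups=` value on an ordinary tibble. The `spots` data and the `note` attribute are invented for the example; methods for other classes may behave differently.

```r
library(dplyr)

spots <- tibble(
  sample_id = c("s1", "s1", "s2", "s2"),
  in_tissue = c(TRUE, FALSE, TRUE, TRUE),
  array_row = c(0, 10, 20, 30)
)
grouped <- group_by(spots, sample_id, in_tissue)

# "drop_last" removes the last grouping variable (in_tissue).
by_sample <- summarise(grouped, mean_row = mean(array_row), .groups = "drop_last")

# "drop" returns an ungrouped tibble.
ungrouped <- summarise(grouped, mean_row = mean(array_row), .groups = "drop")

# "keep" preserves both grouping variables.
kept <- summarise(grouped, mean_row = mean(array_row), .groups = "keep")

# "rowwise" makes every output row its own group.
by_row <- summarise(grouped, mean_row = mean(array_row), .groups = "rowwise")

# A custom attribute on the input is not carried over to the output.
attr(spots, "note") <- "toy data"
no_attr <- summarise(spots, mean_row = mean(array_row))
```

Inspecting `group_vars()` on each result shows which grouping variables survive, and `attr(no_attr, "note")` is `NULL`.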

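Useful functions

  • Center: mean(), median()

  • Spread: sd(), IQR(), mad()

  • Range: min(), max()

  • Position: first(), last(), nth()

  • Count: n(), n_distinct()

  • Logical: any(), all()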

Backend variations

The data frame backend supports creating a variable and using it in the same summary. This means that previously created summary variables can be further transformed or combined within the summary, as in mutate(). However, it also means that summary variables with the same names as previous variables overwrite them, making those variables unavailable to later summary variables.

This behaviour may not be supported in other backends. To avoid unexpected results, consider using new names for your summary variables, especially when creating multiple summaries.
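
A minimal sketch of this behaviour with the data frame backend follows. The tibble `df` and all column names are made up for illustration.

```r
library(dplyr)

df <- tibble(x = c(1, 2, 3, 4))

# A summary variable created earlier can be reused later in the same call.
reused <- df |>
  summarise(total = sum(x), half_total = total / 2)

# Reusing the name of an input column overwrites it for later expressions:
# `x` is now the single mean value, so sd(x) is taken over one value.
shadowed <- df |>
  summarise(x = mean(x), spread = sd(x))

# Distinct names keep the original column available.
safe <- df |>
  summarise(mean_x = mean(x), sd_x = sd(x))
```

In `shadowed`, `spread` is `NA` because the standard deviation of a single value is undefined, whereas `safe$sd_x` is computed from the original four values.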

Methods

This function is a generic, which means that packages can provide implementations (methods) for other classes. See the documentation of individual methods for extra arguments and differences in behaviour.

The following methods are currently available in loaded packages: dplyr (data.frame, grouped_df, rowwise_df), plotly (plotly), tidySingleCellExperiment (SingleCellExperiment).
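
To illustrate what providing a method means, the sketch below defines a `summarise()` method for a hypothetical class, `logged_df`, that does not exist in any package. It is not how the methods listed above are implemented. A real package would register its method in its NAMESPACE, while a top-level definition is enough for an interactive experiment.

```r
library(dplyr)

# Hypothetical class used only for illustration.
summarise.logged_df <- function(.data, ...) {
  message("summarise() called on a logged_df")
  # Delegate to the next method in the class vector (the data.frame method).
  NextMethod()
}

df <- structure(
  data.frame(x = c(1, 2, 3)),
  class = c("logged_df", "data.frame")
)

# Dispatches to summarise.logged_df(), which then falls through to dplyr.
result <- summarise(df, total = sum(x))
```

The call prints the message and then returns the usual one-row summary, showing that dispatch depends on the class of the first argument.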

See also

Other single table verbs: arrange(), mutate(), rename(), slice()

Examples

example(read10xVisium)
#> 
#> rd10xV> dir <- system.file(
#> rd10xV+   file.path("extdata", "10xVisium"), 
#> rd10xV+   package = "SpatialExperiment")
#> 
#> rd10xV> sample_ids <- c("section1", "section2")
#> 
#> rd10xV> samples <- file.path(dir, sample_ids, "outs")
#> 
#> rd10xV> list.files(samples[1])
#> [1] "raw_feature_bc_matrix" "spatial"              
#> 
#> rd10xV> list.files(file.path(samples[1], "spatial"))
#> [1] "scalefactors_json.json"    "tissue_lowres_image.png"  
#> [3] "tissue_positions_list.csv"
#> 
#> rd10xV> file.path(samples[1], "raw_feature_bc_matrix")
#> [1] "/__w/_temp/Library/SpatialExperiment/extdata/10xVisium/section1/outs/raw_feature_bc_matrix"
#> 
#> rd10xV> (spe <- read10xVisium(samples, sample_ids, 
#> rd10xV+   type = "sparse", data = "raw", 
#> rd10xV+   images = "lowres", load = FALSE))
#> # A SpatialExperiment-tibble abstraction: 99 × 7
#> # Features = 50 | Cells = 99 | Assays = counts
#>    .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
#>    <chr>              <lgl>         <int>     <int> <chr>                  <int>
#>  1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                2312
#>  2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                8230
#>  3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                4170
#>  4 AAACACCAATAACTGC-1 TRUE             59        19 section1                2519
#>  5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                7679
#>  6 AAACAGCTTTCAGAAG-1 FALSE            43         9 section1                1831
#>  7 AAACAGGGTCTATATT-1 FALSE            47        13 section1                2106
#>  8 AAACAGTGTTCCTGGG-1 FALSE            73        43 section1                4170
#>  9 AAACATGGTGAGAGGA-1 FALSE            62         0 section1                1212
#> 10 AAACATTTCCCGGATT-1 FALSE            61        97 section1                7886
#> # ℹ 89 more rows
#> # ℹ 1 more variable: pxl_row_in_fullres <int>
#> 
#> rd10xV> # base directory 'outs/' from Space Ranger can also be omitted
#> rd10xV> samples2 <- file.path(dir, sample_ids)
#> 
#> rd10xV> (spe2 <- read10xVisium(samples2, sample_ids, 
#> rd10xV+   type = "sparse", data = "raw", 
#> rd10xV+   images = "lowres", load = FALSE))
#> # A SpatialExperiment-tibble abstraction: 99 × 7
#> # Features = 50 | Cells = 99 | Assays = counts
#>    .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
#>    <chr>              <lgl>         <int>     <int> <chr>                  <int>
#>  1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                2312
#>  2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                8230
#>  3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                4170
#>  4 AAACACCAATAACTGC-1 TRUE             59        19 section1                2519
#>  5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                7679
#>  6 AAACAGCTTTCAGAAG-1 FALSE            43         9 section1                1831
#>  7 AAACAGGGTCTATATT-1 FALSE            47        13 section1                2106
#>  8 AAACAGTGTTCCTGGG-1 FALSE            73        43 section1                4170
#>  9 AAACATGGTGAGAGGA-1 FALSE            62         0 section1                1212
#> 10 AAACATTTCCCGGATT-1 FALSE            61        97 section1                7886
#> # ℹ 89 more rows
#> # ℹ 1 more variable: pxl_row_in_fullres <int>
#> 
#> rd10xV> # tabulate number of spots mapped to tissue
#> rd10xV> cd <- colData(spe)
#> 
#> rd10xV> table(
#> rd10xV+   in_tissue = cd$in_tissue, 
#> rd10xV+   sample_id = cd$sample_id)
#>          sample_id
#> in_tissue section1 section2
#>     FALSE       28       27
#>     TRUE        22       22
#> 
#> rd10xV> # view available images
#> rd10xV> imgData(spe)
#> DataFrame with 2 rows and 4 columns
#>     sample_id    image_id   data scaleFactor
#>   <character> <character> <list>   <numeric>
#> 1    section1      lowres   ####   0.0510334
#> 2    section2      lowres   ####   0.0510334
spe |>
    summarise(mean(array_row))
#> tidySingleCellExperiment says: A data frame is returned for independent data analysis.
#> # A tibble: 1 × 1
#>   `mean(array_row)`
#>               <dbl>
#> 1              38.1