# Grouping data and descriptive statistics

## How to set statistical aggregators for a single column

Aggregators provide summary statistics for grouped rows.

Command | Operation |
---|---|

`+` aggregator |
adds aggregator to current column |

`z+` aggregator |
displays result of aggregator over values in selected rows for current column |

The following statistical aggregators are available:

Aggregator | Description |
---|---|

`min` |
smallest value in the group |

`max` |
largest value in the group |

`avg` /`mean` |
average value of the group |

`mode` |
most frequently appearing value in group |

`median` |
median value in the group |

`q3/q4/q5/q10` |
add quantile aggregators to group (e.g. q4 adds p25, p50, p75) |

`sum` |
total summation of all numbers in the group |

`distinct` |
number of distinct values in the group |

`count` |
number of values in the group |

`keymax` |
key of the row with the largest value in the group |

`list` |
gathers values in column into a list |

`stdev` |
standard deviation of values |

The follow howtos will have examples of workflows involving grouping of data and statistical aggregation.

The following examples use the file sample.tsv.

## How to group data (frequency table, pivot table, describe table,)

## How to make a pivot table frequency table, pivot table, describe table,)

- Move to the column A with the independent variable, and press
`!`

to mark it as a key column to group by. - Move to a numeric column B, and press
`+`

to add an*aggregator*to that column, so that the aggregator will be applied to each group of values. - Make sure column B has a numeric type: int (
`#`

), float (`%`

), currency (`$`

), or date (`@`

). - Move to the column C with the dependent categorical variable to be pivoted, and press
`Shift+W`

(`pivot`

) to pivot on that column. The pivoted sheet then will have a column for each distinct value in the source column.

## How to create a frequency chart

### How to make a histogram

**Question** How many of each **Item** were sold?

- Move the cursor to the
**Item**column. - Press
`Shift+F`

to open the**Frequency table**.

### How to use the Frequency table to view the results of statistical aggregation

**Question** What was the monthly revenue?

- On the
**OrderDate**column, type`;`

followed by`(\d+-\d+)`

to create a column with only the year and the month using regex capture groups. - Press
`-`

to hide the**OrderDate**column. - On the **OrderDate_re0** column, type
`^`

followed by`OrderDate`

to rename it. - On the
**Total**column, press`$`

to set its type to currency. - Type
`+`

followed by`sum`

to add a statistical aggregator to**Total**. - On the
**OrderDate**column type`Shift+F`

to open the**Frequency table**. - On the
**OrderDate**column, press`[`

to sort the table in chrononological order. - On the **sum_Total** column, type
`^`

followed by`Revenue`

to rename the column.

### How to calculate some descriptive statistics

- Press
`Shift+I`

to open the**Describe sheet**.

### How to filter for grouped rows

- Press
`Shift+F`

to open the**Frequency table**. - Press
`s`

or`t`

on the groups you are interested in to select those entries in the source sheet. - Press
`q`

or`Ctrl+^`

to return to the source sheet. - Press
`"`

to open a duplicate sheet with selected rows.

**or**

- Press
`Shift+F`

to open the**Frequency table**. - Press
`Enter`

on the grouping you are interested in to open a sheet of the source rows that are part of that group.

### How to filter for described rows

- Press
`Shift+I`

to open the**Describe sheet**. - Use
`zs`

to select rows on source sheet which are being described in the cells of interest. - Press
`q`

or`Ctrl+^`

to return to the source sheet. - Press
`"`

to open a duplicate sheet with selected rows.

**or**

- Press
`Shift+I`

to open the**Describe sheet**. - Press
`zEnter`

open copy of source sheet with rows being described in the current cell of interest.

### How to filter for the rows aggregated in a pivot table

`Enter`

/`zEnter`

can both be used in the **Pivot table** to open a sheet of the source rows which are aggregated in the current pivot row/cell.

### How to combine cells in one for column for matching rows in another

This tutorial shows you how to go from this dataset:

field | value |
---|---|

A | 1 |

A | 2 |

B | 3 |

C | 4 |

C | 5 |

to

field | value |
---|---|

A | 1;2 |

B | 3 |

C | 4;5 |

in VisiData.

- On the
**value**column, type`+`

followed by`list`

to add to it a list aggregator. - Press
`Shift+F`

on the**field**column to open the**Frequency table**. - Type
`=`

followed by`';'.join(value_list)`

to add an expression column with the joined values.