Columns
How to manipulate columns
Commands(s) | Operation |
---|---|
! |
pins the current column on the left as a key column |
H L |
slides the current column one position to the left/right |
gH gL |
slides the current column all the way to the left/right of its section |
How to hide (remove) and unhide (return) columns
How to hide columns
- Press
-
(hyphen) to hide the current column.
or
- Press
Shift+C
on the source sheet to open its Columns sheet. - Move the cursor right to the width column.
- Move the cursor down to the row which represents the column you wish to hide.
- Press
e
followed by0
to set the width for that column to 0. - Press
q
to return to the source sheet.
How to unhide columns
- Press
gv
to unhide all columns on current sheet.
or
- Press
Shift+C
on the source sheet to open its Columns sheet. - Move the cursor right to the width column.
- Move the cursor down to the row which represents the column you wish to unhide. Currently, that cell should contain the value 0.
- Press
e
followed by a positive number to set the width. - Press
q
to return to the source sheet.
How to specify column types
Command | Type |
---|---|
~ |
string |
# |
int |
% |
float |
$ |
currency |
@ |
date |
z# |
vlen |
z~ |
anytype |
Columns usually begin as untyped (anytype
). Errors when working with numerical or datetime data is often due to values being considered as strings, and the problem is solved by setting the correct type.
The float
type uses Python's builtin float()
constructor to parse the string, and it parses by using the decimal separator.
The currency
type is a bit of a misnomer. It filters out any non-numeric characters, and then parses the remainder of the cell value as a float. The reasons to prefer using float
over currency
, is performance (it is quite a bit slower than native parsing with float
), or if any non-float characters should get reported as an error.
The date
type parses dates into ISO8601 format. Those columns can then be used in mathematical calculations, and the calculations are interpreted for dates. E.g. 2020-01-01 + 1, is 2020-01-02.
The vlen
type formats the cell value to the length of the content. For example, if the cell content is a list of length 3, then when vlen
typed it will display a value of 3.
There is also the floatlocale
type, which uses Python's locale.atof
to parse the Column values. With floatlocale
, you can set the LC_NUMERIC
environment variable appropriately (before launching VisiData), such that atof()
will parse the number based on your locale setting. There is a type-floatlocale
command, which is unbound by default, because parsing this way is significantly slower than using the builtin float type.
If you need locale-specific float parsing regularly, you may want to rebind %
or z%
(or maybe some other keystroke) to type-floatlocale
instead.
The following example uses the file sample.tsv.
How to batch specify column types for more than one column
- Press
Shift+C
to open the Columns sheet. - Press
s
ort
to select the rows referencing the columns you wish to type. - Type
g
followed by the any of the above typing keystrokes to set the type for all selected columns on the source sheet.
How to format columns
Note: Un-typed file formats, like tsvs and csvs, will save as they are displayed.
Some types have an option for their default display formatting.
Type | Option | Default |
---|---|---|
int | dispintfmt | {:.0f} |
float | dispfloatfmt | {:.02f} |
currency | dispcurrencyfmt | %0.2f |
date | dispdatefmt | %Y-%m-%d |
Ways to adjust the display formatting:
- The
fmtstr
column on the Columns Sheet allows you to specify the formatting for specific columns within that session, without affecting the default for the others. - The
disp_TYPE_fmt
option can be changed on the Options Sheet to set the formatting for all columns of typeTYPE
in that session. - The
--disp-TYPE-fmt
argument can be passed through the commandline to set the formatting for all columns of typeTYPE
in that session. - The
options.disp_TYPE_fmt
can be set in the~/.visidatarc
to change the default formatting for all columns of typeTYPE
for all sessions.
There are several formatting styles offered:
- Formatting that starts with
'%'
(e.g. %0.2f) will use locale.format_string(). - Otherwise (e.g. {:.02f}), formatting will be passed to Python's string.format().
- Date fmtstr are passed to strftime.
The default for currency uses locale.format_string()
. The default for int/float/date use string.format()
.
How to format a specific numeric columns to contain a thousands separator within a session?
- Set a column to a numeric type by pressing
#
(int),%
(float), or$
(currency). - Press
Shift+C
to open the Columns Sheet. - Move to the row referencing the column whose display you wish to format. Move the cursor to the fmtstr column.
- Type
e
followed by {:,.0f} for anint
type and {:,.02f} for a floating point type.
How to quickly adjust the precision of a float or date?
- Ensure the column is either typed as a float (
%
), floatsi, currency ($
), or date (@
). - Press
SPACE
and type the longnamessetcol-precision-more
orsetcol-precision-less
to adjust the precision in the current column.
Bind the longnames to keys, if using these commands frequently.
How to set all date columns to be month/day/year.
The default can be set in a ~/.visidatarc
.
options.disp_date_fmt = '%m/%d/%Y'
or passed through the commandline
vd --disp-date-fmt='%m/%d/%Y'
or set in the Options Sheet.
- Press
Shift+O
to open the Options Sheet. - Move the cursor down to the relevant dispdatefmt option.
- Type
e
followed by %m/%d/%Y.
How to specify a comma decimal separator when typing floating point numbers?
- Before launching VisiData, set the shell environment variable
LC_NUMERIC
to a locale which interprets commas as decimals. Any European locale should do; an example that works isen_DK.UTF-8
. - Within VisiData, set a column to type
floatlocale
by pressingSpace
followed by type-floatlocale.
Note that type-floatlocale
is significantly slower than type-float
. However, if you wish to replace the current binding for type-float
with type-floatlocale
, add to your ~/.visidatarc
:
Sheet.unbindkey('%')
Sheet.bindkey('%', 'type-floatlocale')
or if you never use type-floatsi
, you can do
Sheet.unbindkey('z%')
Sheet.bindkey('z%', 'type-floatlocale')
How to split a column
Python regular expressions provide more finetuned column splitting. The following example uses the commands for column splitting and transformation with xd/puzzles.tsv.
:
adds new columns derived from splitting the current column at positions defined by a regex pattern.options.default_sample_size
(default: 100) rows around the cursor will be used to determine the number of columns that will be created.;
adds new columns derived from pulling the contents of the current column which match the regex within capture groups. The new columns are named using the capture group index, or if named capture groups are used, the capture group names. This command uses theoptions.default_sample_size
(default:100) rows around the cursor as sample rows.*
followed by regex/
substring replaces the text which matches the capture groups in regex with the contents of substring. substring may include backreferences (\1 etc).
[How do I substitute text in my column]
The *
command can be used to do content transformations of cells. The g*
variant transforms in-place, instead of creating a new column.
The following example uses benchmarks.csv.
Question Transform the SKU values of food to nutri.
- Move cursor to SKU column.
- Press
gs
to select all rows. - Type
g*
followed by food/nutri.
- tests/transform-cols.vd
How to expand columns that contain nested data
If a column includes container data such as JSON objects or arrays, the (
family of commands can expand the child values into top-level columns:
Command | Operation |
---|---|
( |
expand current column |
g( |
expand all visible columns fully |
z( |
expand current column to a specific depth (prompt for input) |
gz( |
expand all visible columns to a specific depth (prompt for input) |
) |
contract (unexpand) the current column |
zM |
expand current column row-wise within that column |
The following demo shows (
commands applied to this data:
[
[ "short", "array" ],
[ "slightly", "longer", "array" ],
{ "nested": "data" },
{ "more": { "deeply": { "nested": "data" } } }
]
Note that by default the expansion logic will look for nested columns in up to options.default_sample_size
(Default: 100) rows surrounding the cursor. This behavior can be controlled by adjusting default_sample_size
in the Options Sheet, or setting options.default_sample_size
in the ~/.visidatarc
file.
How to create derivative columns
The =
command takes a Python expression as input and creates a new column, where each cell evaluates the expression in the context of its row.
These variables and functions are available in the scope of an expression:
- Column names evaluate to the typed value of the cell in the named column for the same row.
vd
attributes and methods; useCtrl+X vd
to view the vd object, or see the API.Sheet
attributes and methods; useg Ctrl+Y
to view the sheet object (or see the API).- Global functions and variables (add your own in your .visidatarc).
- modules that have been
import
ed in Python- if you need a module that hasn't already been imported at runtime, use
g Ctrl+X import <modname>
.
- if you need a module that hasn't already been imported at runtime, use
sheet
: the current sheet (a TableSheet object)col
: the current column (as a Column object; use for Column metadata)row
: the current row (a Python object of the internal rowtype)- curcol: evaluate to the typed value of this row in the column that the cursor was on at the time that the expression column was added.
- cursorCol: evaluate to the typed value of this row for the column the cursor is on. Changes as the cursor moves for
=
. Uses the column from the time the calculation was made forg=
,gz=
, andz=
.
Additional attributes can be added to sheets and columns.
col
deliberately returns a Column object, but any other Column object is interpreted as the value within that column for the same row. For example, both curcol
and cursorcol
return values, not the object itself.
For example, this customizes addcol-expr to set the curcol
attribute on the new ExprColumn to a snapshot of the current cursor column (at the time the expression column is added):
Sheet.addCommand('=', 'addcol-expr', 'addColumnAtCursor(ExprColumn(inputExpr("new column expr="), curcol=cursorCol))', 'create new column from Python expression, with column names as variables')
Then, an expression can use curcol
as though it referred to the value in the saved column.
Tab
autocompletion when inputting an expression will cycle through valid column names only.
The following examples use the file sample.tsv.
Question On which days have we sold more than 10 Items?
- Scroll to the Units column. Set the type of the Units column by pressing
#
(int). - Type
=
followed byUnits > 10
. A new column will be created. The cells in this column will contain the value True in rows where the number of Units are greater than 10 and False otherwise. - Move to the new derived column.
- Type
|
followed byTrue
to select all rows where there were more than 10 Units sold. - Press
"
to open a duplicate sheet with only those selected rows.
Question I have a dataset with separate columns for Year, Month and Day. How can I concatenate them into a single date column?
- Type
=
followed byYear + '-' + Month + '-' + Day
. - Set the type of the new derived column by pressing
@
(date). - Type
^
followed byDate
to rename the column to Date.
Question I have a dataset with Date column that is missing a prefix of '2020-'. How do I add it to the Date column?
When using =
, and wanting to reference the current column, we recommend using curcol
. When using g=
, gz=
, and z=
, we recommend cursorCol. =
, unlike the others, is dynamic and changes with adjustment of underlying values, which means it will change along with the movement of the cursor (tracked by cursorCol
). curcol
is a special attribute of a new ExprColumn, which remembers the cursorCol at the time of creation.
- Move the cursor to Date.
- Type
g=
followed by f"2020-{cursorCol}".
Question I have a dataset with file names. How do I create a new column with the file names lower cased?
- Move the cursor to file names column.
- Type
=
followed by curcol.casefold(). - Move to the newly created column, and rename it with
^
, followed by the desired name.
How to configure multiple columns
Properties of columns on the source sheet can be changed by using standard editing commands on its Columns sheet (accessed with Shift+C
). In particular, it facilitates the selection of multiple columns, followed by utilising one of the g
-prefixed commands to modify all of them.
For a full list of available commands, see the man page. Some example workflows follow.
The following examples use the file sample.tsv.
How to set multiple statistical aggregators
Question What is the average daily revenue from sales of each Item?
- Set the type of the Units column by pressing
#
(int). - Set the type of the Total column by pressing
%
(float). - Press
Shift+C
to open the Columns sheet. - Press
s
ort
on the rows referencing the source sheet Units column and the Total column to select them. - Type
g+
followed byavg
to add a avg statistical aggregator to the selected rows. - Press
q
to exit and return to the source sheet. - Scroll to the Item column. Press
Shift+F
to open the Frequency table.
Question What are the daily average and sum total number of Units sold for each Item?
Press
Shift+C
to open the Columns sheet.Move the cursor to the row referencing the source sheet Units column.
a. Press
s
ort
to select it.b. Set the type for the source sheet Units columns by pressing
g#
(int).c. Move the cursor to the aggregators column.
d. Type
e
to enter edit mode, followed by sum avg.Press
q
to exit and return to the source sheet.Move the cursor to the Item column. Press
Shift+F
to open the Frequency table.