Skip to content

Connect VisiData to SQL Databases with vdsql

By Saul Pwanson

I have some big news today.

If you’ve used VisiData, you know it’s a super powerful tool for loading, parsing, and exploring tabular data.

Well, today we’re introducing vdsql, a plugin for VisiData that you can connect directly to your live databases — pretty much any database that speaks SQL!

  • BigQuery
  • ClickHouse
  • DuckDB
  • MySQL
  • PostgreSQL
  • SQLite

You use VisiData in almost exactly the same way as before — the same commands, the same keyboard shortcuts — only instead of loading files into local memory, vdsql translates your commands into SQL queries, and relays those to your database. Those queries run server-side, and the results are sent back to VisiData.

Yes, this means you can explore your live databases — leveraging VisiData’s powerful transformations and aggregations — without writing a line of SQL yourself.

Bigger Data, Faster

Since vdsql’s queries run server-side, you can work with much larger data sets — like your entire data warehouse.

There’s no longer any need to first export an extract for local analysis; just have vdsql to talk directly to your database, like so:

Output

Once connected to your database, explore tables using the normal VisiData interface. When you’re done, you can save out the resulting queries as SQL (and eventually as Substrait or Python).

Requirements

To run vdsql, you’ll first need to install Python 3.8 and pip (the Python package manager). vdsql also depends on recent versions of VisiData and Ibis, but these will be installed automatically when you install vdsql.

Installation

The cutting edge of vdsql development is currently in the VisiData repo. There does exist a PyPI package, but it is very outdated.

git clone git@github.com:saulpw/visidata.git
cd visidata/visidata/apps/vdsql
pip3 install .

Note that only the SQLite and Clickhouse backends are installed by default; to connect to any of the other backends listed above, you’ll need to install the Ibis dependencies for each one separately, using (e.g. for MySQL):

pip install ibis-framework[mysql]

Once installed, you can use vdsql instead of vd whenever you want to query a database directly, instead of loading the database tables into memory first (as vd does with flat files). Remember, vdsql to use a SQL connection; vd for in-memory processing.

Find all the details in the CHANGELOG.

Why two different tools?

Why are we introducing vdsql (vdsql) as a separate tool from VisiData (vd)? Why not integrate this new, fancy, whiz-bang SQL connectivity directly into VisiData?

Great question.

The short answer is: This is all very, very new, and we want to make sure we do it right.

So, for now, vdsql is its own tool, separate from VisiData, and you have to install vdsql separately. But, that said, vdsql is also a VisiData plugin, so, once vdsql is installed, you can use it from within VisiData core — up to you.

As we continue to develop vdsql, it may become part of the VisiData core, at some point.

Usage

Open a database like…

vdsql foo.sqlite  # or .sqlite3
vdsql mysql://...
vdsql postgres://...
vdsql foo.duckdb # or .ddb
vdsql clickhouse://explorer@play.clickhouse.com:9440/?secure=1
vdsql bigquery:///bigquery-public-data
vdsql <file_or_url>

…where file_or_url is any connection string supported by ibis.connect() or any of the filetypes and options that VisiData itself supports.

Super SQL-ly Sidebar

vdsql screenshot

vdsql uses the VisiData sidebar (introduced in v2.9) to show the SQL query for the current view.

  • To toggle the sidebar on/off, press b.
  • To choose a sidebar option, press zb.
  • To open the sidebar as its own sheet, press gb.

In this way you can compose a SQL expression using VisiData commands, open the SQL sidebar, and save the resulting query to a file (or copy it into your system clipboard buffer).

What’s next for you?

What’s next for vdsql?

First and foremost: We want to know what you want! Submit a feature request on GitHub.

That said, here’s what we think are some important next steps:

  • vdsql is powered by Ibis. While Ibis supports other backends, we haven’t tested them with vdsql yet. So we want to test and confirm support for:
    • Apache Impala
    • Dask
    • Datafusion
    • HeavyAI
    • PySpark
  • Currently, the SQL that vdsql generates via Ibis is not optimized. The next version of Ibis will generate optimized SQL.
  • We’d like to be able to write out the results of the expression as a data file (e.g. parquet or DuckDB).
  • We think it could be really useful to be able to pass in an existing, saved SQL query from the CLI, right when launching VisiData. For example, something like: vdsql <db> --sql startq.sql

Thanks

A special thanks to Phillip Cloud and Anja Kefala for their contributions to the project thus far!

Corporate Sponsors

Sponsor saulpw on Github