4.2. Create a SQLite database from CSV/Excel/JSON/SQLite/etc. files¶
sqlitebiter file is a subcommand to convert tabular data file(s) to a SQLite database file.
4.2.1. Examples¶
You can use wildcards to convert multiple files into a database file. glob?
File formats are automatically detected from the extensions.
- Example:
$ ls sample_data.csv sample_data.xlsx sample_data_multi.json sample_data_single.json $ sqlitebiter -o sample.sqlite file * [INFO] sqlitebiter file: convert 'sample_data.csv' to 'sample_data' table [INFO] sqlitebiter file: convert 'sample_data_multi.json' to 'table_a' table [INFO] sqlitebiter file: convert 'sample_data_multi.json' to 'table_b' table [INFO] sqlitebiter file: convert 'sample_data_single.json' to 'sample_data_single' table [INFO] sqlitebiter file: convert 'sample_data.xlsx' to 'samplesheet1' table [INFO] sqlitebiter file: convert 'sample_data.xlsx' to 'samplesheet3' table [INFO] sqlitebiter file: converted results: source=4, success=6, created-table=6 [INFO] sqlitebiter file: database path: sample.sqlite
- Output:
$ sqlite3 sample.sqlite .schema CREATE TABLE IF NOT EXISTS '_source_info_' ("source_id" INTEGER NOT NULL, "dir_name" TEXT, "base_name" TEXT NOT NULL, "format_name" TEXT NOT NULL, "dst_table" TEXT NOT NULL, size INTEGER, mtime INTEGER); CREATE TABLE IF NOT EXISTS 'sample_data' ("attr_a" INTEGER, "attr_b" REAL, "attr_c" TEXT); CREATE TABLE IF NOT EXISTS 'table_a' ("attr_a" INTEGER, "attr_b" REAL, "attr_c" TEXT); CREATE TABLE IF NOT EXISTS 'table_b' (a INTEGER, b REAL); CREATE TABLE IF NOT EXISTS 'sample_data_single' ("attr_a" INTEGER, "attr_b" REAL, "attr_c" TEXT); CREATE TABLE IF NOT EXISTS 'samplesheet1' (a INTEGER, b REAL, c TEXT); CREATE TABLE IF NOT EXISTS 'samplesheet3' (aa INTEGER, ab TEXT, ac TEXT);
Designate multiple file paths to convert:
- Example:
$ sqlitebiter file sample_data.csv sample_data.xlsx [INFO] sqlitebiter file: convert 'sample_data.csv' to 'sample_data' table [INFO] sqlitebiter file: convert 'sample_data.xlsx' to 'samplesheet1' table [INFO] sqlitebiter file: convert 'sample_data.xlsx' to 'samplesheet3' table [INFO] sqlitebiter file: converted results: source=2, success=3, created-table=3 [INFO] sqlitebiter file: database path: out.sqlite
Note
See also: Limitations of JSON conversion
Wildcard characters cannot use in Windows environments
You could explicitly specify converting file format with the --format option
- Example:
$ sqlitebiter file --format csv sample_data [INFO] sqlitebiter file: convert 'sample_data.csv' to 'sample_data' table [INFO] sqlitebiter file: converted results: source=1, success=1, created-table=1 [INFO] sqlitebiter file: database path: out.sqlite
4.2.2. sqlitebiter file subcommand help¶
Usage: sqlitebiter file [OPTIONS] [FILES]...
Convert tabular data within CSV/Excel/HTML/JSON/Jupyter
Notebook/LDJSON/LTSV/Markdown/Mediawiki/SQLite/SSV/TSV file(s) or named
pipes to a SQLite database file.
Options:
-r, --recursive Read all files under each directory,
recursively.
--pattern PATTERN Convert files matching PATTERN.
--exclude PATTERN Exclude files matching PATTERN.
--follow-symlinks Follow symlinks.
-f, --format [csv|excel|html|json|json_lines|jsonl|ldjson|ltsv|markdown|mediawiki|ndjson|sqlite|ssv|tsv|ipynb]
Data format to loading (auto-detect from
file extensions in default).
--encoding ENCODING Encoding to load files. Auto-detection from
files in default.
-h, --help Show this message and exit.
Documentation: https://sqlitebiter.rtfd.io/ Issue tracker:
https://github.com/thombashi/sqlitebiter/issues
4.2.3. Supported data formats¶
The following table shows the supported data formats:
Format |
File Extension |
Remarks |
|---|---|---|
CSV |
|
|
Excel |
|
Create a table for each sheet in the Excel workbook. |
HTML |
|
Scrape tabular data from
<table> tags in the HTML file.And create table for each
<table> tag data. |
JSON |
|
|
Jupyter Notebook |
|
|
Line-delimited JSON |
|
|
LTSV |
|
|
Markdown |
|
Extract tabular data in the Markdown file.
And create a table for each
<table> tabular data. |
SQLite |
|
|
TSV |
|
4.2.4. Table naming convention¶
Table name automatically decided as follows for each format:
Format
Table Name
CSV
<filename>Excel
<Sheet name>HTML
<title>_<key>.<title>replaced with thetitletag of the page.<key>replaced with: (1)idattribute of the table tag.(2) unique string ifidattribute not present in the table tag.JSON
LTSV
<filename>Markdown
<filename>TSV
<filename>
- Common behavior
<filename>replaced with the filename of converting file (without extensions)e.g. If the input file name is
sample.csv,<filename>issample
If a created table name already exists in the database, the behavior differs depending on
the existing table (after this referred to as A) and create table (after this referred to as B) structure:
AandBhas the same table name and table structureAppend creating table data to the existing table data
AandBhas the same table name, but the different table structuresqlitebitertry to create unique table name forBby appending suffix id number