Bulk Load Tsv File Into SQLite Database
The Tab-Separeted Values (TSV) file is a simple text format and widely supported. The data are stored in a tabular structure, each record in the table is one line of the text file. And each field value of a record is separated by a tab character.
It is easy to operate TSV files in programming languages. But if the file were quite large, for instance, millions of lines, it would be difficult to operate the file. Under this circumstances, load the large TSV file into an SQLite3 databasea would be convenient for further operations.
For example, there is a TSV file, my.tsv:
Name Address Postcode
John Califorlia 90001
Alice New York 10001
Bob Texas 75001
The SQLite3 database file to be generated is my.db.
$ sqlite3 my.db
sqlite>
Then load my.tsv as:
sqlite> .mode tabs
sqlite> .import my.tsv people
Note:
-
Here
people
is the table to be created to store the contents of file my.tsv. -
You might have noticed the there is a header line of the TSV file my.tsv. If there were no header line in source file, you have to create a table according to the contents of the file.
sqlite> CREATE TABLE people ( name TEXT, address TEXT, postcode NUMBER );
Do not try to load a large TSV file into a database right now, because the SQLite3 default options are not efficiently for bulk data operation. Here we need to invoke SQLite3 PRAGMA statements first:
sqlite> PRAGMA journal_mode=OFF;
sqlite> PRAGMA synchronous=OFF;
sqlite> PRAGMA locking_mode=EXCLUSIVE;
sqlite> PRAGMA temp_store=MEMORY;
sqlite> PRAGMA cache_size=1000000;
Ok, how about CSV (comma-separated values) file? It is similar:
sqlite> .mode csv
sqlite> .import my.csv people
Or omit the .mode command:
sqlite> .import --csv my.csv people
Btw, you can skip the initial row of input file by using the option --skip 1
.