|
7 | 7 |
|
8 | 8 | Convert CSV files into a SQLite database. Browse and publish that SQLite database with [Datasette](https://github.com/simonw/datasette). |
9 | 9 |
|
10 | | -Basic usage: |
11 | | - |
12 | | - csvs-to-sqlite myfile.csv mydatabase.db |
| 10 | +> [!NOTE] |
| 11 | +> This tool is **infrequently maintained**. For most cases, I suggest [using sqlite-utils](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-csv-or-tsv-data) to import CSV and TSV data into SQLite instead. |
13 | 12 |
|
| 13 | +Basic usage: |
| 14 | +```bash |
| 15 | +csvs-to-sqlite myfile.csv mydatabase.db |
| 16 | +``` |
14 | 17 | This will create a new SQLite database called `mydatabase.db` with a |
15 | 18 | single table, `myfile`, containing the CSV content. |
16 | 19 |
|
17 | 20 | You can provide multiple CSV files: |
18 | | - |
19 | | - csvs-to-sqlite one.csv two.csv bundle.db |
20 | | - |
| 21 | +```bash |
| 22 | +csvs-to-sqlite one.csv two.csv bundle.db |
| 23 | +``` |
21 | 24 | The `bundle.db` database will contain two tables, `one` and `two`. |
22 | 25 |
|
23 | 26 | This means you can use wildcards: |
24 | | - |
25 | | - csvs-to-sqlite ~/Downloads/*.csv my-downloads.db |
26 | | - |
| 27 | +```bash |
| 28 | +csvs-to-sqlite ~/Downloads/*.csv my-downloads.db |
| 29 | +``` |
27 | 30 | If you pass a path to one or more directories, the script will recursively |
28 | 31 | search those directories for CSV files and create a table for each one. |
29 | | - |
30 | | - csvs-to-sqlite ~/path/to/directory all-my-csvs.db |
31 | | - |
| 32 | +```bash |
| 33 | +csvs-to-sqlite ~/path/to/directory all-my-csvs.db |
| 34 | +``` |
32 | 35 | ## Handling TSV (tab-separated values) |
33 | 36 |
|
34 | 37 | You can use the `-s` option to specify a different delimiter. If you want |
35 | 38 | to use a tab character you'll need to apply shell escaping like so: |
36 | | - |
37 | | - csvs-to-sqlite my-file.tsv my-file.db -s $'\t' |
38 | | - |
| 39 | +```bash |
| 40 | +csvs-to-sqlite my-file.tsv my-file.db -s $'\t' |
| 41 | +``` |
39 | 42 | ## Refactoring columns into separate lookup tables |
40 | 43 |
|
41 | 44 | Let's say you have a CSV file that looks like this: |
42 | | - |
43 | | - county,precinct,office,district,party,candidate,votes |
44 | | - Clark,1,President,,REP,John R. Kasich,5 |
45 | | - Clark,2,President,,REP,John R. Kasich,0 |
46 | | - Clark,3,President,,REP,John R. Kasich,7 |
47 | | - |
| 45 | +```csv |
| 46 | +county,precinct,office,district,party,candidate,votes |
| 47 | +Clark,1,President,,REP,John R. Kasich,5 |
| 48 | +Clark,2,President,,REP,John R. Kasich,0 |
| 49 | +Clark,3,President,,REP,John R. Kasich,7 |
| 50 | +``` |
48 | 51 | ([Real example taken from the Open Elections project](https://github.com/openelections/openelections-data-sd/blob/master/2016/20160607__sd__primary__clark__precinct.csv)) |
49 | 52 |
|
50 | 53 | You can now convert selected columns into separate lookup tables using the new |
51 | 54 | `--extract-column` option (shortname: `-c`) - for example: |
52 | | - |
53 | | - csvs-to-sqlite openelections-data-*/*.csv \ |
54 | | - -c county:County:name \ |
55 | | - -c precinct:Precinct:name \ |
56 | | - -c office -c district -c party -c candidate \ |
57 | | - openelections.db |
58 | | - |
| 55 | +```bash |
| 56 | +csvs-to-sqlite openelections-data-*/*.csv \ |
| 57 | + -c county:County:name \ |
| 58 | + -c precinct:Precinct:name \ |
| 59 | + -c office -c district -c party -c candidate \ |
| 60 | + openelections.db |
| 61 | +``` |
59 | 62 | The format is as follows: |
60 | | - |
61 | | - column_name:optional_table_name:optional_table_value_column_name |
62 | | - |
| 63 | +``` |
| 64 | +column_name:optional_table_name:optional_table_value_column_name |
| 65 | +``` |
63 | 66 | If you just specify the column name e.g. `-c office`, the following table will |
64 | 67 | be created: |
65 | | - |
66 | | - CREATE TABLE "office" ( |
67 | | - "id" INTEGER PRIMARY KEY, |
68 | | - "value" TEXT |
69 | | - ); |
70 | | - |
| 68 | +```sql |
| 69 | +CREATE TABLE "office" ( |
| 70 | + "id" INTEGER PRIMARY KEY, |
| 71 | + "value" TEXT |
| 72 | +); |
| 73 | +``` |
71 | 74 | If you specify all three options, e.g. `-c precinct:Precinct:name`, the table |
72 | 75 | will look like this: |
73 | | - |
74 | | - CREATE TABLE "Precinct" ( |
75 | | - "id" INTEGER PRIMARY KEY, |
76 | | - "name" TEXT |
77 | | - ); |
78 | | - |
| 76 | +```sql |
| 77 | +CREATE TABLE "Precinct" ( |
| 78 | + "id" INTEGER PRIMARY KEY, |
| 79 | + "name" TEXT |
| 80 | +); |
| 81 | +``` |
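
The `column_name:optional_table_name:optional_table_value_column_name` format and its defaults can be sketched in Python. This is only an illustration of the rules described above, not the tool's actual parsing code: `ExtractSpec`, `parse_extract_spec` and `lookup_table_sql` are hypothetical names, and the behaviour for a two-part spec such as `county:County` (value column falls back to `value`) is an assumption.

```python
from typing import NamedTuple


class ExtractSpec(NamedTuple):
    column: str        # CSV column to extract
    table: str         # name of the lookup table
    value_column: str  # column in the lookup table that holds the text


def parse_extract_spec(spec: str) -> ExtractSpec:
    """Split 'column:table:value_column', filling in the defaults."""
    parts = spec.split(":")
    if not 1 <= len(parts) <= 3 or not parts[0]:
        raise ValueError(f"invalid extract spec: {spec!r}")
    column = parts[0]
    # Default table name is the column name, as with `-c office`.
    table = parts[1] if len(parts) > 1 and parts[1] else column
    # Default value column is "value" (assumed for two-part specs too).
    value_column = parts[2] if len(parts) > 2 and parts[2] else "value"
    return ExtractSpec(column, table, value_column)


def lookup_table_sql(spec: ExtractSpec) -> str:
    """Build a CREATE TABLE statement shaped like the examples above."""
    return (
        f'CREATE TABLE "{spec.table}" (\n'
        f'    "id" INTEGER PRIMARY KEY,\n'
        f'    "{spec.value_column}" TEXT\n'
        ");"
    )


print(lookup_table_sql(parse_extract_spec("office")))
print(lookup_table_sql(parse_extract_spec("precinct:Precinct:name")))
```
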
79 | 82 | The original tables will be created like this: |
80 | | - |
81 | | - CREATE TABLE "ca__primary__san_francisco__precinct" ( |
82 | | - "county" INTEGER, |
83 | | - "precinct" INTEGER, |
84 | | - "office" INTEGER, |
85 | | - "district" INTEGER, |
86 | | - "party" INTEGER, |
87 | | - "candidate" INTEGER, |
88 | | - "votes" INTEGER, |
89 | | - FOREIGN KEY (county) REFERENCES County(id), |
90 | | - FOREIGN KEY (party) REFERENCES party(id), |
91 | | - FOREIGN KEY (precinct) REFERENCES Precinct(id), |
92 | | - FOREIGN KEY (office) REFERENCES office(id), |
93 | | - FOREIGN KEY (candidate) REFERENCES candidate(id) |
94 | | - ); |
95 | | - |
| 83 | +```sql |
| 84 | +CREATE TABLE "ca__primary__san_francisco__precinct" ( |
| 85 | + "county" INTEGER, |
| 86 | + "precinct" INTEGER, |
| 87 | + "office" INTEGER, |
| 88 | + "district" INTEGER, |
| 89 | + "party" INTEGER, |
| 90 | + "candidate" INTEGER, |
| 91 | + "votes" INTEGER, |
| 92 | + FOREIGN KEY (county) REFERENCES County(id), |
| 93 | + FOREIGN KEY (party) REFERENCES party(id), |
| 94 | + FOREIGN KEY (precinct) REFERENCES Precinct(id), |
| 95 | + FOREIGN KEY (office) REFERENCES office(id), |
| 96 | + FOREIGN KEY (candidate) REFERENCES candidate(id) |
| 97 | +); |
| 98 | +``` |
96 | 99 | They will be populated with IDs that reference the new derived tables. |
97 | 100 |
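
To make the ID-based layout concrete, here is a minimal sketch using Python's standard `sqlite3` module. It imitates the idea of extracting a `candidate` column into a lookup table and then joining it back. It does not call `csvs-to-sqlite` itself: the `results` table name, the reduced column set and the `lookup_id` helper are assumptions made for illustration.

```python
import sqlite3

# A few rows modelled on the CSV example above (reduced column set).
rows = [
    ("Clark", 1, "John R. Kasich", 5),
    ("Clark", 2, "John R. Kasich", 0),
    ("Clark", 3, "John R. Kasich", 7),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "candidate" ("id" INTEGER PRIMARY KEY, "value" TEXT);
CREATE TABLE "results" (
    "county" TEXT,
    "precinct" INTEGER,
    "candidate" INTEGER,
    "votes" INTEGER,
    FOREIGN KEY (candidate) REFERENCES candidate(id)
);
""")


def lookup_id(conn, value):
    """Return the id for value in the candidate table, inserting it if new."""
    found = conn.execute(
        "SELECT id FROM candidate WHERE value = ?", (value,)
    ).fetchone()
    if found:
        return found[0]
    cursor = conn.execute("INSERT INTO candidate (value) VALUES (?)", (value,))
    return cursor.lastrowid


for county, precinct, candidate, votes in rows:
    conn.execute(
        "INSERT INTO results VALUES (?, ?, ?, ?)",
        (county, precinct, lookup_id(conn, candidate), votes),
    )

# Join the integer IDs back to the readable candidate names.
joined = conn.execute("""
    SELECT results.precinct, candidate.value, results.votes
    FROM results
    JOIN candidate ON results.candidate = candidate.id
    ORDER BY results.precinct
""").fetchall()
print(joined)
```

The repeated candidate name is stored once in `candidate`, and each row in `results` only carries its integer ID.
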
|
98 | 101 | ## Installation |
99 | 102 |
|
100 | | - $ pip install csvs-to-sqlite |
| 103 | +```bash |
| 104 | +pip install csvs-to-sqlite |
| 105 | +``` |
101 | 106 |
|
102 | 107 | `csvs-to-sqlite` now requires Python 3. If you are running Python 2 you can install the last version to support Python 2: |
103 | | - |
104 | | - $ pip install csvs-to-sqlite==0.9.2 |
| 108 | +```bash |
| 109 | +pip install csvs-to-sqlite==0.9.2 |
| 110 | +``` |
105 | 111 |
|
106 | 112 | ## csvs-to-sqlite --help |
107 | 113 |
|
|