10dCSV

Database Management

Mick McQuaid

University of Texas at Austin

11 May 2026

Evil

  • CSV files are the bane of database specialists
  • They are second only to Excel spreadsheets in the frequency with which they are called “evil”
  • But we ingest CSV files (and Excel spreadsheets) into databases all the time
  • The main problem with CSV as a format is that it is not standard
  • There is a standard, RFC-4180, but not all programs respect it
  • Excel is the most popular example of a program that breaks the standard (in part, for vendor lock-in)

What is CSV?

  • Every row contains some values (maybe “text” or “numbers” … who knows) separated by commas
  • Depending on any aspect of it can lead to errors

Sources of error

  • Is there a header row or not?
  • How does each row end: CR, LF, or CRLF? (see Newline)
  • What if there are commas in the data?
  • Some apps, like Excel don’t even require that the separator be a comma!
  • What is the character encoding, UTF-8 (the most common) or some other encoding (again, Excel uses UTF-16)?
  • What if quote marks are used to allow commas in the data—then how are quote marks in the data handled?
  • What happens if not every row has the same number of commas?
  • What if there are line endings in the data?
  • What if the number of rows or cells is too large for the representing program (they all have different maxima)?

Why the heck would anyone use CSV?

  • Human-readable
  • Easy to explain if you ignore the edge cases
  • (Possibly) durable—there will always be programs that can more or less read it

What about SQLite?

  • SQLite can both import and export CSV
  • There are an assortment of options to satisfy the source or target program
  • The options are buried under various sqlite dot commands

Importing CSV into SQLite

  • You can import a CSV file into a table using the sqlite .import command
  • SQLite assumes the first row contains column names unless you give it the --skip N option in which case it skips N rows before finding column names
  • SQLite assumes that the file conforms to RFC-4180 except in certain cases
  • The certain cases include (1) certain uses of the sqlite .mode command or (2) the option --ascii is given
  • There are a number of options to be found at .help import
  • Examine tables after importing and, if they look strange, dive into .help import

Exporting CSV from SQLite

  • To find out the available modes of output, say .mode --list
  • If you issue the sqlite .mode csv command, the following query results will be in CSV format
  • These following query results can be sent to a file by issuing the sqlite .output [OPTIONS] [FILENAME] command
  • See the page at .help output for the OPTIONS but see .help mode for more relevant options such as --quote which enables or disables quoting of text in different modes, including CSV by saying .mode --quote csv csv
  • You can also change the row and column separators via .mode
  • See the instructor if you run into trouble—there are vastly many edge cases and corner cases (confluence of two or more edge cases)

END

References

Colophon

This slideshow was produced using quarto

Fonts are Roboto, Roboto Light, and Victor Mono Nerd Font