Recipe 2 Adding or Modifying Drill ‘Formats’
2.1 Problem
You have data in a somewhat different format than is handled by the default configuration of Drill’s default/built-in storage formats
and want Drill to be able to work with it.
2.2 Solution
Create a Drill “format
”.
2.3 Discussion
The most common situations for this need is that you either have a CSV-like delimited plain text in a “flat” file that uses different delimiters or comment markers or need to map a different extension to be recognized as a plain text, delimited “flat” file.
Drill’s dfs
storage plugin (http://localhost:8047/storage/dfs) has a “formats
” section below the “workspaces
” section. You an customize a new plain text/flat file format there.
Say you have an evil flat file (esv
?) with @
as a delimeter and !
as a comment character and no column names and also gives special data-meaning to “"
” characters (so we can’t treat them as field quotes).
"esv": {
"type": "text",
"extensions": [
"esv" // to handle our custom file extension
],
"quote": "\u0000", // to handle " being data and never a field quote
"skipFirstLine": true, // to handle no column names
"delimiter": "@", // to handle the custom delimiter
"comment": "!" // to handle ! being a comment character
},
You can find the complete set of parameters in the Drill manual.