CSV.jl Documentation
High-level interface
CSV.read
— Function.
CSV.read(fullpath::Union{AbstractString,IO}, sink=DataFrame, args...; kwargs...) => typeof(sink)
Parses a delimited file into a Julia structure (a DataFrame by default, but any Data.Sink may be given).
Positional arguments:
fullpath; can be a file name (string) or other IO instance
sink; a DataFrame by default, but may also be other Data.Sink types that support streaming via the Data.Field interface
Keyword Arguments:
delim::Union{Char,UInt8}; how fields in the file are delimited
quotechar::Union{Char,UInt8}; the character that indicates a quoted field that may contain the delim or newlines
escapechar::Union{Char,UInt8}; the character that escapes a quotechar in a quoted field
null::String; an ascii string that indicates how NULL values are represented in the dataset
header; column names can be provided manually as a complete Vector{String}, or as an Int/Range which indicates the row/rows that contain the column names
datarow::Int; specifies the row on which the actual data starts in the file; by default, the data is expected on the next row after the header row(s)
types; column types can be provided manually as a complete Vector{DataType}, or in a Dict to reference a column by name or number
nullable::Bool; indicates whether values can be nullable or not; true by default. If set to false and missing values are encountered, a NullException will be thrown
dateformat::Union{AbstractString,Dates.DateFormat}; how all dates/datetimes are represented in the dataset
footerskip::Int; indicates the number of rows to skip at the end of the file
rows_for_type_detect::Int=100; indicates how many rows should be read to infer the types of columns
rows::Int; indicates the total number of rows to read from the file; by default the file is pre-parsed to count the # of rows
use_mmap::Bool=true; whether the underlying file will be mmapped or not while parsing
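As a rough illustration of how several of the keywords listed above combine, the following sketch reads a small semicolon-delimited file. It assumes the version of CSV.jl documented on this page; the file name, column names, and sample values are hypothetical and exist only to make the example self-contained.

```julia
using CSV

# Hypothetical sample file, written here only so the example stands alone.
path = joinpath(mktempdir(), "bids_sample.csv")
open(path, "w") do io
    write(io, "bid_id;auction;price\n")
    write(io, "0;ewmzr;12.5\n")
    write(io, "1;aeqok;NA\n")
    write(io, "2;wa00e;7.0\n")
end

# Only keywords from the list above are used.
dt = CSV.read(path;
              delim=';',                    # fields are separated by semicolons
              null="NA",                    # "NA" marks a NULL value
              header=1,                     # column names are on row 1
              types=Dict("bid_id" => Int))  # fix one column's type by name
```

The remaining column types are left to type detection, which by default samples up to rows_for_type_detect rows.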
Note: by default, "string" or text columns are parsed as the WeakRefString type, a custom type that stores only a pointer to the underlying byte data plus the number of bytes. To convert a WeakRefString to a standard Julia string type, call string(::WeakRefString); this also works on an entire column via string(::NullableVector{WeakRefString}). Often, however, it is convenient to keep working with WeakRefStrings, depending on the ultimate use, such as transferring the data directly to another system while avoiding the intermediate byte copying.
Example usage:
julia> dt = CSV.read("bids.csv")
7656334×9 DataFrames.DataFrame
│ Row     │ bid_id  │ bidder_id                               │ auction │ merchandise      │ device      │
├─────────┼─────────┼─────────────────────────────────────────┼─────────┼──────────────────┼─────────────┤
│ 1       │ 0       │ "8dac2b259fd1c6d1120e519fb1ac14fbqvax8" │ "ewmzr" │ "jewelry"        │ "phone0"    │
│ 2       │ 1       │ "668d393e858e8126275433046bbd35c6tywop" │ "aeqok" │ "furniture"      │ "phone1"    │
│ 3       │ 2       │ "aa5f360084278b35d746fa6af3a7a1a5ra3xe" │ "wa00e" │ "home goods"     │ "phone2"    │
...
CSV.write
— Function.
Writes a source::Data.Source out to a CSV.Sink.
io::Union{String,IO}; a filename (String) or IO type to write the source to
source; a Data.Source type
delim::Union{Char,UInt8}; how fields in the file will be delimited
quotechar::Union{Char,UInt8}; the character that indicates a quoted field that may contain the delim or newlines
escapechar::Union{Char,UInt8}; the character that escapes a quotechar in a quoted field
null::String; the ascii string that indicates how NULL values will be represented in the dataset
dateformat; how dates/datetimes will be represented in the dataset
quotefields::Bool; whether all fields should be quoted or not
header::Bool; whether to write out the column names from source
append::Bool; start writing data at the end of io; by default, io will be reset to its beginning before writing
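A minimal sketch of a write call follows. This page does not show the CSV.write signature, so the positional order (io first, then source) is an assumption taken from the order of the argument list above, as is the use of a DataFrame as the Data.Source. The output path and the sample data are hypothetical.

```julia
using CSV, DataFrames

# Hypothetical data; a DataFrame is assumed to be usable as a Data.Source.
df = DataFrame(id = [1, 2, 3],
               item = ["jewelry", "furniture", "home goods"])

out = joinpath(mktempdir(), "items.tsv")

# First call: write a tab-delimited file, including the column names.
CSV.write(out, df; delim='\t', header=true)

# Second call: append the same rows to the end of the file, without
# repeating the column names.
CSV.write(out, df; delim='\t', header=false, append=true)
```

Without append=true, the second call would start again from the beginning of the file, as described for the append keyword.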
Lower-level utilities
CSV.Source
CSV.Sink
CSV.Options
CSV.parsefield
CSV.readline(::CSV.Source)
CSV.readsplitline
CSV.countlines(::CSV.Source)