Added CSV Support

In our latest build, we have added beta support for importing csv delimited files.

You can declare a CSV datasource by using the CSV Keyword in the DATASOURCE command in irdbImport.

The following is an example CSV DATASOURCE command.

DATASOURCE a1=CSV ‘Data Source=C:\irdb\data\csvtest;Delimiter=,;Has Quotes=True;Skip Rows=0;Has Header=True;Comment Prefix=#;Trim Spaces=False;Ignore Empty Lines=True;encoding=default’

By default the encoding is the system default text encoding. Other supported options for the encoding option are utf8,ascii,unicode,bigendianunicode .

To specify a delimiter use the delimiter character e.g. , for a csv file. For a tab delimiter use tab .

By default the system will analyse the first 25,000 and use that to guess a data type for each column.

If you want explicit control over the datatypes that are imported, you need to create a table using the CREATE TABLE command, and then do an uncolumnized import. Columnize the table when done.

Internally we are using the Lumenworks CSV Parsing library by Sebastian Loriens and extended by Paul Hatcher. This has a MIT license. We have wrapped it with a custom DataReader.

Here is an example csv import script in full.

DATASOURCE a1=CSV ‘Data Source=C:\irdb\data\csvtest;Delimiter=,;Has Quotes=True;Skip Rows=0;Has Header=True;Comment Prefix=#;Trim Spaces=False;Ignore Empty Lines=True;encoding=default’
/* Simple CSV Imports, where it autodetects column type */
import [orders] = a1.[orders.txt]
import [customers]= a1.[customers.txt]
import [order details]= a1.[order details.txt]
import [categories]= a1.[categories.txt]
import [products]= a1.[products.txt]
import [Shippers]= a1.[Shippers.txt]
import [Suppliers]= a1.[Suppliers.txt]
import [custom_dates]= a1.[custom_dates.txt]
// This format of the Import statement will override the autodetected datatypes in the next step, with the previously defined types
create table [categories2]
( CategoryID text, categoryName text, description text, picture text )
import [categories2]= uncolumnized a1.[categories.txt]
// This columnizes an uncolumnized table
columnize categories2
/* this example uses Decimal, int, long, date & Strings. Other types are SMALLINT,TINYINT */
create table [orders2] (  OrderID DECIMAL ,  CustomerID STRING, EmployeeID LONG, OrderDate DATE,  RequiredDate DATE,  ShippedDate DATE,  ShipVia INT, Freight STRING, ShipName STRING, ShipAddress STRING, ShipCity STRING,  ShipRegion STRING,  ShipPostalCode STRING, ShipCountry STRING)
import [orders2] = uncolumnized a1.[orders.txt]
columnize orders2
SAVE

 

Discover more from InMemory.Net

Subscribe now to keep reading and get access to the full archive.

Continue reading