Simplest import of pipe-delimited text w/header to PostgreSQL

Writer Andrew Henderson

I am trying to create a database with several tables. One of the tables is intended to hold about 1.3 million rows of data from the U.S. Economic Census. The data is in a pipe-delimited text file. This is nearly my first effort to use PostgreSQL.

I was hoping to use code similar to that below to simply import everything as text. (I have tentatively decided to treat everything as character, because the values that should be numeric all include alphabetic codes for missing values and such.) However, the COPY documentation says that the HEADER option is available only for import of CSV files.

My ultimate aim is to use PostgreSQL to create stripped-down versions of this data for analysis in R. But R can choke on big files, so I was hoping to do all my pre-processing in PostgreSQL, rather than requiring some third tool. I am looking for the way of doing this that requires the least prior knowledge about and analysis of the file I am importing.

Is there another way to do this using PostgreSQL, or do I need to strip the first row off using some other tool?

If I cannot use HEADER, I assume that I need to provide column names in the CREATE TABLE command. Is this correct?

Also, in such cases, does PostgreSQL apply a default data type, or attempt to determine the data type for each column, or what? Alternatively, can I set a default data type?

I am running PostgreSQL 9.3.4 under Windows 7 64-bit with SP1.

CREATE DATABASE employ;
CREATE TABLE employ.ec0700a1;
COPY EC0700A1 FROM 'C:\\Users\\andrewH\\Documents\\OaklandTechEmploymentProject\\Economic Census 2007\\EC07_6-dig_AllGeo\\EC0700A1.dat' WITH DELIMITER '|', HEADER TRUE;

1 Answer

It sounds like CSV should work. TEXT and CSV formats are actually very similar. The difference is mainly in how quotes and escapes are interpreted and how nulls are handled. See the docs for a more exact description.

I'd just try:

COPY EC0700A1 FROM 'C:\\Users\\andrewH\\Documents\\OaklandTechEmploymentProject\\Economic Census 2007\\EC07_6-dig_AllGeo\\EC0700A1.dat' WITH (FORMAT CSV, DELIMITER '|', HEADER TRUE);

and see if it works. If other issues come up, check the COPY documentation page for solutions or, of course, ask another question.
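One issue that can come up is a stray double quote in the data, which CSV format treats as the start of a quoted field. A workaround that may help is to set the quote character to something that should never occur in the file. This is only a sketch: it assumes the file contains no backspace characters and that the ec0700a1 table already exists with one column per field in the file.

```sql
-- Assumption: the data never contains a backspace character (E'\b'),
-- so no field is ever treated as quoted and literal " characters load as plain data.
COPY ec0700a1
FROM 'C:\\Users\\andrewH\\Documents\\OaklandTechEmploymentProject\\Economic Census 2007\\EC07_6-dig_AllGeo\\EC0700A1.dat'
WITH (FORMAT csv, DELIMITER '|', HEADER true, QUOTE E'\b');
```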

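And yes, you do need to supply column names and data types in the CREATE TABLE command; again, see the docs for that. PostgreSQL does not guess column types or apply a default type, and on import the HEADER option only skips the first line rather than reading column names from it. You need to create your table before trying to import data into it.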
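As a rough sketch of the whole sequence, something like the following might work. The column names (geo_id, naics2007, estab, emp) and the ec0700a1_small table are hypothetical placeholders, not taken from the actual file; the real column list has to match the header line of the .dat file in number and order. Note also that in PostgreSQL a prefix such as employ.ec0700a1 names a schema, not a database, so the sketch assumes you are already connected to the employ database.

```sql
-- Hypothetical column names: replace them with the names from the file's header line.
-- Every column is text so that alphabetic missing-value codes load without errors.
CREATE TABLE ec0700a1 (
    geo_id    text,
    naics2007 text,
    estab     text,
    emp       text
);

-- HEADER true makes COPY skip the first line of the file.
COPY ec0700a1
FROM 'C:\\Users\\andrewH\\Documents\\OaklandTechEmploymentProject\\Economic Census 2007\\EC07_6-dig_AllGeo\\EC0700A1.dat'
WITH (FORMAT csv, DELIMITER '|', HEADER true);

-- A stripped-down table for export to R: values made up only of digits
-- are cast to bigint, and anything else (such as a letter code) becomes NULL.
CREATE TABLE ec0700a1_small AS
SELECT geo_id,
       naics2007,
       CASE WHEN emp ~ '^[0-9]+$' THEN emp::bigint END AS emp
FROM ec0700a1;
```

Loading everything as text first and casting afterwards keeps the import step simple, at the cost of a second pass over the data to pick out the columns and types you actually need.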
