
I am trying to import some data into PostgreSQL using the COPY command.

Here is a sample of what the data looks like:

"UNIQ_ID","SP_grd1","SACN_grd1","BIOME_grd1","Meso_grd1","DM_grd1","VEG_grd1","lcov90_alb","WMA_grd1"
"G01_00000002","199058001.00000","1.00000","6.00000","24889.00000","2.00000","381.00000","33.00000","9.00000"
"G01_00000008","*********************","1.00000","*********************","24889.00000","2.00000","*********************","34.00000","*********************"

The issue I am having is with the double quotes wrapping the ********************* strings, which represent null values.

I am using the following in order to create the data table and copy the data:

CREATE TABLE bravo.G01(UNIQ_ID character varying(18), SP_grd1 double precision ,SACN_grd1 numeric,BIOME_grd1 numeric,Meso_grd1 double precision,DM_grd1 numeric,VEG_grd1 numeric,lcov90_alb numeric,WMA_grd1 numeric);

COPY bravo.g01(UNIQ_ID,SP_grd1,SACN_grd1,BIOME_grd1,Meso_grd1,DM_grd1,VEG_grd1,lcov90_alb,WMA_grd1) FROM 'F:\GreenBook-Backup\LUdatacube_20171206\CSV_Data_bravo\G01.csv' DELIMITER ',' NULL AS '*********************' CSV HEADER ;

The CREATE TABLE command works fine, but I encounter an error with the NULL AS clause. If I edit the text file and remove the double quotes, the import works fine.

I assume that, since CSVs with double quotes and null values are very common, there must be a workaround that I am missing. I certainly don't want to edit each of my CSVs to remove the double quotes!

2 Answers

3

You might want to try adding the FORCE_NULL ( column_name [, ...] ) option.

As the documentation states for FORCE_NULL:

Match the specified columns' values against the null string, even if it has been quoted, and if a match is found set the value to NULL. In the default case where the null string is empty, this converts a quoted empty string into NULL. This option is allowed only in COPY FROM, and only when using CSV format.

This option is available from PostgreSQL 9.4 onward: https://www.postgresql.org/docs/10/static/sql-copy.html
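
For the table in the question, the statement might look roughly like the sketch below. It assumes PostgreSQL 9.4 or later and uses the parenthesized option syntax rather than the older keyword form shown in the question. Listing every numeric column under FORCE_NULL is an assumption on my part, since any of them could presumably contain the quoted asterisk string; trim the list if some columns never do.

```sql
-- Sketch only: assumes PostgreSQL 9.4+ and that the file path from the question is readable by the server.
COPY bravo.g01 (UNIQ_ID, SP_grd1, SACN_grd1, BIOME_grd1, Meso_grd1, DM_grd1, VEG_grd1, lcov90_alb, WMA_grd1)
FROM 'F:\GreenBook-Backup\LUdatacube_20171206\CSV_Data_bravo\G01.csv'
WITH (
    FORMAT csv,
    HEADER true,
    DELIMITER ',',
    -- The null marker exactly as it appears in the file, without the surrounding quotes.
    NULL '*********************',
    -- Assumed column list: compare these columns against the null string even when the value is quoted.
    FORCE_NULL (SP_grd1, SACN_grd1, BIOME_grd1, Meso_grd1, DM_grd1, VEG_grd1, lcov90_alb, WMA_grd1)
);
```

Without FORCE_NULL, a quoted value is never compared against the null string in CSV mode, so the quoted asterisks are passed to the numeric columns as ordinary text and the load fails.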


0

If you're on a Unix-like platform, you could use sed to replace the null strings with something PostgreSQL recognizes as null automatically: in CSV mode, an unquoted empty field is the default null. On Windows, PowerShell offers similar functionality.

This approach is more general if you need to perform other types of clean up on the data before loading.

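The regex pattern to match your null string is "[*][*]*" (a double quote, one or more asterisks, a double quote). Requiring at least one asterisk keeps genuinely empty quoted strings ("") untouched.

Cleaning the file with sed:

[unix]>sed 's/"[*][*]*"//g' test.csv > test2.csv

Cleaning the file with Windows PowerShell (Set-Content is used because > writes UTF-16 in Windows PowerShell 5.1, which psql will not read as plain CSV):

[windows-powershell]>cat test.csv | %{$_ -replace '"[*][*]*"', ""} | Set-Content test2.csv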

Loading into PostgreSQL can then be shorter:

psql>\copy bravo.g01 FROM 'test2.csv' WITH CSV HEADER;

1 Comment

Thank you. FORCE_NULL worked well, but your technique is going to be really useful for my other data inputs.
