I have a problem importing tab-delimited data from a CSV file because of a double quote appearing in the data, e.g.:

→Voice"Mail→

I am importing the data using the import option in pgAdmin III. I specified tab as the delimiter and also tried the QUOTE and/or ESCAPE options. None of this worked. I know the double quote is the cause, because the import succeeded after I removed it from the file. I also know this issue has already been raised (Is it possible to turn off quote processing in the Postgres COPY command with CSV format?), but I cannot use COPY <tablename> FROM <filename> because I am importing data to a remote DB and a relative path to the file on my PC is not accepted. I want to avoid modifying the input file because it might be huge.

1 Answer

If you want to preserve the double quotes, set QUOTE to a different character (I would use one that does not occur anywhere in your data file).

Example (tested on PostgreSQL 9.6):

Create a test table:

CREATE TABLE dialogue (person TEXT, dialogue TEXT);

Create a test data file (tab-delimited) with the following sample data:

# dialogue.txt
jim	I ran into your ex. He says "hi"
rachel	did he now? well tell him i said "don't call me"

Execute the following command in psql:

\copy dialogue FROM '/path/to/dialogue.txt' WITH CSV QUOTE '$' DELIMITER E'\t';

Example output:

etl_db=# \copy dialogue from '~/Desktop/dialogue.txt' WITH CSV DELIMITER E'\t' QUOTE '$';
COPY 2
etl_db=# select * from dialogue;
 person |                     dialogue
--------+--------------------------------------------------
 jim    | I ran into your ex. He says "hi"
 rachel | did he now? well tell him i said "don't call me"
(2 rows)
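
Choosing a QUOTE character that is absent from the data can be checked without modifying the file. Below is a minimal Python sketch of such a check; the function name `unused_bytes`, the default candidate list, and the chunk size are illustrative assumptions, not part of PostgreSQL or psql. It reads the file in chunks, so a large file is never loaded into memory at once.

```python
def unused_bytes(path, candidates=b"$|~^`\x08", chunk_size=1 << 20):
    """Return the candidate bytes that never occur in the file.

    `candidates` is a hypothetical shortlist of possible QUOTE characters;
    the result keeps the order of that shortlist.
    """
    remaining = set(candidates)  # set of integer byte values still unseen
    with open(path, "rb") as f:
        while remaining:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Drop every candidate that appears in this chunk.
            remaining -= set(chunk)
    return [bytes([b]) for b in candidates if b in remaining]
```

Any byte it returns should be safe to pass as QUOTE. If every printable candidate is taken, a control character such as backspace (written as QUOTE E'\b') is a common fallback, assuming it does not occur in the data either.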

"I am importing data to a remote DB and a relative path to the file on my PC is not accepted. I want to avoid modifying the input file because it might be huge."

Use the psql command-line client for PostgreSQL. It supports the \copy meta-command, which wraps the SQL COPY command and streams records from your local machine to the server, so the file is read on the client side.

"I have tried with '|' as QUOTE because my data has lots of special characters like: %$^&*# I got this error: ERROR: character with byte sequence 0x8f in encoding "WIN1252" has no equivalent in encoding UTF8. My system locale is: Polish (Poland)"

The COPY command has an ENCODING option. You can use it to state the file's actual encoding (UTF8 or another one) so that the server does not assume the client's default.
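
The error itself hints at which encoding to try. Byte 0x8F is not assigned to any character in Windows-1252, but it is in Windows-1250, the code page commonly used with a Polish locale. That the file is actually Windows-1250 is an assumption here, not something the thread confirms. The small Python sketch below (the helper name `decode_byte` is hypothetical) checks how a single byte decodes under each code page:

```python
def decode_byte(value, encoding):
    """Return the character one byte maps to in `encoding`, or None if undefined."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None


for enc in ("cp1252", "cp1250"):
    print(enc, repr(decode_byte(0x8F, enc)))
# cp1252 None
# cp1250 'Ź'
```

If the file does turn out to be Windows-1250, adding ENCODING 'WIN1250' to the \copy options would be the corresponding thing to try.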


7 Comments

Just to be sure I understand your comment: my data is not quoted (double or single) but tab-delimited, and it can contain single or double quotes, matched or unmatched. You suggest using WITH CSV QUOTE '$', which tells Postgres to look for '$' as the quote character, and since it is not in the data, only the tab will be used to separate fields?
Yes, that's right. Did you try the above solution, and did it work or throw an error?
I have tried with '|' as QUOTE because my data has lots of special characters like: %$^&*# and I got this error: ERROR: character with byte sequence 0x8f in encoding "WIN1252" has no equivalent in encoding UTF8
Indeed it doesn't. What is your operating system locale? I couldn't find any encoding where | would be 0x8F.
My system locale is: Polish (Poland)