My SQL statement in MySQL 5.7.44, Windows 10:

drop table if exists test;
create table test (col1 VARCHAR(64), col2 VARCHAR(80) );
LOAD DATA LOCAL INFILE "t.csv" INTO TABLE test FIELDS OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"' COLUMNS TERMINATED BY ';' LINES TERMINATED BY '\r\n' ;

The data:

E605;
E507;"A string
spanning several lines ""some Umlauts in ""üöä""  to show""
escaping""
last line"
E600;"once
again"

As hexdump:

00000000  45 36 30 35 3b 0d 0a 45  35 30 37 3b 22 41 20 73  |E605;..E507;"A s|
00000010  74 72 69 6e 67 0d 0a 73  70 61 6e 6e 69 6e 67 20  |tring..spanning |
00000020  73 65 76 65 72 61 6c 20  6c 69 6e 65 73 20 22 22  |several lines ""|
00000030  73 6f 6d 65 20 55 6d 6c  61 75 74 73 20 69 6e 20  |some Umlauts in |
00000040  22 22 c3 bc c3 b6 c3 a4  22 22 20 20 74 6f 20 73  |""......""  to s|
00000050  68 6f 77 22 22 0d 0a 65  73 63 61 70 69 6e 67 22  |how""..escaping"|
00000060  22 0d 0a 6c 61 73 74 20  6c 69 6e 65 22 0d 0a 45  |"..last line"..E|
00000070  36 30 30 3b 22 6f 6e 63  65 0d 0a 61 67 61 69 6e  |600;"once..again|
00000080  22 0d 0a                                          |"..|
00000083

The error occurs when I add the FIELDS-related keywords FIELDS OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"'. The intention is to accept VARCHAR values both without quotes and enclosed in ", but the enclosing " characters should not appear in the loaded data.

  • I don't think you can have both FIELDS and COLUMNS; see dev.mysql.com/doc/refman/8.4/en/load-data.html Commented Oct 5 at 16:39
  • @Krischu Please do not include the solution in the question (post a separate answer instead). Commented Oct 11 at 6:20

1 Answer

I tested with MySQL 9.4.0 and could not get your input file to load as shown. The problem is that the line terminator \r\n is indistinguishable from the literal \r\n that occurs inside a string, so LOAD DATA can't tell where the multi-line string ends.
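
To illustrate the ambiguity, here is a minimal Python sketch (not part of my MySQL test) that splits the bytes from the question's hexdump on \r\n without looking at the quotes. The name naive_lines is just for illustration. A splitter that only knows the line terminator sees seven fragments, although the file is meant to hold three records.

```python
# File content as shown in the question's hexdump, decoded as UTF-8.
DATA = (
    'E605;\r\n'
    'E507;"A string\r\n'
    'spanning several lines ""some Umlauts in ""üöä""  to show""\r\n'
    'escaping""\r\n'
    'last line"\r\n'
    'E600;"once\r\n'
    'again"\r\n'
)

# Split on the line terminator only, ignoring whether we are inside quotes.
# The trailing terminator yields one empty string, which is dropped here.
naive_lines = [line for line in DATA.split('\r\n') if line]

for number, line in enumerate(naive_lines, start=1):
    print(number, repr(line))

print(len(naive_lines))  # 7 fragments for what should be 3 records
```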

The solution I used was to add a literal ; character to the real end of each string, like this:

E507;^M"A string^M
spanning several lines "some Umlauts in "üöä" to show"^M
escaping^M
last line";^M
E600;"once^M
again";^M

Now I can instruct LOAD DATA INFILE to use ;\r\n as the line terminator:

LOAD DATA LOCAL INFILE "s.csv" INTO TABLE test
FIELDS 
  TERMINATED BY ';'
  OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY ';\r\n';

Note that I had to use a single FIELDS clause. You used FIELDS ... followed by COLUMNS .... FIELDS and COLUMNS are synonyms in this statement, but you can't have two such clauses.

Next, to trim the enclosing double quotes (needed because of the spurious ^M in your text), I had to read the field into a user variable and apply TRIM():

LOAD DATA LOCAL INFILE "s.csv" INTO TABLE test
FIELDS 
  TERMINATED BY ';'
  OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY ';\r\n'
(col1, @col2)
SET col2 = TRIM('"' FROM TRIM('\r' FROM @col2));

If you can eliminate the spurious ^M, it works without that trimming step.

In other words, make the input like this:

E507;"A string^M
spanning several lines "some Umlauts in "üöä" to show"^M
escaping^M
last line";^M
E600;"once^M
again";^M

And make the load statement like this:

LOAD DATA LOCAL INFILE "s.csv" INTO TABLE test
FIELDS 
  TERMINATED BY ';'
  OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY ';\r\n';

Then the data loads properly. I can query and show this output (using the vertical query output, because ^M causes cursor movement that makes it hard to view the result):

mysql> select * from test\G
*************************** 1. row ***************************
col1: E507
col2: A string
spanning several lines "some Umlauts in "üöä" to show"
escaping
last line
*************************** 2. row ***************************
col1: E600
col2: once
again
2 rows in set (0.001 sec)

If you can't edit the input file, then you will need to write your own code to parse the input file in a custom way, because LOAD DATA INFILE won't be able to do it.
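
As one possible shape for such custom code, here is a minimal sketch using Python's standard csv module, which treats \r\n inside a quoted field as data and reads a doubled "" as one literal ". It assumes the file is UTF-8 and looks like the hexdump in the question. The function name parse_records and the SAMPLE constant are made up for this example, and I have not run it against the real t.csv.

```python
import csv
import io


def parse_records(text):
    """Parse ';'-separated text whose quoted fields may span several lines.

    Returns a list of (col1, col2) tuples; a missing col2 becomes ''.
    """
    # newline="" keeps the \r\n inside quoted fields exactly as written.
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=";",
        quotechar='"',
        doublequote=True,  # "" inside a quoted field means one literal "
    )
    rows = []
    for fields in reader:
        if not fields:  # skip completely empty lines
            continue
        col1 = fields[0]
        col2 = fields[1] if len(fields) > 1 else ""
        rows.append((col1, col2))
    return rows


# File content as shown in the question's hexdump, decoded as UTF-8.
SAMPLE = (
    'E605;\r\n'
    'E507;"A string\r\n'
    'spanning several lines ""some Umlauts in ""üöä""  to show""\r\n'
    'escaping""\r\n'
    'last line"\r\n'
    'E600;"once\r\n'
    'again"\r\n'
)

rows = parse_records(SAMPLE)
for col1, col2 in rows:
    print(repr(col1), repr(col2))
```

For a real file, the text could be read with open("t.csv", newline="", encoding="utf-8").read(). The resulting tuples could then be inserted with a parameterized INSERT through whichever MySQL driver you use, for example via a DB-API cursor's executemany(); the driver choice is left open here.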

2 Comments

I must admit that the input data was wrong. Let me correct it in the original post; please take another look then.
Thanks again for the scrutiny. Actually, the input (even wrongly formatted) suggests that three rows are expected (the first one with an empty col2).