I am new to regular expressions in Java. I have a CSV file in which some of the fields contain newline characters, like below:
name,address,phone
tom,123 baker st,1234
jim,"234 baker st
some city",5678
james,"897 lowell st
some city, some state",78910
If a particular value contains commas or newlines, the whole value is enclosed in double quotes (" "). I need to remove the newline characters inside those fields (replacing each with a single space), and I think a regex would be the easiest way to do it.
Hoping it would make things easier, I have read the whole file into a String using the line below:
String str = new String(Files.readAllBytes(Paths.get("file path")),"UTF-8");
Now I have the whole file in str. All the fields are separated by commas, so any newline characters between ," and ", in str should be removed (replaced with " "). I am guessing I should write a regex to match this pattern and then replace the newlines ('\n') with " ".
My knowledge ends there and I have no clue how to implement it in my code.
After the transformation, the data should look like this:
name,address,phone
tom,123 baker st,1234
jim,"234 baker st some city",5678
james,"897 lowell st some city, some state",78910
Any help would be appreciated! Thank you.
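For reference, here is a minimal sketch of the regex idea described above: find each double-quoted run and replace the line breaks inside it with a single space. The class name CsvNewlineCleaner and the method name flattenQuotedNewlines are hypothetical, chosen only for illustration. The sketch assumes that every quote in the file is balanced. It is not a full CSV parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
class CsvNewlineCleaner {
    // One double-quoted run: opening quote, any non-quote characters
    // (line breaks included), closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    // Returns the CSV text with line breaks inside quoted runs replaced by a space.
    // Assumption: quotes are balanced; line breaks outside quotes are left alone.
    static String flattenQuotedNewlines(String csv) {
        Matcher m = QUOTED.matcher(csv);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Treat \r\n, \r, and \n each as one line break.
            String cleaned = m.group().replaceAll("\\r\\n|\\r|\\n", " ");
            // quoteReplacement keeps any '$' or '\' in the data literal.
            m.appendReplacement(out, Matcher.quoteReplacement(cleaned));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        String str = "name,address,phone\n"
                + "tom,123 baker st,1234\n"
                + "jim,\"234 baker st\nsome city\",5678\n"
                + "james,\"897 lowell st\nsome city, some state\",78910\n";
        System.out.print(flattenQuotedNewlines(str));
    }
}
```

With the sample data from the question, each record ends up on one line, as in the desired output above. A file with an unbalanced quote would throw the matching off, so a dedicated CSV parser is likely the more robust choice for real data.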
[...] , and enclosed by " [...] newline characters in the fields like I mentioned in my question. Can it be done using the parser? If yes, can you please link an example?
[...] CSVParser and replace \r\n with empty string for the fields you want to remove new line [...]
[...] "") and these don't terminate the value. Saravana's suggestion is much better.