1

How would you split this row into a string array?

The problem is the field "Rutois, a.s.", which itself contains a comma, so you cannot split directly on the ',' separator.

543472,"36743721","Rutois, a.s.","151","some name","01341",55,"112",1

thanks

3
  • Give an example of the result you want from the string above. Commented Aug 21, 2010 at 15:08
  • 543472;36743721;Rutois, a.s.;151;some name;01341;55;112;1 Commented Aug 21, 2010 at 15:42
  • This is a VB.NET question, but the answer is similar to @Darin's: stackoverflow.com/questions/959448/split-csv-string Commented Aug 21, 2010 at 15:51

6 Answers

7

I would recommend using a CSV parser instead of rolling your own.

FileHelpers is a nice library for this job.

4 Comments

It's actually a pretty simple finite state machine to write. I wrote one a few years back because ADO felt like overkill for this task.
+1. codeproject.com/KB/database/CsvReader.aspx is a nice lightweight one. @liho1eye: it may be simple, but reinventing the wheel is not creating value for your customer.
@TrueWill that's kind of circular logic. Besides, I just looked at your link and it seems almost identical to my solution... possibly much better polished, but (looking at the revision log) younger than mine by at least 2 years. Not that I am trying to claim rights to this implementation. It just goes to prove that a CSV parser is pretty easy to write.
@liho1eye: If you have an alternate implementation, great! Post it as open source (if you own the copyright) and let others benefit from your efforts. What I'm saying is that there is no reason for anyone else to write a CSV parser other than (a) as a learning exercise, (b) none is available on their platform, or (c) existing parsers do not meet their business requirements.
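For readers curious what the "simple finite state machine" mentioned in the comments might look like, here is a minimal sketch. It is not the commenter's code and not part of FileHelpers or any other library; the names `CsvLineSketch` and `ParseLine` are made up for illustration. It assumes the string holds exactly one complete record and that a doubled quote inside a quoted field stands for a literal quote.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLineSketch
{
    // Hypothetical helper: splits one complete CSV record into its fields.
    public static List<string> ParseLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false; // the only state: inside or outside a quoted field

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c); // commas are literal inside quotes
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote = escaped quote
                    i++;
                }
                else
                {
                    inQuotes = false; // closing quote
                }
            }
            else if (c == '"')
            {
                inQuotes = true; // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field separator
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing comma
        return fields;
    }

    public static void Main()
    {
        string line = "543472,\"36743721\",\"Rutois, a.s.\",\"151\",\"some name\",\"01341\",55,\"112\",1";
        // Prints: 543472;36743721;Rutois, a.s.;151;some name;01341;55;112;1
        Console.WriteLine(string.Join(";", ParseLine(line)));
    }
}
```

Unlike a plain `Split(',')`, this keeps empty fields and the comma in "Rutois, a.s.". It still makes no attempt to handle malformed input such as unbalanced quotes, which is one reason an existing parser is usually the safer choice.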
6

You can use a regular expression to pick out the values from the line:

string line ="543472,\"36743721\",\"Rutois, a.s.\",\"151\",\"some name\",\"01341\",55,\"112\",1";
var values = Regex.Matches(line, "(?:\"(?<m>[^\"]*)\")|(?<m>[^,]+)");
foreach (Match value in values) {
  Console.WriteLine(value.Groups["m"].Value);
}

Output:

543472
36743721
Rutois, a.s.
151
some name
01341
55
112
1

This of course assumes that you actually have the complete CSV record in the string. Note that values in a CSV record can contain line breaks, so you cannot get the records from a CSV file by simply splitting it on line breaks.
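To illustrate the point about line breaks, here is a rough sketch of splitting file content into records while ignoring line breaks inside quoted values. `CsvRecordSketch` and `SplitRecords` are hypothetical names, and the sketch assumes well-formed input with balanced quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvRecordSketch
{
    // Hypothetical helper: splits CSV text into records, treating a line
    // break as a record separator only when it is outside double quotes.
    public static List<string> SplitRecords(string text)
    {
        var records = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in text)
        {
            if (c == '"')
            {
                // An escaped quote ("") toggles twice, so the state is unchanged.
                inQuotes = !inQuotes;
                current.Append(c);
            }
            else if (c == '\n' && !inQuotes)
            {
                // End of record; drop the '\r' of a Windows line ending.
                records.Add(current.ToString().TrimEnd('\r'));
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // Keep a final record that is not terminated by a line break.
        if (current.Length > 0)
            records.Add(current.ToString().TrimEnd('\r'));

        return records;
    }

    public static void Main()
    {
        string text = "1,\"first line\nsecond line\"\r\n2,plain\r\n";
        Console.WriteLine(SplitRecords(text).Count); // 2 records, not 3 lines
    }
}
```

Each returned record still contains its quotes, so it can be passed to the regular expression above.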

4 Comments

I've checked your regex and it fails in this case: "text1,\"text2,\"text3" It should return values: text1 | "text2 | "text3 but it returns: text1 | text2, | text3
@Bronek: Why do you think that it should do that?
...'cause it should not divide this data after the second quotation mark. In this case comma is assumed as delimiter.
@Bronek: That input is invalid, so the expected result isn't defined.
1

You can connect to the file using ODBC. Check this link (if the link does not help much, just search for "connecting csv files with odbc").

If ODBC also has problems reading it, I guess the file is not a valid CSV file.

2 Comments

That could be great. The problem is that the CSV file is not valid, thanks to the file creator.
@PaN1C_Showt1Me: The example CSV is valid. The field containing a comma is enclosed in double-quotes. See en.wikipedia.org/wiki/Comma-separated_values
1

I'd be tempted to swap out the commas that occur inside the quoted strings for a token, then use Split, and finally put the commas back. This would work:

        string csv = "543472,\"36743721\",\"Rutois, a.s.\",\"151\",\"some name\",\"01341\",55,\"112\",1"; 


        const string COMMA_TOKEN = "[COMMA]";
        string[] values;
        bool inQuotes = false;

        StringBuilder cleanedCsv = new StringBuilder();
        foreach (char c in csv)
        {
            if (c == '\"')
                inQuotes = !inQuotes;  //work out if inside a quoted string or not
            else
            {
                //Replace commas in quotes with a token
                if (inQuotes && c == ',')
                    cleanedCsv.Append(COMMA_TOKEN);
                else
                    cleanedCsv.Append(c);
            }
        }

        values = cleanedCsv.ToString().Split(',');

        //Put the commas back
        for (int i = 0; i < values.Length; i++)
            values[i] = values[i].Replace(COMMA_TOKEN, ",");

2 Comments

I've checked your solution and it fails in this case: "text1,\"text2,\"text3" It should return values: text1 | "text2 | "text3 but it returns: text1 | text2,text3
While the above code is not useful where unpaired double quotes are used, it was very helpful to me. It allowed me to read over 30 text files in different formats that sometimes had commas inside double quotes. I was able to parse the fields correctly and then create Excel files using EPPlus. One thing I had to add was removing the equals sign at the beginning of fields where a numeric column used it to preserve leading zeros, e.g. ="00003322".
0

I'm guessing you want something like this:

string csv = "543472,\"36743721\",\"Rutois, a.s.\",\"151\",\"some name\",\"01341\",55,\"112\",1";
string[] values = csv.Split(',');
for (int i = 0; i < values.Length; i++)
{
    // Strip the double quotes from each piece
    values[i] = values[i].Replace("\"", "");
}

Hope this helps.

2 Comments

Except you're going to split on all the commas inside values too.
How can I check to see if this works? Console.WriteLine(values);?
0

The other RegEx answer will fail if the first character is a quote.

This is the correct regular expression:

string[] columns = Regex.Split(inputRow, ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
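Note that `Regex.Split` with this pattern leaves the surrounding quotes on each quoted column. A small sketch of how the result might be cleaned up, using the sample row from the question, is below. `RegexSplitSketch` and `SplitRow` are illustrative names only, and trimming with `Trim('"')` is a simplification that assumes values do not legitimately start or end with a quote character.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitSketch
{
    // Hypothetical wrapper around the Regex.Split call above.
    public static string[] SplitRow(string inputRow)
    {
        // Split on commas followed by an even number of quotes,
        // i.e. commas that are not inside a quoted value.
        string[] columns = Regex.Split(inputRow, ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

        // The quotes are still attached, so strip them from both ends.
        return columns.Select(col => col.Trim('"')).ToArray();
    }

    public static void Main()
    {
        string inputRow = "543472,\"36743721\",\"Rutois, a.s.\",\"151\",\"some name\",\"01341\",55,\"112\",1";
        foreach (string column in SplitRow(inputRow))
            Console.WriteLine(column);
    }
}
```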
