I need to split large CSV files by the source field and name each export file after the value of that field.

My code works, except for one thing: each split file needs to start with the header row from the original file.

Any help is appreciated. Thank you.

var splitQuery = from line in File.ReadLines(@"C:\test\test1.csv")
            let source = line.Split(',').Last()
            group line by source into outputs
            select outputs;

foreach (var output in splitQuery)
{
    File.WriteAllLines(@"C:\test\" + output.Key + ".csv", output);
}

I'm not sure how to add a snippet of the CSV, but I've included the header fields below. I hope this helps.

ID ,Ref ,Title ,Initials ,Forename ,Surname ,File_Source

  • Does test1.csv have the headers in the first line of the file? Commented Sep 29, 2016 at 11:10
  • Could you provide a small snippet sample of your CSV file, so we know what we're working with? Commented Sep 29, 2016 at 11:11
  • Please note that this implementation of CSV parsing is error-prone. Although the CSV format is not a formal standard, values that contain commas are typically quoted (using double quotes) so that the comma is not treated as a field delimiter. You might want to consult RFC 4180 for details on field handling. Commented Sep 29, 2016 at 11:15
  • Hello, yes the headers are in the first line of the csv file Commented Sep 29, 2016 at 11:17

2 Answers

I strongly recommend using a specialized library for parsing CSV files, one that treats the first line as headers and handles everything else. The CSV format is not as simple as it looks at first sight - for example, values may be in quotes ("value"), and quotes may be escaped inside values.

Personally I prefer to use CsvHelper - it is suitable both for classic .NET and .NET Core:

using System.IO;
using CsvHelper;
using CsvHelper.Configuration;

using (var fileRdr = new StreamReader(@"C:\test\test1.csv")) {
    var csvRdr = new CsvReader( fileRdr,
                       new CsvConfiguration() { HasHeaderRecord = true } );
    while( csvRdr.Read() )
    {
        // list of csv headers (property name as of the CsvHelper version
        // used here; check the docs for the version you install)
        var csvFields = csvRdr.FieldHeaders;

        // get individual value by field name
        var sourceVal = csvRdr.GetField<string>( "File_Source" );

        // perform your data transformation logic here
    }
}
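
To illustrate the quoting rules mentioned above, here is a minimal sketch of a quote-aware field splitter using only the standard library. The class and method names (CsvLine, SplitFields) are hypothetical and not part of CsvHelper. It assumes RFC 4180-style quoting and does not handle values that span multiple lines, which is one more reason to prefer a real library.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of any library.
public static class CsvLine
{
    // Splits one CSV record into fields. Commas inside "..." are data,
    // and "" inside a quoted value is a literal double quote.
    // Assumption: the record does not span multiple lines.
    public static List<string> SplitFields(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote ("")
                    i++;                 // skip the second quote
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field delimiter
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing comma
        return fields;
    }
}
```

With this, a line such as 1,"Smith, John",src yields three fields, whereas line.Split(',') would yield four.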
Simply read the header line first:

var fileLinesIterator = File.ReadLines(...);

string headerLine = fileLinesIterator.First();

Then skip it when grouping, and prepend it to every output:

var splitQuery = from line in fileLinesIterator.Skip(1)

// ...


    File.WriteAllLines(@"C:\test\" + output.Key + ".csv", new[] { headerLine }.Concat(output));

But apart from that, you don't want to be handling CSV files as mere (lines of) strings. You're bound to run into trouble with quoted and multiline values.
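
Putting the pieces together, a minimal sketch of the whole approach might look like the following. The class name, method name and parameters (CsvSplitter, SplitBySource, inputPath, outputDir) are hypothetical. It assumes the source field is the last column, that no value contains a comma or line break, and that the input has at least a header line. The Trim() on the key is my own addition so that stray spaces do not end up in file names.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class CsvSplitter
{
    public static void SplitBySource(string inputPath, string outputDir)
    {
        // First line is the header; First() throws if the file is empty.
        string headerLine = File.ReadLines(inputPath).First();

        // Group the remaining lines by the last field.
        // Assumption: plain Split(',') is good enough, i.e. no quoted commas.
        var groups = File.ReadLines(inputPath)
            .Skip(1)                          // skip the header
            .Where(line => line.Length > 0)   // ignore blank lines
            .GroupBy(line => line.Split(',').Last().Trim());

        foreach (var group in groups)
        {
            string outputPath = Path.Combine(outputDir, group.Key + ".csv");

            // Write the header followed by the lines of this group.
            File.WriteAllLines(outputPath, new[] { headerLine }.Concat(group));
        }
    }
}
```

Calling CsvSplitter.SplitBySource(@"C:\test\test1.csv", @"C:\test") would then write one file per distinct source value, each starting with the original header row.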

1 Comment

Thank you all for your help :-)
