
I am trying to find the number of rows in a CSV file that have a value above a certain threshold. The code I have goes something like:

$T6=Import-Csv $file | Where-Object {$_."Value" -ge 0.6 } | Measure-Object

This works well for smaller files, but for large CSV files (1 GB or more) it runs forever. Is there a better way to parse CSV files like this in PowerShell?

  • Please edit your post to quantify "painfully slow" and "large CSV file". Good luck. Commented May 24, 2012 at 0:38

3 Answers


Import-Csv is the official cmdlet for this. One note, though: everything imported is a string, so you should cast the Value property to the correct type. For instance:

$T6 = Import-Csv $file | Where-Object { [float]$_.Value -ge 0.6 } | Measure-Object

2 Comments

Casting the Value will increase performance dramatically. Just tested this; it literally increased performance 1000-fold.
With Import-Csv and the cast, is it equivalent to the ReadAllText solution on big files?
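To check the timing claim above on your own machine, Measure-Command can time both variants. The sketch below generates a small throwaway sample file, since the original data isn't available; the file contents and row count are made up for illustration:

```powershell
# Generate a disposable one-column sample CSV (illustrative data only).
$sample = [System.IO.Path]::GetTempFileName()
Set-Content $sample (@('Value') + (1..10000 | ForEach-Object { "0.$($_ % 10)" }))

# Time the original string comparison...
$withoutCast = Measure-Command {
    Import-Csv $sample | Where-Object { $_.Value -ge 0.6 } | Measure-Object
}
# ...against the version that casts Value to [float] first.
$withCast = Measure-Command {
    Import-Csv $sample | Where-Object { [float]$_.Value -ge 0.6 } | Measure-Object
}
'no cast: {0:N0} ms, with cast: {1:N0} ms' -f $withoutCast.TotalMilliseconds, $withCast.TotalMilliseconds
Remove-Item $sample
```

Run this against your real file (swap `$sample` for its path) to get numbers that reflect your data rather than this toy sample.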

You can try to get rid of Import-Csv:

$values = ([System.IO.File]::ReadAllText('c:\pst\New Microsoft Office Excel Worksheet.csv')).Split(";") | where {$_ -ne ""}

$items = New-Object "System.Collections.Generic.List[decimal]" 

foreach ($value in $values)
{
    [decimal]$out = New-Object decimal
    if ([System.Decimal]::TryParse($value, [ref] $out))
    {
        if ($out -ge 10) { $items.Add($out) }
    }
}
$items | Measure-Object



For speed when processing large files, consider using a StreamReader; Roman's answer here demonstrates usage.
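A minimal sketch of the StreamReader approach (my own, not the code from the linked answer): read the file line by line instead of materializing every row as an object through Import-Csv. It assumes a comma-delimited file whose first line is a header containing a "Value" column; the function name is made up for illustration.

```powershell
function Count-RowsAboveThreshold([string]$path, [double]$threshold)
{
    $reader = New-Object System.IO.StreamReader($path)
    try
    {
        # Locate the "Value" column from the header line.
        $header = $reader.ReadLine().Split(',')
        $valueIndex = [array]::IndexOf($header, 'Value')

        $count = 0
        while ($null -ne ($line = $reader.ReadLine()))
        {
            $fields = $line.Split(',')
            [double]$v = 0
            # TryParse skips blank or malformed rows instead of throwing.
            if ([double]::TryParse($fields[$valueIndex], [ref]$v) -and $v -ge $threshold)
            {
                $count++
            }
        }
        return $count
    }
    finally
    {
        $reader.Dispose()
    }
}
```

Usage would then be `$T6 = Count-RowsAboveThreshold $file 0.6`. Because only one line is held in memory at a time, memory use stays flat no matter how large the file is.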

