4

I have a file with numbers separated by commas. Some entries are number ranges such as 300-400. For example, I have a text file named testme.txt that looks like this:

200,300,234,340-350,400,360,333-339
409-420
4444-31231231
348

I want to find out whether the number 348 is present. Here, 348 is present in 2 places:

  • in the range 340-350
  • on the last line

How can I find it? I tried using regular expressions in sed and awk, but I was not able to write a complete script that handles the number ranges. Is there any other way to find it?

UPDATE: I found a brute-force solution, but it only works for ranges.

count=0
num1=348
for i in $(sed 's/\([0-9]\+-[0-9]\+\)/:&:/g' testme.txt |
    awk -F: '{ for (i = 1; i <= NF; i++) if ($i ~ /[0-9]+-[0-9]+/) print $i }')
do
    lh=$(echo "$i" | awk -F- '{print $1}')
    rh=$(echo "$i" | awk -F- '{print $2}')
    if [ "$lh" -le "$num1" ] && [ "$rh" -ge "$num1" ]
    then
        count=$((count + 1))
    fi
done
echo "$count"
  • 400;360 or 400,360? Commented Jul 21, 2014 at 14:54
  • Yeah, it's 400,360. Changing it. Commented Jul 21, 2014 at 15:06
  • Regular expressions cannot handle range of values easily. You should do numerical comparisons in a language that supports them (even awk can do that, and by setting FS you can split at both space and comma). Commented Jul 21, 2014 at 15:12

6 Answers

4

A GNU awk solution that treats , or \n as the record separator and - as the field separator. An equality check or a range check is applied depending on the number of fields in each record.

awk -v num=348 -v RS=',|\n' -F'-' 'NF == 2 && $1 <= num && $2 >= num{c++};
           NF == 1 && $0 == num{c++};
           END{print c+0}' file
2
3

If you can use perl:

$ perl -F',' -anle '
for (@F) {
    ($l,$h) = split "-";                
    $count++ if $l == 348 || ($l < 348 and $h >= 348);
}
END {print $count}
' file
2
  • Your script doesn't give the right answer if one replaces the first comma by a dash. Commented Jul 21, 2014 at 15:29
  • Take the example given by the OP, then replace the first comma (between 200 and 300) by a dash. Your script gives 1 instead of 2. Commented Jul 21, 2014 at 15:34
  • @vinc17: Sorry, my bad, I forgot to add the loop. Fixed. Commented Jul 21, 2014 at 15:39
  • for (@F) { ($l,$h) = split "-"; defined $h or $h = $l; $count++ if (sort {$a <=> $b} $l,348,$h)[1] == 348; } END {print $count} is shorter and more robust (if you have 348-348 in the file). Commented Jul 21, 2014 at 15:47
  • There is more than one way to do it! See my update. Commented Jul 21, 2014 at 15:49
2

This answer will provide the fields that contain the specified number, not just the lines, if you are after that level of detail (and if the ranges in your data might contain overlaps):

awk -v num=348 -F, '{
  for (i=1; i<=NF; i++) {
    if ($i == num || (split($i, a, /-/) == 2 && (a[1] <= num && num <= a[2]))) {
      print $i
    }
  }
}' <<END
200,300,234,340-350,400,360,333-339
409-420
4444-31231231
348
1-400,100-1000
END
340-350
348
1-400
100-1000

For giggles, golfed:

awk -F, '{for(i=1;i<=NF;i++)if($i==n||(split($i,a,/-/)==2&&a[1]<=n&&n<=a[2]))print $i}' n=348 file
0

One possible approach (I am sure there are many ways to get this done) is to simplify the checks for the number.

Use nested if statements to move through the logic, splitting the line into 'values' on the comma delimiter.

If a value contains a "-", split it into two numbers at the "-". Then check whether the number you are looking for is greater than or equal to the first number AND less than or equal to the second number. If so, it is in the range.

For values without a "-" it is a simple check to see if it is equal.

Perhaps not an elegant approach, but it would get the job done (it seemed to me that you were looking for the method to get at the comparisons and not for the finished script itself, so I am hoping the above provides you with that brainstorming).
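
The steps above can be sketched as follows. This is a minimal Python sketch, not code from this answer; the function names contains and count_matches are made up for illustration, and it assumes well-formed input made of non-negative integers.

```python
def contains(token: str, target: int) -> bool:
    """Return True if a token such as "348" or "340-350" covers target."""
    token = token.strip()
    if not token:
        # Empty tokens (blank lines, trailing commas) never match.
        return False
    if "-" in token:
        # Range: split at the "-" and compare against both bounds.
        low, high = (int(part) for part in token.split("-", 1))
        return low <= target <= high
    # Single value: plain equality check.
    return int(token) == target


def count_matches(text: str, target: int) -> int:
    """Count comma-separated values and ranges in text that cover target."""
    return sum(
        contains(token, target)
        for line in text.splitlines()
        for token in line.split(",")
    )


sample = """200,300,234,340-350,400,360,333-339
409-420
4444-31231231
348
"""
print(count_matches(sample, 348))  # prints 2
```

Comparing integers rather than matching text is what makes the range check simple; both bounds are treated as inclusive.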

0

This example uses the three-argument form of the match function, which is a GNU awk extension.

awk -F ',' -v num=348 '{
    for (i = 1; i <= NF; i++) {
        match($i, /([0-9]+)-?([0-9]*)/, arr)
        if (arr[1] == num || (arr[1] <= num && num <= arr[2])) count++
    }
} END { print count+0 }' file

0

Assuming your input is well-formed, and taking the file with the list and the number as parameters, this should work in PHP:

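<?php
$count = 0;
foreach (explode("\n", file_get_contents($argv[1])) as $line)
foreach (explode(",", $line) as $cols)
{
    // Split "340-350" into its bounds; a plain number yields one element.
    $data = explode('-', $cols);
    // A range matches when the target lies between its bounds, i.e. the
    // product of the two differences is zero or negative.
    if ((count($data) > 0 && $data[0] == $argv[2]) ||
        (count($data) > 1 && ($data[0] - $argv[2]) * ($data[1] - $argv[2]) <= 0))
        $count++;
}
echo $count;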

Put the code in a file named script.php and call it from bash like this:

php script.php testme.txt 348
