Print only lines that are completely numeric

Question

I'd like to filter through a text file and only print the lines where each column is a valid floating point number. For example:

3 6 2 -4.2 21.2 
3 x 4.2 21.2 
3 2 2.2.2

Only the first line would pass, as neither x nor 2.2.2 is a valid float. I can write a Python script that simply calls .split() and runs a try/except block over each part, but this is slow for larger files. The input file has an unknown, variable number of columns, and no scientific notation will be used. Is there an awk solution?

glenn jackman · Accepted Answer · 2012-10-22 21:06:22Z

4

awk '
    # skip any obvious stuff
    /[^0-9. -]/ {next}
    {
        # test each field for a number
        for (i=1; i<=NF; i++) 
            if ($i + 0 != $i)
                next
        print
    }
'

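This will reject valid numbers in scientific notation such as 1.2e1 (== 12), because the first pattern skips any line containing a character other than a digit, dot, space, or hyphen.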
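To see what the `$i + 0 != $i` test does on its own, here is a small sketch that applies it to one token at a time. The helper name `is_awk_number` is made up for illustration and is not part of the answer; it assumes the token contains no whitespace.

```shell
# Hypothetical helper (not part of the answer): report whether a single
# whitespace-free token survives the `$1 + 0 != $1` test.
is_awk_number() {
    printf '%s\n' "$1" | awk '{ if ($1 + 0 != $1) print "no"; else print "yes" }'
}

is_awk_number 4.2      # numeric-looking field: both sides compared as numbers
is_awk_number 2.2.2    # not numeric-looking: compared as strings, "2.2" vs "2.2.2"
is_awk_number -        # gets past the first filter, but is not a number
```

A field that looks like a number is compared numerically and equals itself; anything else falls back to a string comparison against the converted value, which differs.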


1

One can easily add e in the regular expression [^0-9. -e]. The test will then only fail when there are only e's in the line.

– Bernhard

2014-06-23 07:16:37 +00:00
Commented Jun 23, 2014 at 7:16
Be careful with bracket expressions: [0-9. -e] will match any character from space (ASCII 32) to e (ASCII 101). You want [^0-9. e-]: to match a literal hyphen, it must be either the first or the last character; otherwise it defines a range of characters. (gnu.org/software/gnulib/manual/html_node/…)

– glenn jackman

2014-06-23 13:05:21 +00:00
Commented Jun 23, 2014 at 13:05
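The difference between the two bracket expressions in these comments can be checked directly. This is only a sketch: `first_filter` is a hypothetical helper, and `LC_ALL=C` is assumed so that the range is interpreted in plain byte order.

```shell
# Hypothetical helper: $1 is the body of a negated bracket expression,
# $2 is the line to check. LC_ALL=C keeps the range in plain byte order.
first_filter() {
    printf '%s\n' "$2" | LC_ALL=C awk -v re="[^$1]" '
        $0 ~ re { print "skipped"; next }   # line has a character outside the set
                { print "kept" }
    '
}

first_filter '0-9. -e' 'X'          # " -e" is a range, so X (ASCII 88) is inside the set
first_filter '0-9. e-' 'X'          # hyphen last: X is outside the set
first_filter '0-9. e-' '1.2e1 -3'   # digits, dot, e, space and hyphen are all allowed
```

With the hyphen in the middle, a line consisting of `X` slips through the filter; with the hyphen last, it is skipped as intended.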


iruvar · Accepted Answer · 2012-10-22 20:38:13Z

2

Based on the conditions you state, a regex might be a possibility. I was able to get the following GNU awk script to work on RHEL.

 awk '{for (i=1; i<=NF; ++i) {if ($i !~ /^[-]?[[:digit:]]+(\.[[:digit:]]+)?$/) break;if (i == NF)print($0)}}' file.txt


Axel · Accepted Answer · 2012-10-22 20:39:51Z

2

Try something like this:

$ cat data.txt 
3 6 2 -4.2 21.2 
3 x 4.2 21.2 
3 2 2.2.2

$ awk '/^\s*(-?[0-9]+(\.[0-9]*)?\s+)+\s*$/ { print }' < data.txt 
3 6 2 -4.2 21.2
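Two things to watch with the pattern above: as written it expects whitespace after every number, so a line whose last column has no trailing blank would not match, and `\s` is a GNU awk extension. A minimal sketch of a variant that avoids both, assuming columns are separated by spaces or tabs (the function name `numeric_lines` is only for illustration):

```shell
# Hypothetical wrapper: print only lines made up entirely of numbers.
numeric_lines() {
    awk '
        # number: optional minus, digits, optional fractional part
        # line:   numbers separated by blanks, optional leading/trailing blanks
        /^[ \t]*-?[0-9]+(\.[0-9]*)?([ \t]+-?[0-9]+(\.[0-9]*)?)*[ \t]*$/
    ' "$@"
}

# Works on stdin or on file arguments, e.g. numeric_lines data.txt
printf '1 2 3\n3 x 4\n' | numeric_lines
```

The first number is matched on its own and every later number must be preceded by at least one blank, so trailing whitespace becomes optional.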


PS: you asked for awk, but you should be using grep instead...

– Axel

2012-10-22 20:40:34 +00:00
Commented Oct 22, 2012 at 20:40
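For what the grep route might look like, here is a sketch using `grep -E` with POSIX character classes. The pattern is my own assumption about what counts as a number here (optional minus, digits, optional fractional part with at least one digit), and `numeric_lines_grep` is a made-up name.

```shell
# Hypothetical wrapper: keep only lines whose columns are all plain decimals.
numeric_lines_grep() {
    grep -E '^[[:space:]]*-?[0-9]+(\.[0-9]+)?([[:space:]]+-?[0-9]+(\.[0-9]+)?)*[[:space:]]*$' "$@"
}

# Reads stdin or file arguments, e.g. numeric_lines_grep data.txt
printf '3 6 2 -4.2 21.2 \n3 x 4.2 21.2 \n3 2 2.2.2\n' | numeric_lines_grep
```

Since the whole line is matched by one anchored pattern, no per-field loop is needed.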
