extract string from file at specific line in python

Question

I'm trying to extract unit information from a text file. This function always returns 'm' regardless of the real unit in the file. What am I doing wrong?

def get_seba_unit(file):
    with open(file) as f:
        unit = ''
        lines = f.readlines()
        if lines[10].find('m'):
            unit = 'm'
        elif lines[10].find('cm'):
            unit = 'cm'
        elif lines[10].find('°C'):
            unit = '°C'
        print('found Unit: ' + unit + ' for sensor: ' + file)
        return(unit)

What does the line say? The code looks for an 'm' anywhere in the line, not just at the place you want it to look.
– Corley Brigman, Commented Mar 14, 2017 at 14:23
find returns the position of the occurrence, or -1 if the sequence is not found. In an if, -1 is interpreted as True.
– Dmitry, Commented Mar 14, 2017 at 14:27
In your case the first if will be true whenever the 'm' is not at index 0, and all the other branches will be skipped.
– Anand Tripathi, Commented Mar 14, 2017 at 14:28

Arkady · Accepted Answer · 2017-03-14 14:28:23Z

This does not do what you think it does:

if lines[10].find('m'):

find returns the index of the substring you are looking for, or -1 if it is not found. So unless m is the first character on the line (index 0), your condition will always be True: in Python any non-zero number, including -1, is truthy.
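A minimal sketch of that behaviour, using a made-up sample line (the real content of the file is not shown in the question):

```python
# Hypothetical sample line; the asker's actual file content is unknown.
line = "Unit: cm"

# str.find returns an index, or -1 when the substring is absent.
print(line.find("cm"))   # 6
print(line.find("U"))    # 0
print(line.find("°C"))   # -1

# Truthiness of those results: only index 0 is falsy.
print(bool(line.find("cm")))  # True
print(bool(line.find("U")))   # False
print(bool(line.find("°C")))  # True, although nothing was found

# The `in` operator answers the membership question directly.
print("cm" in line)   # True
print("°C" in line)   # False
```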

You might want to try if 'm' in lines[10] instead.

Also, check for cm before m, otherwise you'll never find cm
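Putting both suggestions together, the function might look roughly like the sketch below. The line_index parameter, the UTF-8 encoding and the empty-string fallback are assumptions added for illustration, not part of the original code.

```python
def get_seba_unit(path, line_index=10):
    """Return the unit found on the given line, or '' if none matches.

    line_index and the UTF-8 encoding are assumptions for this sketch.
    """
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()

    # Guard against files that are shorter than expected.
    if line_index >= len(lines):
        return ''
    line = lines[line_index]

    # Order matters: 'cm' contains 'm', so test the longer unit first.
    for unit in ('cm', '°C', 'm'):
        if unit in line:
            return unit
    return ''
```

Note that 'm' in line still matches an m anywhere on the line, so this only works if the line contains no other text with that letter.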


BPL · Accepted Answer · 2017-03-14 14:41:16Z

If what you're looking for is a way to extract the units from your data, I'd use a simple regex like the one below:

import io
import re
from collections import defaultdict

data = io.StringIO("""

1cm

2m

3°C

1cm 10cm

2m 20m

3°C           30°C

""")


def get_seba_unit(file):
    floating_point_regex = r"([-+]?\d*\.\d+|\d+)"
    content = file.read()
    res = defaultdict(set)

    for suffix in ['cm', 'm', '°C']:
        p = re.compile(floating_point_regex + suffix)
        matches = p.findall(content)
        for m in matches:
            res[suffix].add(m)

    return dict(res)

print(get_seba_unit(data))

And you'd get an output like this one:

{'cm': {'1', '10'}, '°C': {'3', '30'}, 'm': {'2', '20'}}

Of course, the above code simply assumes that each unit is preceded by an integer or floating-point number, but the main idea is to attack this problem with regular expressions.
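If only the unit on one specific line is needed, the same idea can be narrowed to a single pattern. This is a rough sketch that assumes the line holds a number followed by its unit; UNIT_PATTERN and unit_from_line are hypothetical names introduced here.

```python
import re

# A number, optional whitespace, then one of the known units.
# 'cm' is listed before 'm' for readability; since the unit must follow
# the number directly, '1cm' cannot be matched as 'm' either way.
UNIT_PATTERN = re.compile(r"[-+]?(?:\d*\.\d+|\d+)\s*(cm|m|°C)")


def unit_from_line(line):
    """Return the unit that follows the first number on the line, or ''."""
    match = UNIT_PATTERN.search(line)
    return match.group(1) if match else ''
```

It could then be applied to the line of interest, for example unit_from_line(lines[10]).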
