List of strings to integers while keeping a format in Python

Question

So what I want to do seems relatively simple, but for the life of me, I just can't quite get it. I have a .txt file like

4 2
6 5 1
9 4 5

And I want its information to be available to me like so (i.e. I do not need to write a new .txt file unless it would be necessary.)...

3 1
5 4 0
8 3 4

That is, 1 is subtracted from every number but the formatting remains the same. There will never be a number less than 1 in the original, so negatives won't be possible. This whole headache is due to converting indexing to begin with 0 instead of 1. What may complicate things is that the original file prints like

['4 2 \n', '6 5 1 \n', '9 4 5 \n']

What I've Done

Well, it's a mishmash of different things I've found on StackOverflow, but I think I'm going about it in the most cumbersome way possible. And this one didn't make sense when I implemented it, although it may be on the same track with the issue with spaces.

original = open(file, 'r')
for line in original.readlines():
    newline = line.replace(" \n","")
    finalWithStrings.append(newline)

finalWithIntegers = [map(int,x) for x in finalWithStrings]
finalWithIntegers[:] = [x-1 for x in finalWithIntegers]

My thought process was: I need to remove the "\n" and convert these strings into integers so I can subtract 1 from them, and somehow keep the formatting. It's important that the formatting stays the same, since each line contains information about the similarly indexed line of another file. I don't want to see the "\n" in the end result (or print statement), but I still want the effect of a new line beginning. The above code, however, won't work for two reasons (that I know of).

int(n[:]) throws an error since it doesn't like the spaces and when I put a value (say 0) in there, then the code prints the first number on each of the lines and subtracts one.. and puts it all on one line.

[3, 5, 8]

So, it seems redundant to take out a carriage return and have to throw another in, but I do need to keep the formatting, as well as have a way to get all the numbers!

This also didn't work:

for line in original.readlines():
    newline = line.replace(" \n","")
    finalWithStrings.append(newline)

finalWithIntegers = [map(int,x) for x in finalWithStrings]
finalWithIntegers[:] = [x-1 for x in finalWithIntegers]

but instead of just a wrong output it was an error:

ValueError: invalid literal for int() with base 10: ''

Does anyone have any ideas on what I'm doing wrong here and how to fix this? I am working with Python 2.6 and am a beginner.

mgilson · Accepted Answer · 2012-08-03 17:44:29Z

9

with open("original_filename") as original:
    for line in original:
        #if you just want the line as integers:
        integers = [ int(i) - 1 for i in line.split() ]
        #do something with integers here ...

        #if you want to write a new file, use the code below:
        #new_line = " ".join([ str(int(i) - 1) for i in line.split() ])
        #newfile.write(new_line + '\n')

I've opened your file in a context manager in the above example because that is good practice (since version 2.5). The context manager makes sure that your file is properly closed when you exit that context.
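If you do want the new file, the commented-out lines above can be fleshed out roughly as follows. This is only a sketch: `shift_line`, the sample text, and the temporary file names are made up for illustration, and the two `with` blocks are nested because Python 2.6 does not accept several context managers in one `with` statement.

```python
import os
import shutil
import tempfile

def shift_line(line):
    # split() with no argument splits on any whitespace, so the trailing
    # " \n" disappears; each number is decremented and rejoined with spaces.
    return " ".join([str(int(tok) - 1) for tok in line.split()])

# Hypothetical sample data standing in for the original .txt file.
sample = "4 2 \n6 5 1 \n9 4 5 \n"

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "original.txt")
dst_path = os.path.join(workdir, "shifted.txt")

with open(src_path, "w") as src:
    src.write(sample)

# Nested with-blocks keep this compatible with Python 2.6.
with open(src_path) as original:
    with open(dst_path, "w") as newfile:
        for line in original:
            newfile.write(shift_line(line) + "\n")

with open(dst_path) as result:
    shifted_text = result.read()

shutil.rmtree(workdir)  # clean up the temporary directory
print(shifted_text)
```

Note that this normalizes every gap to a single space and drops the trailing space before each newline, which is fine as long as only the line-by-line layout matters.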

EDIT

It looks like you might be trying to create a 2D list ... To do that, something like this would work:

data = []
with open("original_filename") as original:
    for line in original:
        integers = [ int(i) - 1 for i in line.split() ]
        data.append(integers)

Or, if you prefer the 1-liner (I don't):

with open("original_filename") as original:
    data = [ [int(i) - 1 for i in line.split()] for line in original ]

Now if you print it:

for lst in data:
    print (lst)    # [3, 1]
                   # [5, 4, 0]
                   # [8, 3, 4]
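If you would rather see the rows in the same layout as the file instead of as Python lists, you can join each row back into a string. A small sketch, assuming `data` holds the 2D list built above (it is hard-coded here so the snippet runs on its own):

```python
# Assumed result of the loop above for the sample file.
data = [[3, 1], [5, 4, 0], [8, 3, 4]]

# One string per row; row i still corresponds to line i of the file.
rows = [" ".join([str(n) for n in row]) for row in data]
text = "\n".join(rows)
print(text)
```

Because `data[i]` is always the i-th line of the file, the correspondence with the other file is kept without having to store any newline characters.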

edited Aug 3, 2012 at 17:44

answered Aug 3, 2012 at 17:32


8 Comments

Ason Over a year ago

Perfect, this is exactly what I was looking for! Thank you for the edit.. it was more specific to what I needed.

mgilson Over a year ago

@Ason -- No problem. I re-read your post a little more carefully and came across the line that said you didn't need it in a new file unless that was the easiest way to accomplish this. So, I updated.

mgilson Over a year ago

@Ason -- I also condensed it down to a 1-liner (and added that as an alternative). I don't prefer it to the multi-line version, but it's not too bad so there might be some who like it better.

Ason Over a year ago

@mgilson as a beginner, I like to see more of what I'm doing, so I'll stick with the multi-line, but thank you for adding more information for future users!

Ason Over a year ago

@mgilson just to clarify, is it necessary to convert the strings to integers to subtract 1? I read that strings are unchangeable.


Andrew Clark · Answer · 2012-08-03 18:43:37Z

4

Here is a pretty straightforward way to accomplish this using regular expressions. The benefit here is that the formatting is guaranteed to stay exactly the same because it will replace the numbers in place without touching any of the whitespace:

import re

def sub_one_repl(match):
    return str(int(match.group(0))-1)

for line in original.readlines():
    newline = re.sub(r'\d+', sub_one_repl, line).rstrip('\n')
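To see the whitespace being preserved, here is the same idea run on a few in-memory lines rather than a file. The sample lines (including the double space and the tab) are invented for illustration:

```python
import re

def sub_one_repl(match):
    # Called once per run of digits; returns the decremented number as text.
    return str(int(match.group(0)) - 1)

# Hypothetical lines with irregular spacing and a trailing space.
lines = ["4 2 \n", "6  5\t1 \n", "10 4 5 \n"]

# Only the digits are replaced; rstrip('\n') removes just the newline.
shifted = [re.sub(r"\d+", sub_one_repl, line).rstrip("\n") for line in lines]

for s in shifted:
    print(repr(s))
```

One caveat: the spacing characters are untouched, but a number such as 10 becomes 9 and so gets one character shorter, so column positions can still shift slightly.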

edited Aug 3, 2012 at 18:43

answered Aug 3, 2012 at 17:39


3 Comments

Ason Over a year ago

Thank you so much for your answer! I'm not very familiar with regular expressions, so I'll have to select a different answer as it was easier to understand and implement.. but +1 for helping future visitors!

Adam Parkin Over a year ago

Great idea, though I think you mean match.group and not m.group. As well, you might want to make sub_one_repl either a little more safe (ie if the regex fails to match the .group will cause an exception) or just do a lambda. As well you could do it as a list comp or generator expression: (re.sub(r'\d+', lambda m: str(int(m.group(0))-1), line) for line in original.readlines())

Andrew Clark Over a year ago

@AdamParkin - Thanks, I originally had m as the argument and forgot to update the function. sub_one_repl will only be called on successful matches, which will always be all digits, so it should be safe as it is. One-line is an option but I would still move the lambda outside of it so you aren't recreating the function on each iteration.

rcovre · Answer · 2012-08-03 18:00:43Z

2

Another way is to use the csv module and a list comprehension:

from csv import reader

data = [[int(j) - 1 for j in i] for i in reader(open("your_file"), delimiter=' ')]

Using your data, this gives:

[[3, 1], [5, 4, 0], [8, 3, 4]]
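One thing to watch for: if the lines really end with a space before the newline, as the printed list in the question suggests, a space-delimited reader would most likely yield an empty final field on each row, and `int('')` raises a ValueError. A hedged variant that skips empty fields, shown here on in-memory lines (the `lines` list is a stand-in for the open file):

```python
from csv import reader

# Stand-in for the file object; csv.reader accepts any iterable of strings.
lines = ["4 2 \n", "6 5 1 \n", "9 4 5 \n"]

# "if field" drops the empty string produced by the trailing delimiter.
data = [[int(field) - 1 for field in row if field]
        for row in reader(lines, delimiter=" ")]
print(data)
```

The same filter also absorbs doubled spaces between numbers, which would otherwise show up as empty fields too.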

answered Aug 3, 2012 at 18:00


inspectorG4dget · Answer · 2012-08-03 17:43:39Z

0

Try this:

with open(filepath) as f:
    for line in f:
        print " ".join([str(int(i)-1) for i in line.split()])

Hope that helps

answered Aug 3, 2012 at 17:43
