0

I'm trying to write a function that returns a dictionary with three keys: "first", "second" and "third". Each key has a sequence of letters as its value. I wrote the key-value pairs in a file called sequences.txt, so they're separate from my script. Each key is separated from its value by a space, which is why I split the lines in my function. The issue is that reading the file in my function doesn't seem to work. Can anyone see where it goes wrong?

first AGGCGAA
second TTTCGG
third GCGCGAA



def data(name):
    input_file = open(name)
    file_content = input_file.read()
    di = {}
    for line in input_file:
        fields = line.split()
        read_name = fields[0]
        read_seq = fields[1]
        di[read_name].append(read_seq)
    return di
    input_file.close()
print(data('sequences.txt'))
5
  • what error/stacktrace do you get? Commented Oct 24, 2021 at 13:09
  • try fields = line.split(' ') Commented Oct 24, 2021 at 13:10
  • I don't get an error message my terminal just prints {} Commented Oct 24, 2021 at 13:13
  • @randomer64 split() with no argument splits on any whitespace, so it already handles a single space (and, unlike split(' '), it also drops the trailing newline) Commented Oct 24, 2021 at 13:14
  • Remove file_content = input_file.read() Commented Oct 24, 2021 at 13:29

4 Answers 4

1

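First issue: Nothing after a return statement is executed, so input_file.close() is never reached. Close the file before the function returns.

Second issue: You can't append to a dictionary key that doesn't exist yet. di[read_name].append(read_seq) first looks up di[read_name], and that lookup raises a KeyError because the key has not been created. append only works on values that are already lists; to create the key, assign to it with di[read_name] = read_seq.

Third issue: input_file.read() consumes the whole file, so the later loop over input_file has nothing left to iterate over. That is why you get {}. Read the lines into file_content with readlines() and switch line in input_file to line in file_content.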

Thus, a working function would be:

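def data(name):
    # The with block closes the file as soon as the lines have been read.
    with open(name, "r") as input_file:
        file_content = input_file.readlines()
    di = {}
    for line in file_content:
        fields = line.split()
        read_name = fields[0]
        read_seq = fields[1]
        # Assign directly; the key does not exist yet, so append would fail.
        di[read_name] = read_seq
    return di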

If you run it, you'll get:

>>> print(data("sequences.txt"))
{'first': 'AGGCGAA', 'second': 'TTTCGG', 'third': 'GCGCGAA'}

Is this the expected output?
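To see why the original function printed {}, here is a minimal sketch of what happens to a file object after read(). It assumes io.StringIO as a stand-in for the opened sequences.txt, so it runs without a file on disk; the variable names are made up for the illustration.

```python
import io

# io.StringIO stands in for open('sequences.txt') (an assumption for this sketch).
fake_file = io.StringIO("first AGGCGAA\nsecond TTTCGG\nthird GCGCGAA\n")

content = fake_file.read()          # read() moves the position to the end of the file
lines_after_read = list(fake_file)  # iterating now yields nothing

fake_file.seek(0)                   # rewind to the start
lines_after_seek = list(fake_file)  # all three lines are available again

print(lines_after_read)       # []
print(len(lines_after_seek))  # 3
```

So the for loop in the question never executes its body, and the empty di is returned as is.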


2 Comments

Yes it is, thank you so much!
Don't forget to mark the correct solution and to indicate that your question has been answered.
1

You will have to write it like this: dictionary[key] = value instead of using append.

readlines() also seems useful in your case:

def data(name):
    input_file = open(name, 'r')
    di = {}
    for line in input_file.readlines():
        fields = line.split()
        read_name = fields[0]
        read_seq = fields[1]
        di[read_name] = read_seq

    input_file.close()
    return di


print(data('sequences.txt'))
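If the idea behind append was to collect several sequences under one key, that would need a list as the value. This is only a guess at the intent, since the file in the question has one sequence per key. A rough sketch with collections.defaultdict, using made-up sample lines with a repeated key:

```python
from collections import defaultdict

# Made-up sample data (not the question's file): "first" appears twice.
lines = ["first AGGCGAA", "second TTTCGG", "first GCGCGAA"]

di = defaultdict(list)  # a missing key starts out as an empty list
for line in lines:
    read_name, read_seq = line.split()
    di[read_name].append(read_seq)  # append works because the value is a list

print(dict(di))
# {'first': ['AGGCGAA', 'GCGCGAA'], 'second': ['TTTCGG']}
```

For one sequence per key, plain assignment as shown above is the simpler choice.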

Another thing is that the code after your return statement won't run. An easy fix is to move it before the return statement, or to use a context manager, which closes the file for you automatically. The code below does the same thing, but is shorter and, I would say, much easier to read, which is always a good thing.

def data(name):
    di = {}
    with open(name, 'r') as file:
        for line in file.readlines():
            read_name, read_seq = line.split()
            di[read_name] = read_seq
    return di


print(data('sequences.txt'))
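One thing to watch with the tuple unpacking: line.split() returns an empty list for a blank line, so a stray empty line at the end of the file would raise a ValueError. A possible guard is sketched below; parse_sequences is a hypothetical helper name, and the sample lines (including the blank ones) are invented for the example.

```python
def parse_sequences(lines):
    """Build {name: sequence} from an iterable of 'name sequence' lines."""
    di = {}
    for line in lines:
        fields = line.split()
        if not fields:  # blank line: split() returned []
            continue
        # Unpacking raises ValueError unless there are exactly two fields.
        read_name, read_seq = fields
        di[read_name] = read_seq
    return di


# Invented sample with blank lines mixed in.
sample = ["first AGGCGAA\n", "\n", "second TTTCGG\n", "third GCGCGAA\n", "\n"]
print(parse_sequences(sample))
# {'first': 'AGGCGAA', 'second': 'TTTCGG', 'third': 'GCGCGAA'}
```

Because an open file is itself an iterable of lines, the same helper could be called as parse_sequences(file) inside the with block.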

1 Comment

@Vic np. Hopefully you can read the last code I edited and try that out too
1

You should loop over file_content in the for statement, and you should fill it with input_file.readlines() instead of read():

def data(name):
    input_file = open(name)
    file_content = input_file.readlines()
    input_file.close()
    di = {}
    for line in file_content:
        fields = line.split()
        read_name = fields[0]
        read_seq = fields[1]
        di[read_name] = read_seq
    return di

print(data('sequences.txt'))

Comments

0

You call input_file.read() and then iterate over the _io.TextIOWrapper object. A file object is iterable, but read() has already consumed it, so the loop body never runs and the dict is never populated. You want to iterate over the lines in file_content instead.

Secondly, if you iterate over file_content as it stands (a single string), line is each individual character. fields will be ["f"] for the first character, so fields[0] is "f" and fields[1] raises an IndexError. Use input_file.read().splitlines() to get each line of the file.

After that is fixed you will still get KeyError: 'first', because di[read_name].append(read_seq) looks up a key that doesn't exist yet. Define the key with di[read_name] = read_seq (you cannot append to a dictionary).

def data(name):
    di = {}
    with open(name, encoding="utf-8") as input_file:
        file_content = input_file.read().splitlines()
        for line in file_content:
            fields = line.split()
            di[fields[0]] = fields[1]

        return di


print(data('sequences.txt'))
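The points above can be checked in isolation. This is a small sketch that uses a hard-coded string in place of the file's content (an assumption so that it runs without sequences.txt); the variable names are illustrative only.

```python
# Stand-in for what input_file.read() would return.
content = "first AGGCGAA\nsecond TTTCGG\nthird GCGCGAA\n"

first_items = list(content)[:5]  # iterating over a str yields single characters
lines = content.splitlines()     # splitlines() yields whole lines without "\n"

di = {}
try:
    di["first"].append("AGGCGAA")  # the lookup di["first"] fails before append runs
except KeyError as exc:
    missing_key = exc.args[0]

di["first"] = "AGGCGAA"  # plain assignment creates the key

print(first_items)  # ['f', 'i', 'r', 's', 't']
print(lines)        # ['first AGGCGAA', 'second TTTCGG', 'third GCGCGAA']
print(missing_key)  # first
```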

Also, I think a list would suffice if you are going to reference di["first"], di["second"], and di["third"] (ls[0], ls[1], ls[2]).

def data(name):
    with open(name, encoding="utf-8") as input_file:
        file_content = input_file.read().splitlines()
        return [i.split(maxsplit=1)[1] for i in file_content]

print(data('sequences.txt'))

Comments
