
My code loops over FASTA files and counts the k-mers (patterns) in each sequence. It is as follows:

    from collections import defaultdict
    import csv, os, re
    from itertools import groupby
    import glob


    def count_kmers(read, k):
        counts = defaultdict(list)
        num_kmers = len(read) - k + 1
        for i in range(num_kmers):
            kmer = read[i:i+k]
            if kmer not in counts:
                counts[kmer] = 0
            counts[kmer] += 1
        for item in counts:
            return(basename, sequence, item, counts[item])

    for fasta_file in glob.glob('*.fasta'):
        basename = os.path.splitext(os.path.basename(fasta_file))[0]
        with open(fasta_file) as f_fasta:
            for k, g in groupby(f_fasta, lambda x: x.startswith('>')):
                if k:
                    sequence = next(g).strip('>\n')
                else:
                    d1 = list(''.join(line.strip() for line in g))
                    d2 = ''.join(d1) 
                    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
                    reverse_complement = "".join(complement.get(base, base) for base in reversed(d1))
                    d3 = list(''.join(line.strip() for line in reverse_complement))
                    d4 = ''.join(d3)
                    d5 = (d2+d4)
                    counting = count_kmers(d5, 5)
                    with open('kmer.out', 'a') as text_file:
                        text_file.write(counting)

When I print the results, my output looks like this:

1035 1 GAGGA 2
1035 1 CGCAT 1
1035 1 TCCCG 1
1035 1 CTCAT 2
1035 1 CCTGG 2
1035 1 GTCCA 1
1035 1 CATGG 1
1035 1 TAGCC 2
1035 1 GCTGC 7
1035 1 TGCAT 1

The code works fine, but I cannot write my output to a file. I get the following error:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-190-89e3487da562> in <module>()
         37                 counting = count_kmers(d5, 5)
         38                 with open('kmer.out', 'w') as text_file:
    ---> 39                     text_file.write(counting)

    TypeError: write() argument must be str, not tuple

What am I doing wrong, and how can I make sure that my code writes the output to a text file?

  • You aren't returning anything from your functions, just printing things to the screen. Without an explicit return statement, a Python function returns None, which is what counting = count_kmers(d5, 5) will receive, and when you try to call "".join(None) you will get that error. Commented Sep 25, 2017 at 19:38
  • I changed my code (using return instead of print), but then I get the error that my write argument must be str, not tuple. Commented Sep 25, 2017 at 19:45
  • Seems like a pretty straightforward error to debug then... Commented Sep 25, 2017 at 19:45

1 Answer


The original version of count_kmers() did not contain a return statement, which means it implicitly returned None.

Since you assign that None to counting, the errors you were getting follow directly from it.
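To illustrate the point, here is a minimal sketch. The function count_without_return and the sample read are hypothetical stand-ins, not the original code, and an in-memory io.StringIO buffer is assumed in place of the real output file. It shows that a function that only prints hands back None, and that writing None to a text stream raises a TypeError.

```python
import io


def count_without_return(read, k):
    # Hypothetical stand-in for the first version of the function:
    # it prints each k-mer but has no return statement.
    for i in range(len(read) - k + 1):
        print(read[i:i + k])


result = count_without_return("GATTACA", 5)
print(result is None)  # the implicit return value is None

buffer = io.StringIO()  # in-memory text stream standing in for a file
error_message = None
try:
    buffer.write(result)  # write() needs a str, so None is rejected
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```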

After your edit, the end of the function looked like this:

    for item in counts:
        return(basename, sequence, item, counts[item])

which returns a tuple of four values, whereas write() only accepts a string. The return also exits the function on the first pass through the loop, so only one k-mer is ever reported.
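One possible way to address both problems is sketched below. It is only an illustration under some assumptions: format_counts is a hypothetical helper, the read and the values "1035" and "1" are made-up sample data, Counter is swapped in for the defaultdict, and the output goes to a temporary directory rather than the real kmer.out. The idea is to return all counts at once and convert every record to a string before writing.

```python
import os
import tempfile
from collections import Counter


def count_kmers(read, k):
    """Count every k-mer (substring of length k) in read."""
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))


def format_counts(basename, sequence, counts):
    """Build one space-separated text line per k-mer (hypothetical helper)."""
    return [f"{basename} {sequence} {kmer} {n}\n" for kmer, n in counts.items()]


# Made-up sample values standing in for one FASTA record.
counts = count_kmers("GATTACAGATTA", 5)
lines = format_counts("1035", "1", counts)

# Write to a temporary location so the sketch does not touch real data.
out_path = os.path.join(tempfile.mkdtemp(), "kmer.out")
with open(out_path, "a") as text_file:
    text_file.writelines(lines)  # every element is already a str
```

Because the function returns the complete Counter instead of returning inside the loop, no k-mer is lost, and the formatting step guarantees that only strings reach the file.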


2 Comments

I changed my code (using return instead of print), but then I get the error that my write argument must be str, not tuple.
Because you are now returning a tuple, and you exit on the first pass of the loop.
