I need some help understanding a function that i want to use but I'm not entirely sure what some parts of it do. I understand that the function is creating dictionaries from reads out of a Fasta-file. From what I understand this is supposed to generate pre- and suffix dictionaries for ultimately extending contigs (overlapping dna-sequences). The code:
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
lenKeys = len(reads[0]) - lenSuffix
dict = {}
multipleKeys = []
i = 1
for read in reads:
if read[0:lenKeys] in dict:
multipleKeys.append(read[0:lenKeys])
else:
dict[read[0:lenKeys]] = read[lenKeys:]
if verbose:
print("\rChecking suffix", i, "of", len(reads), end = "", flush = True)
i += 1
for key in set(multipleKeys):
del(dict[key])
if verbose:
print("\nCreated", len(dict), "suffixes with length", lenSuffix, \
"from", len(reads), "Reads. (", len(reads) - len(dict), \
"unambigous)")
return(dict)
Additional Information: reads = readFasta("smallReads.fna", verbose = True)
This is how the function is called:
if __name__ == "__main__":
reads = readFasta("smallReads.fna", verbose = True)
suffixDicts = makeSuffixDicts(reads, 10)
The smallReads.fna file contains strings of bases (Dna):
"> read 1
TTATGAATATTACGCAATGGACGTCCAAGGTACAGCGTATTTGTACGCTA
"> read 2
AACTGCTATCTTTCTTGTCCACTCGAAAATCCATAACGTAGCCCATAACG
"> read 3
TCAGTTATCCTATATACTGGATCCCGACTTTAATCGGCGTCGGAATTACT
Here are the parts I don't understand:
lenKeys = len(reads[0]) - lenSuffix
What does the value [0] mean? From what I understand "len" returns the number of elements in a list. Why is "reads" automatically a list? edit: It seems a Fasta-file can be declared as a List. Can anybody confirm that?
if read[0:lenKeys] in dict:
Does this mean "from 0 to 'lenKeys'"? Still confused about the value.
In another function there is a similar line: if read[-lenKeys:] in dict:
What does the "-" do?
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
Here I don't understand the parameters: How can reads be a parameter? What is lenSuffix = 20 in the context of this function other than a value subtracted from len(reads[0])?
What is verbose? I have read about a "verbose-mode" ignoring whitespaces but i have never seen it used as a parameter and later as a variable.
makeSuffixDictfunction expects thatreadsis in fact a list (if you don't pass it a list, it won't work). Do you have documentation for this function that specifies its requirements?read[:lenKeys]means "everything inreadup to index numberlenKeys". Similarly,read[-lenKeys]is just an index, but using a negative operator. So, "lenKeysobjects back from the end ofread".