I want to create a bash script that takes in a DNA file, checks that it has no newline or whitespace characters, and then outputs the unique codons along with the number of times each one occurs. I have used the following code, but it keeps giving me an output of "bash-3.2$". I can't tell whether my syntax is wrong or why I'm not getting the proper output.
! /bin/bash
for (( pos=1; pos < length - 1; ++pos )); do
codon = substr($1, $pos, 3)
tr-d '\n' $1 | awk -f '{print $codon}' | sort | uniq -c
done
For example, if a file named dnafile contains the sequence aacacgaactttaacacg, then the script should be invoked as follows and produce this output:
$ script dnafile
aac 3
acg 2
ttt 1
substr, you can't have spaces around the = in variable assignments, and your shebang is wrong. iii) Remember that there are 6 possible reading frames; are you sure you only need to look at one? iv) Your DNA file will almost never contain just sequence; there is usually some sort of header and extra information (FASTA, FASTQ, SAM, etc.).
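For what it's worth, here is a minimal sketch of one way the script could look with those syntax problems out of the way. It assumes the file holds only raw sequence (no FASTA/FASTQ header) and that only the first reading frame matters, so it does not address points iii) and iv). The function name count_codons is made up for illustration. It strips whitespace rather than checking for it, and it swaps the loop-plus-substr approach for fold, which cuts the sequence into 3-character lines.

```shell
#!/bin/bash
# Count non-overlapping codons in reading frame 0 of a plain-sequence file.
# Assumption: the file contains only sequence characters and whitespace.
# count_codons is a hypothetical helper name, not part of the original script.
count_codons() {
    # 1. tr   : delete every whitespace character, newlines included
    # 2. fold : break the stream into lines of 3 characters (one codon each)
    # 3. awk  : drop a trailing partial codon if the length is not a multiple of 3
    # 4. sort | uniq -c : count identical codons
    # 5. awk  : reorder "count codon" into "codon count"
    tr -d '[:space:]' < "$1" |
        fold -w 3 |
        awk 'length($0) == 3' |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'
}

# Run only when a file name is supplied on the command line.
if [ "$#" -ge 1 ]; then
    count_codons "$1"
fi
```

Saved as an executable file and run against the dnafile from the question, this should print the three lines shown there (aac 3, acg 2, ttt 1). Covering the other reading frames would mean dropping the first one or two bases before fold, and doing the same on the reverse complement.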