| Description |
This is a meta-module that imports and re-exports the sequence-related modules.
It covers the Bio.Sequence.SeqData, Bio.Sequence.Fasta, Bio.Sequence.FastQ, Bio.Sequence.Phd, Bio.Sequence.TwoBit, and Bio.Sequence.HashWord modules, as well as entropy calculations.
|
|
| Synopsis |
|
|
|
|
| Data structures etc (Bio.Sequence.SeqData)
|
|
|
| A sequence consists of a header, the sequence data itself, and optional quality data.
The type parameter is a phantom type that separates nucleotide and amino acid sequences.
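The phantom type parameter can be illustrated with a minimal sketch. The types below are simplified stand-ins, not the library's definitions: they use String where the library uses ByteString, and the names `AminoTag`, `header`, `seqData`, `quality`, `gcCount` and `asNuc` are hypothetical.

```haskell
-- Simplified stand-ins (assumption: String instead of ByteString).
data Nuc        -- tag for nucleotide sequences; has no values
data AminoTag   -- hypothetical tag for protein sequences

data Sequence a = Seq
  { header  :: String        -- full header line
  , seqData :: String        -- the residues themselves
  , quality :: Maybe [Int]   -- optional per-position quality
  }

-- Accepts nucleotide sequences only; a `Sequence AminoTag` is rejected
-- by the type checker even though the runtime representation is identical.
gcCount :: Sequence Nuc -> Int
gcCount = length . filter (`elem` "GCgc") . seqData

-- Re-tagging rebuilds the value so that the phantom parameter can change.
asNuc :: Sequence a -> Sequence Nuc
asNuc (Seq h d q) = Seq h d q
```

Loaded in GHCi, `gcCount (Seq "s1" "ACGTGG" Nothing)` type checks, while applying `gcCount` to a value annotated as `Sequence AminoTag` does not unless it is first passed through `asNuc`.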
|
|
|
|
|
| An offset, index, or length of a SeqData
|
|
|
| The basic data type used in Sequences
|
|
|
| Basic type for quality data. Range 0..255. Typical Phred output is in
the range 6..50, with 20 as the line in the sand separating good from bad.
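As a rough illustration of what these numbers mean, the sketch below converts a Phred score to an error probability using the usual Phred definition Q = -10 log10 P. It assumes that Qual is an 8-bit word, as the stated range 0..255 suggests; `errorProb` and `isGood` are hypothetical helper names.

```haskell
import Data.Word (Word8)

-- Assumption: quality values are stored as 8-bit words (range 0..255).
type Qual = Word8

-- Phred definition: Q = -10 * log10 P, hence P = 10 ** (-Q / 10).
errorProb :: Qual -> Double
errorProb q = 10 ** (negate (fromIntegral q) / 10)

-- The cut-off of 20 is the rule of thumb quoted in the description.
isGood :: Qual -> Bool
isGood q = q >= 20
```

Under this definition a score of 20 corresponds to an error probability of about one in a hundred, and a score of 50 to about one in a hundred thousand.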
|
|
|
| Quality data is a Qual vector, currently implemented as a ByteString.
|
|
| Accessor functions
|
|
|
| Return sequence length.
|
|
|
| Return sequence label (first word of header)
|
|
|
| Return full header.
|
|
|
| Return the sequence data.
|
|
|
| Return the quality data, or raise an error if none exists. Use hasqual if in doubt.
|
|
|
| Read the character at the specified position in the sequence.
|
|
|
|
|
| Modify the header by appending text, or by replacing
all but the sequence label (i.e. first word).
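The relation between label and header can be sketched on plain strings. This is an illustration only: it assumes that the label ends at the first whitespace character and that appended text is separated by a single space. The names `label`, `appendToHeader` and `replaceDescription` are hypothetical.

```haskell
import Data.Char (isSpace)

-- First word of the header (assumption: words are separated by whitespace).
label :: String -> String
label = takeWhile (not . isSpace)

-- Append extra text to the header, separated by one space.
appendToHeader :: String -> String -> String
appendToHeader extra hdr = hdr ++ " " ++ extra

-- Keep the label and replace everything after it.
replaceDescription :: String -> String -> String
replaceDescription new hdr = label hdr ++ " " ++ new
```

For the header `seq1 Homo sapiens`, the label is `seq1`, and replacing the description with `unknown` gives `seq1 unknown`.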
|
|
| Converting to and from String.
|
|
|
| Convert a String to SeqData
|
|
|
| Convert a SeqData to a String
|
|
| Nucleotide functionality.
|
|
|
| Complement a single character, i.e. identify the nucleotide it
can hybridize with. Note that for multiple nucleotides, you usually
want the reverse complement (see revcompl for that).
|
|
|
| Calculate the reverse complement.
This is only relevant for the nucleotide alphabet,
and it leaves other characters unmodified.
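A minimal sketch of the two operations on plain strings, assuming that only the letters A, C, G and T in either case are complemented and that every other character passes through unchanged. The names `complement` and `reverseComplement` are hypothetical, chosen so as not to be confused with the library's functions.

```haskell
import Data.Maybe (fromMaybe)

-- Complement of one nucleotide; any other character is returned as is.
complement :: Char -> Char
complement c = fromMaybe c (lookup c (zip "ACGTacgt" "TGCAtgca"))

-- Complement every position, then reverse, so the result reads
-- 5' to 3' on the opposite strand.
reverseComplement :: String -> String
reverseComplement = reverse . map complement
```

For example, `reverseComplement "AACGT"` yields `"ACGTT"`, and an ambiguity code such as `N` is kept as `N`.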
|
|
|
| Calculate the reverse complement for SeqData only.
|
|
|
| For type tagging sequences (protein sequences use Amino below)
|
|
|
|
| Phantom type functionality
|
|
| Protein sequence functionality
|
|
|
| Constructors: Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Tyr, Trp, Val, STP, Asx, Glx, Xle, Xaa
|
|
|
| Translate a nucleotide sequence into the corresponding protein
sequence. This works rather blindly, with no attempt to identify ORFs
or otherwise QA the result.
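What "blind" translation amounts to can be sketched as follows. This is not the library's code: it works on strings, returns one-letter codes rather than Amino values, uses the standard genetic code, and maps any codon containing a non-ACGT character to `X`. The names `codeTable`, `codon` and `translateFrom` are hypothetical.

```haskell
import Data.Char (toUpper)
import Data.List (elemIndex)

-- Standard genetic code with codons enumerated in base order T, C, A, G;
-- '*' marks a stop codon.
codeTable :: String
codeTable = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

-- Translate one codon; codons with unexpected characters become 'X'.
codon :: String -> Char
codon [a, b, c] =
  case mapM (\x -> elemIndex (toUpper x) "TCAG") [a, b, c] of
    Just [i, j, k] -> codeTable !! (16 * i + 4 * j + k)
    _              -> 'X'
codon _ = 'X'

-- Translate from a given offset, three bases at a time, without looking
-- for start or stop codons; a trailing partial codon is dropped.
translateFrom :: Int -> String -> String
translateFrom off = go . drop off
  where
    go (a:b:c:rest) = codon [a, b, c] : go rest
    go _            = []
```

Since nothing checks for an open reading frame, `translateFrom 0 "ATGGGCTAA"` gives `"MG*"` with the stop codon included, and translation continues past stop codons in longer input.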
|
|
|
| Convert a sequence in IUPAC format to a list of amino acids.
|
|
|
| Convert a list of amino acids to a sequence in IUPAC format.
|
|
|
|
| File formats
|
|
| The Fasta file format (Bio.Sequence.Fasta)
|
|
|
| Lazily read sequences from a FASTA-formatted file
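The idea of lazy reading can be sketched with a small parser over String input. It is a simplified illustration and not the library's parser: it returns (header, sequence) pairs instead of Sequence values, and `parseFasta` is a hypothetical name.

```haskell
-- Split FASTA text into (header, sequence) pairs. `lines` and `break`
-- are lazy, so each record is produced as soon as its lines are consumed.
parseFasta :: String -> [(String, String)]
parseFasta = go . dropWhile (not . isHeader) . lines
  where
    isHeader l = take 1 l == ">"
    go (h:rest) =
      let (body, more) = break isHeader rest
      in (drop 1 h, concat body) : go more
    go [] = []
```

Combined with the lazy `readFile` from the Prelude, `take 10 . parseFasta <$> readFile path` only needs to read as much of the file as the first ten records (plus buffering) require.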
|
|
|
| Lazily read sequences from a handle.
|
|
|
| Write sequences to a FASTA-formatted file.
Line length is 60.
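The output layout can be sketched as below: a `>` line with the header, followed by the sequence wrapped at 60 characters. The helper names `chunk` and `formatFasta` are hypothetical, and the sketch works on strings rather than Sequence values.

```haskell
-- Break a string into pieces of at most n characters (n must be positive).
chunk :: Int -> String -> [String]
chunk _ [] = []
chunk n s  = let (a, b) = splitAt n s in a : chunk n b

-- One FASTA record: '>' plus header, then the sequence in 60-column lines.
formatFasta :: String -> String -> String
formatFasta hdr sq = unlines (('>' : hdr) : chunk 60 sq)
```

A 130-residue sequence therefore occupies three sequence lines of 60, 60 and 10 characters.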
|
|
|
| Write sequences in FASTA format to a handle.
|
|
| Quality data
|
|
| Not part of the Fasta format, and treated separately.
|
|
|
| Read quality data for sequences from a file.
|
|
|
| Write quality data for sequences to a file.
|
|
|
|
|
| Read sequences and their associated quality data. Will error if
the sequences and qualities do not match one-to-one and in the same order.
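The one-to-one requirement can be sketched as a pairing function over two record lists. Exactly what the library compares is not stated here, so the check below (equal labels and equal lengths) is an assumption, and `pairQual` is a hypothetical name.

```haskell
-- Pair sequence records with quality records, insisting that they line up
-- one-to-one and in the same order. Assumption: a match means the same
-- label and the same number of positions.
pairQual :: [(String, String)] -> [(String, [Int])] -> [(String, String, [Int])]
pairQual ((l, s) : ss) ((l', q) : qs)
  | l == l' && length s == length q = (l, s, q) : pairQual ss qs
  | otherwise = error ("sequence/quality mismatch at " ++ l)
pairQual [] [] = []
pairQual _  _  = error "different number of sequence and quality records"
```

Because the result list is built lazily, a mismatch is only reported when the offending record is actually demanded.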
|
|
|
| Write sequence and quality data simultaneously.
This may be more laziness-friendly.
|
|
|
|
| The FastQ format (Bio.Sequence.FastQ)
|
|
|
|
|
|
|
|
|
|
| The phd file format (Bio.Sequence.Phd)
|
|
| These contain base (nucleotide) calling information,
and are generated by phred.
|
|
|
| Parse a .phd file, extracting the contents as a Sequence
|
|
|
| Parse .phd contents from a handle
|
|
| TwoBit file format support (Bio.Sequence.TwoBit)
|
|
| Used by BLAT and related tools.
|
|
|
| Parse a (lazy) ByteString as sequences in the 2bit format.
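The core of the 2bit encoding is the packing of four bases into each byte. The sketch below decodes only that packed DNA, following the UCSC format description (T = 00, C = 01, A = 10, G = 11, first base in the most significant bits). It ignores the file header, the sequence index, and the N and mask blocks, and `decodeByte` and `decodeBases` are hypothetical names rather than library functions.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- Unpack one byte into four bases, highest two bits first.
decodeByte :: Word8 -> String
decodeByte w = [ base ((w `shiftR` s) .&. 3) | s <- [6, 4, 2, 0] ]
  where
    base :: Word8 -> Char
    base n = "TCAG" !! fromIntegral n   -- T = 00, C = 01, A = 10, G = 11

-- Decode packed bytes and keep the first n bases, since the last byte
-- is padded when the sequence length is not a multiple of four.
decodeBases :: Int -> [Word8] -> String
decodeBases n = take n . concatMap decodeByte
```

For instance, the byte 0x1B (binary 00011011) decodes to `"TCAG"`.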
|
|
|
| Read sequences from a file in 2bit format and
unmarshall/deserialize into Sequence format.
|
|
|
| Read sequences from a file handle in the 2bit format and
unmarshall/deserialize into Sequence format.
|
|
| Hashing functionality (Bio.Sequence.HashWord)
|
|
| Packing words from sequences into integral data types
|
|
|
| A record containing a set of hashing functions.
| Constructors: HF, with fields
| hash :: SeqData -> Offset -> Maybe k (calculates the hash at a given offset in the sequence)
| hashes :: SeqData -> [(k, Offset)] (calculates all hashes from a sequence, and their indices)
| ksort :: [k] -> [k] (for sorting hashes)
|
|
|
|
|
|
| contigous constructs an integral value from a contiguous k-word.
|
|
|
| Like contigous, but returns the same hash for a word and its reverse complement.
|
|
|
| Like rcontig, but ignoring monomers (i.e. arbitrarily long runs of a single nucleotide
are treated the same as a single nucleotide).
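The three hashing schemes can be illustrated on plain strings. The encoding used here (two bits per base with A = 0, C = 1, G = 2, T = 3, and the smaller of the two strand keys as the strand-independent key) is an assumption made for the sketch, not necessarily what the library does. All names (`packWord`, `packWordRC`, `collapseRuns`, `allWords`) are hypothetical.

```haskell
import Data.List (elemIndex, group, tails)
import Data.Maybe (fromMaybe)

-- Pack a k-word into an Integer, two bits per base. Returns Nothing if
-- the word contains a character outside ACGT.
packWord :: String -> Maybe Integer
packWord w = foldl step 0 <$> mapM (`elemIndex` "ACGT") w
  where step acc d = 4 * acc + toInteger d

-- Strand-independent key: a word and its reverse complement map to the
-- same value (here, the smaller of the two packed values).
packWordRC :: String -> Maybe Integer
packWordRC w = min <$> packWord w <*> packWord (revComp w)
  where
    revComp = reverse . map comp
    comp c  = fromMaybe c (lookup c (zip "ACGT" "TGCA"))

-- Collapse every run of identical nucleotides to a single nucleotide.
collapseRuns :: String -> String
collapseRuns = concatMap (take 1) . group

-- All k-word keys of a sequence, paired with their offsets.
allWords :: Int -> String -> [(Integer, Int)]
allWords k s =
  [ (h, i)
  | (i, w) <- zip [0 ..] (map (take k) (tails s))
  , length w == k
  , Just h <- [packWord w]
  ]
```

With this encoding, `packWordRC "AC"` and `packWordRC "GT"` return the same key, and `collapseRuns "AAACCGTTT"` gives `"ACGT"`, which can then be packed like any other word.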
|
|
| Entropy calculations
|
|
|
|
|
|
|
| Produced by Haddock version 2.4.2 |