    cabal update
    cabal install <program>
should Just Work. For more information, see this page.
Some packages are available as Debian .deb packages, which may also work on Ubuntu and other Debian-derived systems. They are probably not quite up to date, but when they work, they are probably less hassle to install. They can be found at http://malde.org/~ketil/debian.
List of Applications
A50 is a utility to evaluate genome assemblies. As an alternative to N50, which only provides a single number, A50 generates a graph by ordering all contigs by size, and plotting the cumulative size of the assembly vs contig numbers. This makes it easy to compare assemblies at a glance.
A50 uses the bio library.
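To make the idea concrete, here is a minimal Python sketch of the data behind such a plot. It is not A50's own code: the function names are hypothetical, and sorting from largest to smallest is an assumption, since the description above only says the contigs are ordered by size. N50 is included for comparison, using its usual definition (the length of the contig at which the running total first reaches half the assembly size).

```python
from itertools import accumulate


def cumulative_sizes(contig_lengths):
    """Return running totals of contig lengths, largest contig first.

    Plotting these totals against the contig index (1, 2, 3, ...) gives
    the kind of curve described above. The descending order is an
    assumption of this sketch.
    """
    ordered = sorted(contig_lengths, reverse=True)
    return list(accumulate(ordered))


def n50(contig_lengths):
    """Return the N50: the length of the contig at which the running
    total first reaches half of the total assembly size."""
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("cannot compute N50 of an empty assembly")
    half = sum(ordered) / 2
    for length, total in zip(ordered, accumulate(ordered)):
        if total >= half:
            return length


# Hypothetical contig lengths, for illustration only.
lengths = [100, 400, 250, 50, 200]
print(cumulative_sizes(lengths))  # [400, 650, 850, 950, 1000]
print(n50(lengths))               # 250
```

Two assemblies with the same N50 can produce quite different curves, which is why plotting the whole list can be more informative than the single number.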
Flower reads SFF files from 454 sequencing and can produce a variety of outputs. These include a textual format intended to make the raw flowgram information easily accessible. Flower is now part of the biosff package.
FlowSim is a simulator pipeline for 454 pyrosequencing. It comes with clonesim, a separate tool that simulates clones as random fragments of the input sequences, and flowsim, which simulates the pyrosequencing reaction and generates the corresponding SFF file.
The development version (available from the darcs repo) supports quality clipping, non-uniform clone coverage, adapter sequences (which aren’t always properly clipped by the 454 pipeline), and PCR mutations.
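The clone simulation step, drawing random fragments from the input sequences, can be illustrated with a short Python sketch. This is not clonesim's implementation: the function name, its parameters, and the uniform choice of fragment length and start position are all assumptions made for illustration.

```python
import random


def sample_fragments(sequence, count, min_len, max_len, rng=None):
    """Draw `count` random substrings ("clones") from `sequence`.

    Fragment lengths and start positions are chosen uniformly, which is
    an assumption of this sketch rather than a description of clonesim.
    """
    if rng is None:
        rng = random.Random()
    if not 0 < min_len <= max_len <= len(sequence):
        raise ValueError("need 0 < min_len <= max_len <= len(sequence)")
    fragments = []
    for _ in range(count):
        length = rng.randint(min_len, max_len)           # inclusive bounds
        start = rng.randint(0, len(sequence) - length)   # fragment must fit
        fragments.append(sequence[start:start + length])
    return fragments


# Hypothetical input sequence; a seeded generator makes the run repeatable.
genome = "ACGTTGCAAGGCTTACCGATAGGCTAACGT"
for fragment in sample_fragments(genome, 3, 5, 10, random.Random(42)):
    print(fragment)
```

Passing an explicit random generator keeps the sampling reproducible, which is useful when simulated reads are used to benchmark other tools.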
BAM files contain alignments of (short) reads against a reference. This tool helps to evaluate their quality by generating various statistics and plots.
transalign - more sensitive pairwise alignment
Transalign is a program to generate highly sensitive alignments (typically to some curated sequence database) by using a large intermediate database (typically NR or UniProt). See also this PLoS ONE article.
kmc - a k-mer counter
kmc is a k-mer counter. It reads nucleotide sequences in FASTA or FASTQ format to generate an index, and can then extract various information from that index.
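For readers unfamiliar with the term, k-mer counting means tallying every substring of length k in a sequence. The Python sketch below shows the basic idea only. It says nothing about how kmc builds or stores its index; the function name is hypothetical, and skipping k-mers that contain characters other than A, C, G and T is an assumption of the sketch.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every length-k substring of a nucleotide sequence.

    K-mers containing anything other than A, C, G or T (for example N)
    are skipped; that policy is an assumption of this sketch.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            counts[kmer] += 1
    return counts


counts = count_kmers("ACGTACGT", 3)
print(counts["ACG"], counts["CGT"], counts["GTA"], counts["TAC"])  # 2 2 1 1
```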
varan - a variant analyzer
In addition to measures like Fst and nucleotide diversity, varan implements a set of measures based on allele frequency confidence intervals, including expected site information.
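As a reminder of what the two standard measures capture, here is a small Python sketch for a single biallelic site. It uses common textbook definitions: Fst as (H_T - H_S) / H_T for two equally weighted populations, and per-site nucleotide diversity as the fraction of sample pairs that differ. Whether varan uses these exact estimators is not stated here, so treat the formulas and the function names as assumptions.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site."""
    return 2.0 * p * (1.0 - p)


def fst(p1, p2):
    """Fst = (H_T - H_S) / H_T for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # within populations
    h_t = heterozygosity((p1 + p2) / 2.0)                  # pooled
    if h_t == 0.0:
        return 0.0  # site is fixed for the same allele in both populations
    return (h_t - h_s) / h_t


def site_diversity(ref_count, alt_count):
    """Fraction of sample pairs that differ at one biallelic site."""
    n = ref_count + alt_count
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    return ref_count * alt_count / (n * (n - 1) / 2.0)


print(fst(0.5, 0.5))         # 0.0, identical allele frequencies
print(fst(0.0, 1.0))         # 1.0, fixed for different alleles
print(site_diversity(3, 1))  # 0.5, 3 of 6 pairs differ
```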