Release 2.05.31 - 19 March 2010
SAM Report Format
- SAM format, fix problem with @RG tags where the colon in the RG:Z: tag was duplicated.
- In SAM format, correct the MD field so that it conforms to the specified regular expression "[0-9]+(([ACGTN]|\^[ACGTN]+)[0-9]+)*". This required adding counts of 0 matched bases between mismatches and at the end of the MD tag. For example, the earlier MD:Z:31C30T3^CA3TC6TCT is now reported as MD:Z:31C30T3^CA3T0C6T0C0T0
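
The MD correction can be illustrated with a short Python sketch. It is not the Novoalign implementation; the function name normalize_md and the tokenising approach are assumptions made for illustration. It inserts a 0 count wherever two non-numeric items (mismatched bases or ^deletions) are adjacent, and at the start or end of the tag if no count is present there.

```python
import re

# Regular expression quoted in the release note for a valid MD value.
MD_SPEC = re.compile(r"[0-9]+(([ACGTN]|\^[ACGTN]+)[0-9]+)*")

# Tokens: a match count, a deletion (^ plus bases), or one mismatched base.
_TOKEN = re.compile(r"[0-9]+|\^[ACGTN]+|[ACGTN]")


def normalize_md(md):
    """Insert explicit 0 counts so an old-style MD value matches MD_SPEC.

    Hypothetical helper for illustration only.
    """
    tokens = _TOKEN.findall(md)
    if "".join(tokens) != md:
        raise ValueError("unexpected characters in MD value: %r" % md)

    out = []
    prev_is_count = False  # no count seen yet, so a leading base gets a 0
    for tok in tokens:
        is_count = tok.isdigit()
        if not is_count and not prev_is_count:
            out.append("0")  # zero matched bases between two edits
        out.append(tok)
        prev_is_count = is_count
    if not prev_is_count:
        out.append("0")  # the tag must end with a count
    return "".join(out)


if __name__ == "__main__":
    fixed = normalize_md("31C30T3^CA3TC6TCT")
    print(fixed)  # 31C30T3^CA3T0C6T0C0T0
    print(bool(MD_SPEC.fullmatch(fixed)))  # True
```

The sketch assumes the old-style value never has a mismatch directly after a deletion, because without the 0 count the mismatched base cannot be distinguished from a deleted base.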
Soft Clipped Alignments
Soft clipping was introduced in V2.05.25 and trims alignments back to the best local alignment.
- Native report format - Added reporting of the number of bases soft clipped from the 3' end of the alignment for soft clipped alignments.
- Fix assert failure when using soft clipping of alignments.
- Fix problem in dinucleotide filter when used with filter scores that allowed gaps longer than 7bp in the alignment. Previous versions may have filtered out some dinucleotide reads whose alignment score was higher than the specified threshold for the filter.
Reporting Multiple Alignments per Read
- Fix problem with use of -rAll and -rExhaustive to report multiple alignments per read where occasionally multiple alignments were reported for the same location but with differing edits. One alignment might have an indel and another mismatches. This fix will also improve quality scores for some alignments.
Paired End Adapter Trimming
- Fixed a seg fault that could occur when using paired end adapter trimming and a read trimmed to less than index k-mer length.
- Allow more threads to be created using -c option than there are CPUs on the system. Previous versions limited threads to sysconf(_SC_NPROCESSORS_ONLN) which may be incorrect on some systems running hyper-threading.
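
The change to the thread limit can be sketched as follows. This is an illustration only, not Novoalign code: the function names and the cap_to_cpus switch are hypothetical. Python's os.sysconf("SC_NPROCESSORS_ONLN") stands in for the C call sysconf(_SC_NPROCESSORS_ONLN), with os.cpu_count() as a fallback where that call is unavailable.

```python
import os


def online_cpus():
    """Number of online CPUs as reported by the operating system."""
    try:
        return os.sysconf("SC_NPROCESSORS_ONLN")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing or the name is unsupported on this platform.
        return os.cpu_count() or 1


def worker_thread_count(requested, cpus=None, cap_to_cpus=False):
    """Threads to start for a requested -c value (hypothetical helper).

    cap_to_cpus=True models the earlier behaviour (limit to online CPUs);
    cap_to_cpus=False models the behaviour described in this release.
    """
    if requested < 1:
        raise ValueError("thread count must be at least 1")
    if cpus is None:
        cpus = online_cpus()
    return min(requested, cpus) if cap_to_cpus else requested


if __name__ == "__main__":
    # On a system reporting 8 CPUs, a request for 16 threads:
    print(worker_thread_count(16, cpus=8, cap_to_cpus=True))   # 8
    print(worker_thread_count(16, cpus=8, cap_to_cpus=False))  # 16
```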
MS Windows Format Files
- Allow MS Windows format text files as input (i.e. with CR/LF line separators).
- When indexing genomes greater than 4Gbp, Novoindex needs to use a step size greater than 1. If K & S are left at their defaults this happens automatically; however, it was possible to set -s 1 on the command line and so create an invalid index. This change forces s=2 for reference genomes >4Gbp. Minimum s = INT((Reference Genome Size)/4^16)
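
The arithmetic behind the step size limit can be worked through in a short Python sketch. It is an interpretation of the note, not Novoindex code, and the function name is hypothetical. It assumes "4Gbp" means 4^16 = 4,294,967,296 bases. Evaluated literally, the INT formula gives 1 for genomes between 4^16 and 2 x 4^16 bases, while the note states that s=2 is forced above 4Gbp, so the sketch takes the larger of the two values.

```python
INDEX_LIMIT = 4 ** 16  # 4,294,967,296 bases; assumed meaning of "4Gbp"


def minimum_step_size(genome_size_bp):
    """Smallest step size s for a reference of the given length.

    Hypothetical helper combining the two statements in the release note.
    """
    if genome_size_bp < 1:
        raise ValueError("genome size must be positive")
    literal = genome_size_bp // INDEX_LIMIT  # INT(size / 4^16)
    forced = 2 if genome_size_bp > INDEX_LIMIT else 1  # s=2 above 4Gbp
    return max(literal, forced)


if __name__ == "__main__":
    for size in (3_000_000_000, 5_000_000_000, 13_000_000_000):
        print(size, minimum_step_size(size))
    # 3000000000 1
    # 5000000000 2
    # 13000000000 3
```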