Tools#
Assembler#
Flye#
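Flye is a de novo assembler for single-molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies.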
bash:
conda install -c bioconda flye
flye --nano-raw \
    ~/course_data/precompiled/all_guppy.fastq \
    --genome-size 1m --out-dir ./flye_output
miniasm#
Miniasm is a very fast OLC-based de novo assembler for noisy long reads.
miniasm:
conda install -c bioconda miniasm
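miniasm writes its assembly as a graph in GFA format rather than as FASTA, so downstream tools that expect FASTA need a conversion step. The following is a minimal sketch of that step, assuming tab-delimited GFA in which segment records start with `S`; the function name `gfa_to_fasta` and the file names are illustrative, not part of miniasm.

```shell
# Convert GFA segment (S) records to FASTA.
# Reads GFA on stdin and writes FASTA on stdout.
# In an S record, column 2 is the segment name and column 3 its sequence.
gfa_to_fasta() {
    awk -F'\t' '$1 == "S" { print ">" $2; print $3 }'
}

# Hypothetical usage (file names are placeholders):
#   gfa_to_fasta < assembly.gfa > assembly.fasta
```

Header (`H`) and link (`L`) records are skipped, so only the contig sequences reach the FASTA file.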
quickmerge#
Quickmerge merges genome assemblies to improve their contiguity. It also works with hybrid assemblies made by combining long reads and Illumina short reads.
quickmerge:
conda install -c conda-forge -c bioconda quickmerge
# using wrapper
merge_wrapper.py hybrid_assembly.fasta self_assembly.fasta
Trimmer#
NanoFilt#
NanoFilt filters and trims Oxford Nanopore sequencing reads.
NanoFilt:
NanoFilt -l 500 --headcrop 10 -q 10 < ./Q5705/data/${sample}_Blockchain_v6.0.1.fastq > ./Q5705/trimmed/${sample}_Blockchain_v6.0.1.trimmed.fastq
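Before settling on a length cutoff such as `-l 500`, it can help to see how many reads would pass it. The sketch below counts reads at or above a minimum length with plain awk. It assumes a FASTQ file with exactly four lines per record, and it does not reproduce NanoFilt's quality filtering or head cropping. The function name `count_min_length` is hypothetical.

```shell
# Count reads whose sequence length is at least a given minimum.
# $1 = minimum read length; FASTQ is read from stdin (4 lines per record).
# Prints "kept/total".
count_min_length() {
    awk -v min="$1" '
        NR % 4 == 2 {                      # line 2 of each record is the sequence
            total++
            if (length($0) >= min) kept++
        }
        END { printf "%d/%d\n", kept, total }
    '
}

# Hypothetical usage (file name is a placeholder):
#   count_min_length 500 < reads.fastq
```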
Porechop#
Porechop is a tool for finding and removing adapters from Oxford Nanopore reads.
Porechop:
conda install -c bioconda porechop
porechop -i input_reads.fastq.gz -o output_reads.fastq.gz
Graph#
VG#
minigraph#
pggb#
Alignment#
minimap2#
paftools#
paftools.js is a script that processes alignments in the PAF format: it converts between formats, evaluates mapping accuracy, lifts over BED files based on alignment, and calls variants from assembly-to-assembly alignments. The script requires the k8 JavaScript shell to run. On Linux or Mac, a precompiled k8 binary can be downloaded from the k8 release page.
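Separately from the k8 setup, it may help to see what a PAF record contains. PAF is tab-delimited with 12 mandatory columns; column 10 holds the number of matching residues and column 11 the alignment block length. The sketch below uses awk, not paftools.js, to report a per-alignment identity from those two columns. The function name `paf_identity` is hypothetical.

```shell
# Report query name, target name, and identity for each PAF record.
# Identity here is column 10 (matching residues) divided by
# column 11 (alignment block length). PAF is read from stdin.
paf_identity() {
    awk -F'\t' 'NF >= 12 && $11 > 0 {
        printf "%s\t%s\t%.4f\n", $1, $6, $10 / $11
    }'
}

# Hypothetical usage (file name is a placeholder):
#   paf_identity < alignments.paf
```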
Winnowmap#
Winnowmap is a long-read mapping algorithm optimized for mapping ONT and PacBio reads to repetitive reference sequences. Its development began on top of the minimap2 codebase, and its authors have since incorporated two ideas to improve mapping accuracy within repeats: weighted minimizer sampling and minimal confidently alignable substrings (MCASs).
MUMmer#
dnadiff#
Consensus#
Racon#
Racon is a standalone consensus-building tool that can be coupled with a fast assembler such as miniasm, which performs de novo assembly of error-prone long reads without an error-correction step. This pairing dramatically cuts down the time needed for sequence assembly and consensus generation. Racon stands for Rapid Consensus, and it can be used with PacBio and Oxford Nanopore data.
Graph#
Bandage#
SAM/BAM#
Samtools#
bash:
samtools view -S -b ${sample}_trimmed.sam > ${sample}_trimmed.bam
samtools sort -o ${sample}_trimmed.sorted.bam ${sample}_trimmed.bam
samtools index ${sample}_trimmed.sorted.bam

# get the total number of reads in a BAM file (may include unmapped and duplicated multi-aligned reads)
samtools view -c SAMPLE.bam

# count only mapped (primary aligned) reads: -F 260 excludes unmapped (4) and secondary (256) records
samtools view -c -F 260 SAMPLE.bam
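The value 260 passed to `-F` is a sum of SAM flag bits, which is easier to audit when the bits are named. The sketch below composes the mask with shell arithmetic; the variable names are illustrative, while the bit values come from the SAM specification.

```shell
# SAM flag bits relevant to read counting.
FLAG_UNMAPPED=4           # 0x4:   read is unmapped
FLAG_SECONDARY=256        # 0x100: secondary alignment
FLAG_SUPPLEMENTARY=2048   # 0x800: supplementary alignment

# Mask used above: exclude unmapped and secondary records.
exclude_mask=$((FLAG_UNMAPPED | FLAG_SECONDARY))
echo "$exclude_mask"      # prints 260

# Stricter mask that also excludes supplementary records.
strict_mask=$((exclude_mask | FLAG_SUPPLEMENTARY))
echo "$strict_mask"       # prints 2308
```

With `-F 260`, supplementary alignments of split reads are still counted, which matters for long reads. Passing `-F 2308` instead counts each mapped read once.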
SV Caller#
CuteSV#
Option:
> For ONT data:
--max_cluster_bias_INS 100
--diff_ratio_merging_INS 0.3
--max_cluster_bias_DEL 100
--diff_ratio_merging_DEL 0.3
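The ONT options above are appended to an ordinary cuteSV call, which takes a sorted BAM, a reference FASTA, an output VCF, and a working directory as positional arguments. The sketch below only prints such a command so it can be inspected before running. The function name `cutesv_ont_cmd` and all file names are hypothetical.

```shell
# ONT-specific options from the list above, kept in one variable for reuse.
CUTESV_ONT_OPTS="--max_cluster_bias_INS 100 --diff_ratio_merging_INS 0.3 --max_cluster_bias_DEL 100 --diff_ratio_merging_DEL 0.3"

# Print (do not run) a cuteSV command line.
# Arguments: sorted BAM, reference FASTA, output VCF, working directory.
cutesv_ont_cmd() {
    echo "cuteSV $1 $2 $3 $4 $CUTESV_ONT_OPTS"
}

# Placeholder file names, for illustration only.
cutesv_ont_cmd sample.sorted.bam reference.fasta sample.vcf ./cutesv_work
```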