Tools#

FastQ Quality Scores

Assembler#

Assembler…

Flye#

Flye is a de novo assembler for single-molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore.

bash:

# install Flye from Bioconda
conda install -c bioconda flye
# assemble raw Nanopore reads, with an expected genome size of about 1 Mb
flye --nano-raw \
  ~/course_data/precompiled/all_guppy.fastq \
  --genome-size 1m --out-dir ./flye_output

miniasm#

Miniasm is a very fast OLC-based de novo assembler for noisy long reads.

miniasm:

conda install -c bioconda miniasm
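Miniasm works on all-vs-all read overlaps and writes its layout as a GFA graph rather than FASTA. The following is a minimal sketch of that workflow, assuming minimap2 is also installed. The file names (`overlaps.paf.gz`, `assembly.gfa`, `assembly.fasta`) are placeholders, and `gfa_to_fasta` and `assemble_with_miniasm` are hypothetical helper names, not part of miniasm.

```shell
# Convert GFA segment lines (S <name> <sequence>) into FASTA records.
gfa_to_fasta() {
    awk '$1 == "S" { print ">" $2; print $3 }' "$1"
}

# Overlap, layout, and export. Output file names are placeholders.
assemble_with_miniasm() {
    reads=$1
    # all-vs-all overlaps for Nanopore reads
    minimap2 -x ava-ont "$reads" "$reads" | gzip -1 > overlaps.paf.gz
    # layout: miniasm writes the assembly graph in GFA format
    miniasm -f "$reads" overlaps.paf.gz > assembly.gfa
    gfa_to_fasta assembly.gfa > assembly.fasta
}
```

It would be called as `assemble_with_miniasm reads.fastq`. The resulting contigs keep roughly the error rate of the raw reads, so a consensus step usually follows.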

quickmerge#

Quickmerge merges two genome assemblies to improve contiguity. It also works with hybrid assemblies made by combining long reads and Illumina short reads.

quickmerge:

conda install -c conda-forge -c bioconda quickmerge
# using wrapper
merge_wrapper.py hybrid_assembly.fasta self_assembly.fasta

Trimmer#

Nanofilt#

NanoFilt trims and filters Oxford Nanopore sequencing reads.

Nanofilt:

NanoFilt -l 500 --headcrop 10 -q 10 \
  < "./Q5705/data/${sample}_Blockchain_v6.0.1.fastq" \
  > "./Q5705/trimmed/${sample}_Blockchain_v6.0.1.trimmed.fastq"
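In this command, `-l 500` drops reads shorter than 500 bases, `--headcrop 10` trims the first 10 bases of each read, and `-q 10` keeps reads whose average quality is at least 10. To make the quality threshold concrete, a Phred score Q corresponds to an error probability of 10^(-Q/10). The sketch below converts one to the other with awk; `phred_to_error` is a hypothetical helper name, not a NanoFilt feature.

```shell
# Error probability for a Phred quality score Q: P = 10^(-Q/10).
phred_to_error() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 10 ^ (-q / 10) }'
}

phred_to_error 10   # the threshold passed to -q; prints 0.1000
phred_to_error 20   # prints 0.0100
```

A cutoff of Q10 therefore still admits reads with about one expected error per ten bases.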

Porechop#

Porechop is a tool for finding and removing adapters from Oxford Nanopore reads.

Porechop:

conda install -c bioconda porechop
porechop -i input_reads.fastq.gz -o output_reads.fastq.gz

Graph#

VG#

minigraph#

pggb#

Alignment#

minimap2#

paftools#

paftools.js is a script that processes alignments in the PAF format: it converts between formats, evaluates mapping accuracy, lifts over BED files based on alignment, and calls variants from assembly-to-assembly alignments. The script requires the k8 JavaScript shell to run; precompiled k8 binaries are available for Linux and Mac.
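Because PAF is plain tab-separated text, quick summaries can also be done without paftools.js. In the PAF format, column 10 holds the number of matching residues and column 11 the alignment block length, so their ratio gives a rough per-alignment identity. The sketch below is only an illustration of reading those columns; `paf_identity` is a hypothetical helper name.

```shell
# Print query name, target name, and matches / block length for each PAF record.
paf_identity() {
    awk -F '\t' '$11 > 0 { printf "%s\t%s\t%.4f\n", $1, $6, $10 / $11 }' "$1"
}
```

For example, `paf_identity alignments.paf` (a placeholder file name) prints one line per alignment.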

Winnowmap#

Winnowmap is a long-read mapping algorithm optimized for mapping ONT and PacBio reads to repetitive reference sequences. Its development began on top of the minimap2 codebase, and its authors have since incorporated two ideas to improve mapping accuracy within repeats.

MUMmer#

Options

dnadiff#

dnadiff readme

Consensus#

Racon#

Racon (Rapid Consensus) is a standalone consensus-building tool that can be coupled with a fast assembler such as miniasm, which performs de novo assembly of error-prone long reads without an error-correction step. This combination dramatically cuts down the time needed for sequence assembly and consensus generation. Racon can be used with both PacBio and Oxford Nanopore data.
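A single polishing round can be sketched as follows, assuming minimap2 and racon are on the `PATH`. Racon takes the reads, their overlaps against the draft, and the draft itself, in that order. The function name `polish_with_racon` and the intermediate file name `reads_to_draft.paf` are placeholders chosen for this example.

```shell
# One round of Racon polishing. All file names are placeholders.
polish_with_racon() {
    reads=$1   # long reads (FASTA or FASTQ)
    draft=$2   # draft assembly, for example from miniasm
    out=$3     # polished assembly to write

    # Map the reads back to the draft (map-ont preset; map-pb for PacBio CLR reads).
    minimap2 -x map-ont "$draft" "$reads" > reads_to_draft.paf
    # Build the consensus: racon <sequences> <overlaps> <target sequences>
    racon "$reads" reads_to_draft.paf "$draft" > "$out"
}
```

It would be called as `polish_with_racon reads.fastq assembly.fasta polished.fasta`. Further rounds can be run by passing the polished output back in as the draft.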

Graph#

Bandage#

How to Use Assembly Graphs with Metagenomic Datasets

SAM/BAM#

Samtools#

bash:

# convert SAM to BAM, then sort and index
samtools view -S -b ${sample}_trimmed.sam > ${sample}_trimmed.bam
samtools sort -o ${sample}_trimmed.sorted.bam ${sample}_trimmed.bam
samtools index ${sample}_trimmed.sorted.bam

# count all records in a BAM file (may include unmapped reads and multiple alignments of the same read)
samtools view -c SAMPLE.bam

# count only mapped reads, excluding secondary alignments
samtools view -c -F 260 SAMPLE.bam
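The value 260 passed to `-F` is the sum of two SAM flag bits: 0x4 (read unmapped) and 0x100 (secondary alignment). A small sketch of that arithmetic in shell, with illustrative variable names:

```shell
UNMAPPED=$(( 0x4 ))      # 4:   the read is unmapped
SECONDARY=$(( 0x100 ))   # 256: secondary (not primary) alignment
exclude=$(( UNMAPPED | SECONDARY ))
echo "$exclude"          # prints 260
```

Supplementary alignments (0x800) are not excluded by `-F 260`; adding that bit gives `-F 2308`.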

SV Caller#

CuteSV#

Option

> For ONT data:
--max_cluster_bias_INS     100
--diff_ratio_merging_INS   0.3
--max_cluster_bias_DEL     100
--diff_ratio_merging_DEL   0.3
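These options are appended to the basic cuteSV call, which takes a sorted BAM file, the reference FASTA, an output VCF path, and a working directory. A minimal sketch with the ONT settings listed here; the function name `call_svs_ont` and all file names are placeholders for illustration.

```shell
# Call structural variants from ONT alignments using the ONT settings.
call_svs_ont() {
    bam=$1; ref=$2; vcf=$3; workdir=$4
    mkdir -p "$workdir"   # cuteSV writes temporary files here
    cuteSV "$bam" "$ref" "$vcf" "$workdir" \
        --max_cluster_bias_INS 100 \
        --diff_ratio_merging_INS 0.3 \
        --max_cluster_bias_DEL 100 \
        --diff_ratio_merging_DEL 0.3
}
```

It would be called as `call_svs_ont sample.sorted.bam reference.fasta sample.vcf ./cutesv_work`.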