Calculating the molecular coverage of sequencing data

Calculating the coverage of sequencing data

Setting up the environment

conda create -n bamtools -c bioconda bamtools datamash -y

Running

# conda activate denovo_asm
# conda activate bamtools
# convert subreads.bam to subreads.fasta
bamtools convert -format fasta -in movie.subreads.bam -out movie.subreads.fasta
# generate a fasta file with just a single, median-length subread per molecule
python -m falcon_kit.mains.fasta_filter median movie.subreads.fasta > movie.median.fasta

samtools faidx movie.median.fasta
cut -f2 movie.median.fasta.fai | datamash sum 1 > movie.umy

The result, written to movie.umy, is the unique molecular yield; dividing it by the genome size gives the coverage.
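
The final division can also be scripted. The following is a minimal sketch, assuming movie.umy was produced by the commands above and that awk is available. The `coverage` function name and the `GENOME_SIZE` value are placeholders introduced here for illustration and are not part of the original workflow.

```shell
# coverage UMY GENOME_SIZE
# Prints UMY / GENOME_SIZE with two decimals; both arguments are in bases.
coverage() {
    awk -v umy="$1" -v gs="$2" 'BEGIN { printf "%.2f\n", umy / gs }'
}

# Placeholder value: set this to the genome size of your organism, in bp.
GENOME_SIZE=3000000000

# Only run when the yield file from the previous step is present.
if [ -f movie.umy ]; then
    coverage "$(cat movie.umy)" "$GENOME_SIZE"
fi
```

The printed number is the fold coverage. For a hypothetical yield of 60000000000 bases and the placeholder genome size above, the function prints 20.00.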

Formula
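
Written out from the description above, the calculation is a single ratio. The symbols below are chosen here for illustration: $L_i$ is the length of the median subread kept for molecule $i$ (column 2 of movie.median.fasta.fai), $N$ is the number of molecules, and $G$ is the genome size in bases.

```latex
\[
\text{unique molecular yield} = \sum_{i=1}^{N} L_i,
\qquad
\text{coverage} = \frac{\text{unique molecular yield}}{G}
\]
```

As a hypothetical example, a unique molecular yield of $6 \times 10^{10}$ bases on a genome of $3 \times 10^{9}$ bases corresponds to a coverage of 20.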