Classifying splicing events in homologous genes

Summarize each type of splicing event across all cotton species

Intron retention events

awk '$3~/IntronR/{print $2}' ../TM-1/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >intronR_count.txt 
awk '$3~/IntronR/{print $2}' ../D5/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>intronR_count.txt 
awk '$3~/IntronR/{print $2}' ../A2/end_third  |sort |uniq -c |awk '{print $2"\t"$1}' >>intronR_count.txt

Exon skipping

awk '$3~/ExonS/{print $2}' ../TM-1/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >exonSkip_count.txt 
awk '$3~/ExonS/{print $2}' ../D5/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>exonSkip_count.txt 
awk '$3~/ExonS/{print $2}' ../A2/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>exonSkip_count.txt

AltA: alternative acceptor (alternative 3' splice site)

awk '$3~/AltA/{print $2}' ../TM-1/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >AltA_count.txt 
awk '$3~/AltA/{print $2}' ../D5/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>AltA_count.txt 
awk '$3~/AltA/{print $2}' ../A2/end_third |sort |uniq -c |awk '{print $2"\t"$1}'  >>AltA_count.txt

AltD: alternative donor (alternative 5' splice site)

awk '$3~/AltD/{print $2}' ../TM-1/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >AltD_count.txt
awk '$3~/AltD/{print $2}' ../D5/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>AltD_count.txt
awk '$3~/AltD/{print $2}' ../A2/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>AltD_count.txt
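
The per-event commands above all follow one pattern, so they can be folded into a loop. The following is a minimal sketch, assuming the same `end_third` layout (column 2 = gene ID, column 3 = event type). The function names `count_events` and `summarize_events` and the `${event}_count.txt` output names are illustrative, so the outputs are named `IntronR_count.txt` and `ExonS_count.txt` rather than `intronR_count.txt` and `exonSkip_count.txt`.

```shell
#!/usr/bin/env bash
# Sketch: count events per gene for every event type and species.
# Assumes column 2 = gene ID and column 3 = event type, as in the commands above.

count_events() {
    # $1 = event pattern (e.g. IntronR), $2 = input file
    awk -v pat="$1" '$3 ~ pat {print $2}' "$2" | sort | uniq -c | awk '{print $2"\t"$1}'
}

summarize_events() {
    # $1 = directory holding TM-1/, D5/ and A2/ (hypothetical argument)
    # $2 = output directory (hypothetical argument)
    local base=$1 out=$2 event species
    for event in IntronR ExonS AltA AltD; do
        # Truncate first so that re-running does not append duplicate rows.
        : > "$out/${event}_count.txt"
        for species in TM-1 D5 A2; do
            count_events "$event" "$base/$species/end_third" >> "$out/${event}_count.txt"
        done
    done
}
```

Calling `summarize_events .. .` from the same working directory as the commands above would write one count file per event type into the current directory.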

Identify conserved splicing events by the length of the spliced-out fragment

First, tabulate the fragment lengths of the IR events and the ExonS events that occur in homologous genes

The results are as follows:

Count the transcripts of genes whose splicing-event counts are 0|0

GO enrichment analysis was performed on the genes that had no splicing events and exactly one isoform each

Use a script to filter out the events that are conserved across all genomes
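
The filtering script itself is not shown in these notes. As a rough illustration only, the sketch below keeps homolog groups in which every member gene has at least one event of a given type. It checks presence only and does not compare fragment lengths. The function name `conserved_groups` and the homolog table (one group per line, tab-separated gene IDs) are assumptions. The count file is assumed to have the `gene<TAB>count` layout produced earlier.

```shell
#!/usr/bin/env bash
# Sketch: report homolog groups where every member has the event.
# Hypothetical inputs (not from the original notes):
#   $1 = homolog table, one group per line, gene IDs separated by tabs
#   $2 = count file in gene<TAB>count format

conserved_groups() {
    awk -F'\t' '
        FILENAME == ARGV[1] { has[$1] = 1; next }   # count file: genes with the event
        {
            for (i = 1; i <= NF; i++)
                if (!($i in has)) next              # one member lacks the event: skip group
            print                                   # every member has the event
        }
    ' "$2" "$1"
}
```

For example, `conserved_groups homologs.tsv intronR_count.txt` would print the groups in which every listed gene shows intron retention, where `homologs.tsv` is a placeholder file name.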
