Classifying splicing events in homologous genes
Summarizing each type of splicing event across all cotton species
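The per-event commands in this section all follow the same pattern: select rows of `end_third` by the event type in column 3, then count occurrences of the gene ID in column 2. A minimal sketch of that pattern as a reusable function is given here. The function name `count_events` and its argument order are illustrative assumptions, not part of the original workflow.

```shell
# count_events PATTERN OUTFILE DIR...
# Hypothetical helper: for every species directory, count how many events
# matching PATTERN each gene has, and collect the counts in OUTFILE.
count_events() {
    pattern=$1
    out=$2
    shift 2
    : > "$out"    # truncate first, so rerunning does not duplicate rows
    for dir in "$@"; do
        # column 3 = event type, column 2 = gene ID (as in the commands of this section)
        awk -v pat="$pattern" '$3 ~ pat {print $2}' "$dir/end_third" |
            sort | uniq -c | awk '{print $2"\t"$1}' >> "$out"
    done
}
```

For example, `count_events IntronR intronR_count.txt ../TM-1 ../D5 ../A2` should produce the same table as the three intron retention commands. Truncating the output file up front avoids the duplicated rows that a plain `>>` would accumulate over repeated runs.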
Intron retention events
awk '$3~/IntronR/{print $2}' ../TM-1/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >intronR_count.txt
awk '$3~/IntronR/{print $2}' ../D5/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>intronR_count.txt
awk '$3~/IntronR/{print $2}' ../A2/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>intronR_count.txt
Exon skipping
awk '$3~/ExonS/{print $2}' ../TM-1/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >exonSkip_count.txt
awk '$3~/ExonS/{print $2}' ../D5/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>exonSkip_count.txt
awk '$3~/ExonS/{print $2}' ../A2/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>exonSkip_count.txt
AltA: alternative acceptor (3' splice site)
awk '$3~/AltA/{print $2}' ../TM-1/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >AltA_count.txt
awk '$3~/AltA/{print $2}' ../D5/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>AltA_count.txt
awk '$3~/AltA/{print $2}' ../A2/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>AltA_count.txt
AltD: alternative donor (5' splice site)
awk '$3~/AltD/{print $2}' ../TM-1/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >AltD_count.txt
awk '$3~/AltD/{print $2}' ../D5/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>AltD_count.txt
awk '$3~/AltD/{print $2}' ../A2/end_third |sort |uniq -c |awk '{print $2"\t"$1}' >>AltD_count.txt
Finding conserved splicing events by the length of the spliced-out fragment
First, tally the fragment lengths of the IR events and of the ExonS events that occur in homologous genes
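The notes do not show how the lengths were extracted, so the following is only a sketch. It assumes that `end_third` carries the start and end coordinates of each event in two columns whose positions are passed in by the caller; those column positions, the inclusive coordinate convention, and the function name `event_lengths` are all assumptions.

```shell
# event_lengths FILE PATTERN START_COL END_COL
# Hypothetical helper: print gene ID, event type and fragment length for
# every row whose event type (column 3) matches PATTERN.
event_lengths() {
    awk -v pat="$2" -v s="$3" -v e="$4" '
        $3 ~ pat {
            len = $e - $s + 1    # assumes inclusive, 1-based coordinates
            print $2 "\t" $3 "\t" len
        }' "$1"
}
```

If the coordinates happened to sit in columns 4 and 5, `event_lengths ../TM-1/end_third IntronR 4 5` would list one length per intron retention event, and the same call with `ExonS` would cover exon skipping.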
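The results are as follows:
Counting the gene transcripts whose splicing events are 0|0
Genes tagged 0|0, meaning that no splicing event occurred and that the gene has exactly one isoform, were used for GO enrichment analysis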
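As a rough illustration of this selection step, the sketch below assumes a hypothetical summary table with the gene ID in column 1, a column holding the `0|0` tag, and a column holding the isoform count. The table layout and the function name `no_event_genes` are assumptions; the original table is not shown in these notes.

```shell
# no_event_genes FILE TAG_COL ISOFORM_COL
# Hypothetical helper: list genes tagged "0|0" that have exactly one isoform.
no_event_genes() {
    awk -v t="$2" -v n="$3" '$t == "0|0" && $n == 1 {print $1}' "$1" | sort -u
}
```

Piping the result through `wc -l` gives the count, and the gene list itself could serve as the input set for a GO enrichment tool.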
Using a script to filter for the events that are conserved across all genomes
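The filtering script itself is not included in these notes. One possible sketch for a pair of genomes is given here, under stated assumptions: a hypothetical tab-separated table of one-to-one homolog pairs (gene in genome A, gene in genome B), and one table per genome with gene ID, event type and fragment length. It treats an event as conserved when both homologs carry the same event type with exactly the same length, which is an assumed criterion.

```shell
# conserved_events PAIRS LEN_A LEN_B
# Hypothetical helper: report homolog pairs sharing an event of the same
# type and the same fragment length in both genomes.
conserved_events() {
    awk -F'\t' '
        FILENAME == ARGV[1] { a_of[$2] = $1; next }         # B gene -> A homolog
        FILENAME == ARGV[2] { seen[$1, $2, $3] = 1; next }  # events in genome A
        ($1 in a_of) && ((a_of[$1], $2, $3) in seen) {      # events in genome B
            print a_of[$1] "\t" $1 "\t" $2 "\t" $3
        }' "$1" "$2" "$3"
}
```

Covering all genomes would mean running this for each pair and intersecting the results. A tolerance on the length, instead of strict equality, may suit real data better.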