Splice junctions (SJs) were characterized separately in A2, D5, and TM-1. In A2, we identified 144558 SJs from 21187 gene models defined by Iso-Seq data, compared with 165885 SJs in the reference annotation; 37884 of these SJs were novel. In D5, we identified 127760 SJs from 19080 gene models defined by Iso-Seq data, compared with 175343 SJs in the reference annotation; 19336 of these SJs were novel. In TM-1, we identified 226147 SJs from 32311 gene models defined by Iso-Seq data, compared with 332822 SJs in the reference annotation; 42138 of these SJs were novel. The terminal dinucleotide signature was investigated for all SJs defined by Iso-Seq data. The GT-AG type accounted for the dominant proportion of intron borders (91.0% in A2, 92.3% in D5, and 95.9% in TM-1), which is consistent with previous studies in other species.
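The two operations described above, comparing Iso-Seq SJs against the reference annotation and reading the terminal dinucleotides of each intron, can be sketched as follows. This is a minimal illustration rather than the script used in the analysis: the toy genome, the coordinates, and the function names (intron_signature, reverse_complement) are assumptions made for the example, and SJs are represented as (chromosome, intron start, intron end, strand) tuples with 1-based inclusive coordinates.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def intron_signature(genome, chrom, start, end, strand):
    """Return the donor-acceptor dinucleotides of an intron, e.g. 'GT-AG'.

    start and end are 1-based inclusive intron coordinates (an assumption
    of this sketch). Minus-strand introns are read on the transcript strand.
    """
    intron = genome[chrom][start - 1:end].upper()
    if strand == "-":
        intron = reverse_complement(intron)
    return f"{intron[:2]}-{intron[-2:]}"


# Toy genome (hypothetical): one plus-strand and one minus-strand intron.
GENOME = {"chr1": "AAAGTCCCCAGTTTCTAAAAACGG"}

iso_seq_sjs = {("chr1", 4, 11, "+"), ("chr1", 15, 22, "-")}
reference_sjs = {("chr1", 4, 11, "+")}

# Novel SJs are those present in the Iso-Seq set but not in the reference.
novel_sjs = iso_seq_sjs - reference_sjs

# Tally the terminal dinucleotide signature over all Iso-Seq SJs.
signature_counts = Counter(intron_signature(GENOME, *sj) for sj in iso_seq_sjs)
```

Representing each SJ as a hashable tuple keeps the novelty test to a single set difference, and the same tuples can be passed directly to the signature function.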
We merged the Iso-Seq annotation with the reference annotation and used a custom Python script to identify alternative splicing (AS) events.
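The custom script itself is not shown here. As a rough illustration of the kind of comparison such a script might perform, the sketch below detects two AS event types (intron retention and exon skipping) between a pair of isoforms of the same gene. The function names, the exon representation as (start, end) tuples with 1-based inclusive coordinates, and the restriction to two event types are all assumptions of this example.

```python
def introns(exons):
    """Return the introns implied by a list of (start, end) exons."""
    exons = sorted(exons)
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


def retained_introns(iso_a, iso_b):
    """Introns of iso_a that lie entirely within a single exon of iso_b."""
    return [(s, e) for s, e in introns(iso_a)
            if any(xs <= s and e <= xe for xs, xe in iso_b)]


def skipped_exons(iso_a, iso_b):
    """Internal exons of iso_a that lie entirely within an intron of iso_b."""
    internal = sorted(iso_a)[1:-1]
    return [(s, e) for s, e in internal
            if any(i_s <= s and e <= i_e for i_s, i_e in introns(iso_b))]


# Hypothetical isoforms of one gene.
full = [(100, 200), (300, 400), (500, 600)]
skipping = [(100, 200), (500, 600)]       # middle exon skipped
retaining = [(100, 400), (500, 600)]      # first intron retained

skipped = skipped_exons(full, skipping)
retained = retained_introns(full, retaining)
```

A complete classifier would also need to handle alternative donor and acceptor sites and to compare all isoform pairs within each gene of the merged annotation.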
At the isoform level, we identified 83881 new full-length transcripts from the 32182 reference gene models in TM-1, 69701 from the 21038 reference gene models in A2, and 53393 from the 19001 reference gene models in D5. In the original annotation, 58.4% and 37.4% of genes were defined by a single transcript isoform in TM-1 and D5, respectively. After analysis of the Iso-Seq data, only 28.1% and 18.4% of genes, respectively, were defined by a single transcript.
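The single-isoform percentages above are fractions of genes with exactly one annotated transcript. A minimal sketch of that calculation, assuming a transcript-to-gene mapping as input (the mapping and the identifiers below are hypothetical):

```python
from collections import Counter


def single_isoform_fraction(transcript_to_gene):
    """Fraction of genes that are represented by exactly one transcript."""
    transcripts_per_gene = Counter(transcript_to_gene.values())
    single = sum(1 for n in transcripts_per_gene.values() if n == 1)
    return single / len(transcripts_per_gene)


# Hypothetical reference annotation: g2 and g3 have a single isoform.
reference = {"t1": "g1", "t2": "g1", "t3": "g2", "t4": "g3"}

# After merging in a new full-length transcript for g2, only g3 remains
# a single-isoform gene.
merged = {**reference, "t5": "g2"}

before = single_isoform_fraction(reference)
after = single_isoform_fraction(merged)
```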
Differences in polyadenylation sites
For the first exon, only the donor site is described, because its first position is defined as the transcription start site. Likewise, the last exon does not contain a donor splice site, because its last position is defined as the polyadenylation site.
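The convention above can be made concrete with a short sketch that derives the transcription start site, the polyadenylation site, and the donor and acceptor positions from an exon list. The function name, the returned dictionary keys, and the choice to report a donor as the last exon base and an acceptor as the first exon base are assumptions of this example.

```python
def transcript_landmarks(exons, strand="+"):
    """Derive TSS, poly(A) site, and splice sites from (start, end) exons.

    Coordinates are 1-based inclusive genomic positions. Exons are put in
    transcript order, so minus-strand transcripts are read from high to low.
    """
    ordered = sorted(exons, reverse=(strand == "-"))
    # (5' end, 3' end) of each exon in transcript orientation.
    ends = [(s, e) if strand == "+" else (e, s) for s, e in ordered]
    return {
        "tss": ends[0][0],                           # first position of first exon
        "polya_site": ends[-1][1],                   # last position of last exon
        "donors": [three for _, three in ends[:-1]],   # none for the last exon
        "acceptors": [five for five, _ in ends[1:]],   # none for the first exon
    }


exons = [(100, 200), (300, 400), (500, 600)]
plus = transcript_landmarks(exons, "+")
minus = transcript_landmarks(exons, "-")
```

Grouping transcripts of one gene by their polya_site value would then give the set of distinct polyadenylation sites per gene.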
Polyadenylation
Motif search
Minimum width 5, maximum width 20
Find 15 motifs
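The notes above give the search parameters but do not name the motif-finding tool. Assuming the MEME command-line program is used (an assumption, as are the file and directory names), the settings map onto its -minw, -maxw, and -nmotifs options. The helper below only builds the argument list and does not run anything:

```python
def build_motif_search_command(fasta_path, out_dir,
                               min_width=5, max_width=20, n_motifs=15):
    """Build an argument list for a MEME run with the stated settings.

    MEME is an assumed tool choice; the notes do not name the program.
    """
    return [
        "meme", fasta_path,
        "-dna",                      # nucleotide alphabet
        "-oc", out_dir,              # output directory, overwritten if present
        "-nmotifs", str(n_motifs),   # number of motifs to find
        "-minw", str(min_width),     # minimum motif width
        "-maxw", str(max_width),     # maximum motif width
    ]


# Hypothetical input: sequences flanking the polyadenylation sites.
command = build_motif_search_command("polya_flanks.fa", "meme_out")
```

The resulting list can be passed to subprocess.run(command, check=True) on a system where MEME is installed.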
Poly(A) statistics

Cotton species   Number of genes   Number of transcripts   Number of genes with more than 5   Per gene
TM-1             32182             78631                   2394 (At 1172 / Dt 1222)           2.44
A2               21038             64620                   2786                               3.07
D5               19001             50312                   1728                               2.65
Divergent structure of splicing isoforms in the Gossypium lineage