# 20200102GO富集分析

对分好类的基因对进行GO功能富集分析

## 第一类conserve pattern

### 1.在各个基因组都存在相同的IR

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=863404143>

```bash
awk -F "_" '{print $1"_"$2}' ../converse/1_1 |sort|uniq|xargs  -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold |awk '{print $1"\n"$3}' |xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot >1_1.go
```

### 2.在各个基因组中都不存在IR的基因对

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=913752380>

```bash
awk '{print $1"\n"$3}' ../converse/1_2|sort|uniq|xargs  -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot  >1_2.go
```

### 3.在A基因组中存在保守的IR

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=596719406>

```bash
awk -F "_" '{print $1}' ../converse/1_3 |sort|uniq|xargs  -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold |awk '{print $1"\n"$3}' |xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot  >1_3.go
```

### 4.在D基因组中存在保守的IR

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=642301472>

```bash
awk -F "_" '{print $1"_"$2}' ../converse/1_4 |sort|uniq|xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}' | xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot >1_4.go
```

## 第二类IR在多倍化中发生丢失

### 1.只在Dt中发生IR丢失

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=819544063>

```bash
awk -F "_" '{print $1"_"$2}' ../converse/2_1 | sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}' | xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot > 2_1.go
```

### 2.只在At中发生IR丢失

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=472631356>

```bash
awk -F "_" '{print $1"_"$2}' ../converse/2_2 | sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}' | xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot > 2_2.go
```

### 3.在两个基因组中都发生了丢失

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=415512684>

```bash
## 将第四类的情况也统计到这里
 awk -F "_" '{print $1}' ../converse/2_3|sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}' | xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot >2_3.go

## 只在A2中发生IR的事件
cut -f5 ../converseBed/A2.bed | awk -F "_" '{print $1}'|sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}' | xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot >>2_3.go
## 只在D5中发生的IR事件
cut -f5 ../converseBed/D5.bed |awk -F "_" '{print $1}'|sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}' | xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot >>2_3.go

sort 2_3.go |uniq >1 
mv 1 2_3.go
```

## 第三类IR在多倍化中获得新的IR事件

### 1.只在At中发生IR获得

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=893002877>

```bash
cut -f5 ../converseBed/At.bed |awk -F "_" '{print $1"_"$2}' |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}' | xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot >3_1.go
```

### 2.只在Dt中发生IR获得

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=566100874>

```bash
cut -f5 ../converseBed/Dt.bed |awk -F "_" '{print $1"_"$2}' |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}' | xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot >3_3.go
```

### 3.在两个基因组中都发生了获得

> <http://bioinfo.cau.edu.cn/agriGO/analysis_precheck.php?session=205614710>

这里包括第四类中的前两类

```bash
cat ../converse/3_2  ../converse/4_1  ../converse/4_2 |awk -F "_" '{print $1"_"$2}'|xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold | awk '{print $1"\n"$3}'|sort |uniq |xargs -I {} grep {} ~/genome_data/Ghirsutum_genome_HAU_v1.1/Gh_Noscagenes_GO_V3.annot >3_2.go
```

## 查看不同类别之间共有的GO号

先对文件进行筛选

```bash
##根据FDR<=0.05 筛选GO结果
awk -F "\t" '$8<=0.01&&$9<=0.01{print $1,$2,$3,$8,$9}' OFS="\t"
```

#### partition splicing pattern

二倍体直系同源基因都存在多种剪切模式，在多倍化之后，分别只继承在单个亚基因组中，说明可变剪切在亚基因组上发生了partition

```bash
## 提取对应的基因
grep GO:0010027 raw/2_1.txt |awk -F "\t" '{print $10}'|sed -e 's/\/\//\n/g' -e 's/ //g'
```

## 2020-01-07

讨论多倍化过程中AS的变化

### 两个亚基因组出现同样AS

#### 同时存在AS

* 1\_1&#x20;
* 3\_2
* 4\_1
* 4\_2

```bash
awk -F "_" '{print $1"_"$2}' ../converse/1_1 |sort|uniq|xargs  -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold >geneid/1_1.txt

cat ../converse/3_2  ../converse/4_1  ../converse/4_2 |awk -F "_" '{print $1"_"$2}'|xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold |sort|uniq >geneid/3_2.txt 

 awk -F "_" '{print $1"_"$2}' ../converse/4_1 |sort|uniq|xargs  -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold >geneid/4_1.txt

 awk -F "_" '{print $1"_"$2}' ../converse/4_2 |sort|uniq|xargs  -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold >geneid/4_2.txt
```

#### 同时不存在AS

四个基因组都不存在AS的基因对先不分析

* 2\_3
* 4\_3
* 4\_4

```bash
 awk -F "_" '{print $1}' ../converse/2_3|sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/2_3.txt

 awk -F "_" '{print $1}' ../converse/4_3 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/4_3.txt

  awk -F "_" '{print $1}' ../converse/4_4 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/4_4.txt
```

### 两个亚基因组有不同的AS

#### 只在At中存在

* 1\_3
* 2\_1
* 3\_1
* 5\_2

```bash
  awk -F "_" '{print $1}' ../converse/1_3 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/1_3.txt

  awk -F "_" '{print $1"_"$2}' ../converse/2_1 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/2_1.txt

   awk -F "_" '{print $1"_"$2}'  ../converse/3_1 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/3_1.txt

     awk -F "_" '{print $1"_"$2}'  ../converse/5_2 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/5_2.txt
```

#### 只在Dt中存在

* 1\_4
* 2\_2
* 3\_3
* 5\_1

```bash
awk -F "_" '{print $1"_"$2}' ../converse/1_4 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/1_4.txt

awk -F "_" '{print $1"_"$2}' ../converse/2_2 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/2_2.txt

awk -F "_" '{print $1"_"$2}' ../converse/3_3 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/3_3.txt

awk -F "_" '{print $1"_"$2}' ../converse/5_1 |sort | uniq | xargs -I {} grep {} ../../../../GhDt_Gr_GhAt_Ga_end_noScaffold  >geneid/5_1.txt
```

### 分别进行GO富集分析

在At中有的可变剪切，而Dt中没有可变剪切；


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://zpliu.gitbook.io/booknote/ke-bian-jian-qie/shu-ju-chu-li-liu-cheng/20200102go-fu-ji-fen-xi.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
