# 取数据

## ipython 查看加载变量

```bash
%who_ls
```

## 获取数据框中数据

当使用`loc`获得一个数据框的时候，要想遍历这个数据集中的内容

* `iterrows`按照行获取数据，每次获得一行数据

> `iterrows`是一个生成器，每个返回的生成器中包含两个内容：
>
> * 改行的索引
> * 改行的数据list，并且list的下标是从0开始的

```python
>>>df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])
>>>row = next(df.iterrows())[1]
>>>row
int      1.0
float    1.5
Name: 0, dtype: float64
>>>print(row['int'].dtype)
float64
>>>print(df['int'].dtype)
int64
```

### 遍历一行中的列

* 使用索引进行遍历`iloc[:0]`
* 使用列名进行遍历`loc['chr']`

```python
##获取pandas中的某一行
df=d.loc[(d['chr']=="Ghir_A01") & (d['start']==8076) & (d['end']==9496)]
#获取该行对应的第3列及以后列的数据
df.iloc[:,3:]
df.loc[['列名3','列名4'...]]
```

### 使用映射函数，批量处理

对数据库进行批量处理

* map
* apply

```python
##使用apply对数据框进行批量处理
#没列数加上10
df.apply(lambda x:x+10,axis=1)
```

### 多条件筛选

筛选符合多个条件的行

> 筛选`chr`列为"Ghir\_A01"同时"start"列为8076的行

```python
>>>d.loc[(d['chr']=="Ghir_A01") & (d['start']==8076)]
    chr  start    end  0DPA_Sample001
0   Ghir_A01   8076   9496               0
```

### 获取多行多列

```python
df.loc['one':'two',['a','c']]#one到two行，ac列
```

### 去除行名中的重复

```bash
## 行名重复 index
def filterData(myDF):
    ...:     myDF['index'] = myDF.index
    ...:     myDF= myDF.drop_duplicates('index')
    ...:     myDF.set_index = myDF['index']
    ...:     myDF= myDF.drop('index', axis =1)
    ...:     return myDF
```

### 合并两个数据框

```bash
#横向连接两个数据框，按行合并
pd.concat([data1,data2],axis=0)
#纵向连接两个数据框,按列合并
pd.concat([data1,data2],axis=1)
#merge 合并
```

### 修改行名与列名

```bash
#修改列名
mergeData.columns=['HuanGangRep1', 'HuanGangRep2', 'HuanGangRep3', 'mean']
#修改行名
mergeData.index=['HuanGangRep1', 'HuanGangRep2', 'HuanGangRep3', 'mean']
```

### 调整列的顺序

```bash
```

### 按照指定列进行排序

```bash
 #! 按照第18列进行降序排序后，提取对应的p-value
pt=tmpData.sort_values(by=[18],axis=0,ascending=False).iloc[0,17]
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://zpliu.gitbook.io/booknote/python/pandas/qu-shu-ju.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
