数据格式转换 (整理中)

Xing Abao Lv3

通过数据格式转换,转换为标准的 TwoSampleMR、SMR、METAL 或 MTAG 格式数据文件,或由这四种格式再转换获得的.sumstats.gz格式数据或其他格式数据。

通用 GWAS 数据清洗与格式转换

这部分适用于所有 GWAS 数据格式转换,类似于 TwoSampleMR 中的format_data()函数。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# format.data.v01.R
sum.dat = format_dat(
MTAG_dat = TRUE,
dat = "E:/QTLMR/GWAS/GCST90203354.h.tsv.gz",
snp_col = "rsid",
chr_col = "chromosome",
pos_col = "base_pair_location",
effect_allele_col = "effect_allele",
other_allele_col = "other_allele",
beta_col = "beta",
se_col = "standard_error",
eaf_col = "effect_allele_frequency",
pval_col = "p_value",
GWAS_name = 'GCST90203354',
save_path = 'E:/QTLMR/GWAS/MTAG/'
)

FinnGen 数据格式转换

芬兰 R7-R12 汇总信息数据:data/finngen_dat.xlsx

1
2
3
4
5
6
7
8
9
head(finngen_dat)
# phenocode name category num_cases num_controls path_https Version filename
# <char> <char> <char> <num> <num> <char> <char> <char>
# 1: HEIGHT_IRN Height, inverse-rank normalized Quantitative endpoints 292707 0 https://storage.googleapis.com/finngen-public-data-r10/summary_stats/finngen_R10_HEIGHT_IRN.gz R10 finngen_R10_HEIGHT_IRN.gz
# 2: WEIGHT_IRN Weight, inverse-rank normalized Quantitative endpoints 297440 0 https://storage.googleapis.com/finngen-public-data-r10/summary_stats/finngen_R10_WEIGHT_IRN.gz R10 finngen_R10_WEIGHT_IRN.gz
# 3: BMI_IRN Body-mass index, inverse-rank normalized Quantitative endpoints 290820 0 https://storage.googleapis.com/finngen-public-data-r10/summary_stats/finngen_R10_BMI_IRN.gz R10 finngen_R10_BMI_IRN.gz
# 4: I9_HYPTENS Hypertension IX Diseases of the circulatory system (I9_) 122996 289117 https://storage.googleapis.com/finngen-public-data-r10/summary_stats/finngen_R10_I9_HYPTENS.gz R10 finngen_R10_I9_HYPTENS.gz
# 5: T2D Type 2 diabetes, definitions combined Diabetes endpoints 65085 335112 https://storage.googleapis.com/finngen-public-data-r10/summary_stats/finngen_R10_T2D.gz R10 finngen_R10_T2D.gz
# 6: RX_ANTIHYP Antihypertensive medication - note that there are other indications Drug purchase endpoints 222561 189620 https://storage.googleapis.com/finngen-public-data-r10/summary_stats/finngen_R10_RX_ANTIHYP.gz R10 finngen_R10_RX_ANTIHYP.gz

可以先查询finngen_dat获取对应表型数据,然后使用浏览器下载数据,再进行数据转换。

注意,该数据是基于人类基因组版本:GRCh38 。

  • Title: 数据格式转换 (整理中)
  • Author: Xing Abao
  • Created at : 2025-04-30 20:52:22
  • Updated at : 2025-05-04 12:47:38
  • Link: https://bioinformatics.vip/2025/04/30/post-GWAS/数据格式转换/
  • License: This work is licensed under CC BY-NC-SA 4.0.
Comments