Xue et al. 2024, Communications Medicine
Title: Unravelling the complex causal effects of substance use behaviours on common disease

1. Xue_et_al_ukbEUR_SI_common_2024.txt.gz: Summary statistics of Smoking Initiation (SI).


2. Xue_et_al_ukbEUR_FS_common_2024.txt.gz: Summary statistics of Former Smoking (FS).


3. Xue_et_al_ukbEUR_CS_common_2024.txt.gz: Summary statistics of Current Smoking (CS).


4. Xue_et_al_ukbEUR_SC_common_2024.txt.gz: Summary statistics of Smoking Cessation (SC).


5. Xue_et_al_ukbEUR_AC_common_2024.txt.gz: Summary statistics of Alcohol Consumption (AC).


6. Xue_et_al_ukbEUR_TI_common_2024.txt.gz: Summary statistics of Tea Intake (TI).


7. Xue_et_al_ukbEUR_CI_common_2024.txt.gz: Summary statistics of Coffee Intake (CI).


8. Xue et al MR_SUB Commun Med 2024.pdf: Description of the dataset.

Qi et al. 2022, Nature Genetics
Title: Genetic control of RNA splicing and its distinctive role in complex trait variation

1. BrainMeta sQTLs: BrainMeta sQTL summary statistics (2,865 samples from 2,443 individuals).


2. BrainMeta eQTLs: BrainMeta eQTL summary statistics (2,865 samples from 2,443 individuals).


3. Qi_et_al_SMR_COLOC.tar.gz: SMR and COLOC analyses summary statistics for 12 brain-related phenotypes.

Xue et al. 2021, Nature Communications
Title: Genome-wide analyses of behavioural traits are subject to bias by misreports and longitudinal changes

1. Xue et al AC MLC bias Nat Commun 2020.tar.gz: Summary statistics of genome-wide association of alcohol consumption.


2. Xue et al AC MLC bias Nat Commun 2020.pdf: Description of the dataset.


Adolphe et al. 2020, Genome Medicine
Title: Genetic and functional interaction network analysis reveals global enrichment of regulatory T Cell genes influencing basal cell carcinoma susceptibility

1. Adolphe_Xue_et_al_BCC_Genome_Med_2020.tar.gz: Summary statistics of basal cell carcinoma (BCC) GWAS.


2. Adolphe_Xue_et_al_BCC_Genome_Med_2020.pdf: Description of the dataset.


Xue et al. 2018, Nature Communications
Title: Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes

1. Xue_et_al_T2D_META_Nat_Commun_2018.gz: GWAS summary statistics of common variants.


2. Xue_et_al_T2D_META_Nat_Commun_2018.pdf: Description of the dataset.


3. Xue_et_al_T2D_META_rare_Nat_Commun_2018.gz: GWAS summary statistics of rare variants (added on 6 Feb 2022).

Zhu et al. 2018, Nature Communications
Title: Causal associations between risk factors and common diseases inferred from GWAS summary data

The summary-level GWAS data for 23 phenotypes are from GERA and UK Biobank. Each data set is available as a whitespace-separated table in GCTA-COJO format. Columns are SNP, effect allele, other allele, frequency of the effect allele, effect size, standard error, p-value and sample size.


1. GERA data: Details of quality controls of the genotyped and imputed data can be found in Zhu et al. (2018 Nat. Commun.). The individual-level ICD-9 codes were classified into 22 common diseases. We added an additional trait ‘Disease Count’ (a count of the number of diseases affecting each individual) as a crude measure of general health status of each individual.


2. UK Biobank data: Details of quality controls of the genotyped and imputed data can be found in Zhu et al. (2018 Nat. Commun.). Individual-level ICD-10 codes were available in the UKB data. To match the diseases in GERA, we classified the phenotypes into 22 common diseases by projecting the ICD-10 codes onto the ICD-9 classifications used in GERA, taking into account the self-reported disease status. Note that we did not perform the association analysis for dermatophytosis because the number of cases was too small. For insomnia, iron deficiency anemias, macular degeneration, peripheral vascular disease and acute reaction to stress, we performed the association analyses only on a subset of SNPs (those in common with the top associated SNPs for the risk factors).

Yang et al. 2015, Nature Genetics
Title: Estimation of genetic variance from imputed sequence variants reveals negligible missing heritability for human height and body mass index

1. LDSCORE_release_July2015.tar.gz: Per-SNP and per-segment LD scores calculated from 44,126 unrelated individuals and ~17M imputed variants. Columns are SNP, per-SNP LD score, and per-segment LD score.


2. GWAS_summary_release_July2015.tar.gz: GWAS summary data. Columns are SNP, the coded allele, effect size, and standard error.
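
The columns listed for the GWAS summary data include an effect size and a standard error but no p-value. If one is needed, a standard Wald test under a normal approximation can recover it. The sketch below is illustrative only: the function name is hypothetical, the input values are made up, and the normal approximation is an assumption, not a statement of how the original analysis computed its test statistics.

```python
import math
from typing import Tuple


def wald_test(beta: float, se: float) -> Tuple[float, float]:
    """Return (z, two-sided p) for an effect size and its standard error.

    Assumes the test statistic beta / se is approximately standard normal.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    # Two-sided tail probability of the standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Made-up values for illustration: z is about 1.96, so p is about 0.05.
z, p = wald_test(0.0196, 0.01)
print(f"z = {z:.3f}, p = {p:.4f}")
```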


