jinghuazhao/INF

SCALLOP-INF meta-analysis

A companion website for this paper [1]:

Zhao, J.H., Stacey, D., Eriksson, N., Macdonald-Dunlop, E., Hedman, Å.K., Kalnapenkis, A., Enroth, S., Cozzetto, D., Digby-Bell, J., Marten, J., Folkersen, L., Herder, C., Jonsson, L., Bergen, S.E., Gieger, C., Needham, E.J., Surendran, P., Metspalu, A., Milani, L., Mägi, R., Nelis, M., Hudjašov, G., Paul, D.S., Polasek, O., Thorand, B., Grallert, H., Roden, M., Võsa, U., Esko, T., Hayward, C., Johansson, Å., Gyllensten, U., Powell, N., Hansson, O., Mattsson-Carlgren, N., Joshi, P.K., Danesh, J., Padyukov, L., Klareskog, L., Landén, M., Wilson, J.F., Siegbahn, A., Wallentin, L., Mälarstig, A., Butterworth, A.S., Peters, J.E., and Estonian Biobank Research Team (2023). Genetics of circulating inflammatory proteins identifies drivers of immune-mediated disease risk and therapeutic targets. Nature Immunology, URL https://www.nature.com/articles/s41590-023-01588-w.

Quick links to the code for the figures:

| Name | Script |
|------|--------|
| Figure | |
| Figure 1 | circos2.R |
| Figure 2 | hotspot.sh, utils.sh, IL.12B.sh, TRAIL.sh |
| Figure 3 | IL.18-rs385076.sh |
| Figure 4 | |
| Figure 5 | gsmr.r |
| Figure 6 | utils.sh |
| Extended Data Figure | |
| Extended Data Figure 1 | |
| Extended Data Figure 2 | utils.sh, IL.17C.R |
| Extended Data Figure 3 | aristotle.sh |
| Extended Data Figure 4 | h2pve.R |
| Extended Data Figure 5 | rs12075.R |
| Extended Data Figure 6 | utils.sh |
| Extended Data Figure 7 | |
| Extended Data Figure 8 | pqtlGWAS.R |
| Extended Data Figure 9 | pqtlGWAS.R |
| Extended Data Figure 10 | utils.sh |
| Supplementary Figure | |
| Supplementary Figure 1 (450dpi) | qqmanhattanlz.sb, utils.sh |
| Supplementary Figure 2 | utils.sh |
| Supplementary Figure 3 | eQTLGen.sh |
| Supplementary Figure 4 | coloc-disease.sh |
| Supplementary item | js.R, merge.sh |
| Supplementary Tables | tables.R |

Flow of analysis

The diagram can also be rendered via Mermaid live editor.

graph TB;
  tryggve ==> cardio;
  cardio ==> csd3;
  csd3 --> csd3Analysis[Conditional analysis, finemapping, etc];
  csd3 --> software[R Packages at CRAN/GitHub]; 
  tryggveAnalysis[Meta analysis: list.sh, format.sh, metal.sh, QCGWAS.sh, analysis.sh] --> GWAS[pQTL selection and Characterization];
  GWAS --> Prototyping[Prototyping: INTERVAL.sh, cardio.sh, ...];
  Prototyping --> Multi-omics-analysis;
Comments

To view the code inside the browser, select the GitHub button from the menu.

The tryggve, cardio and csd3 directories are named after the Linux clusters used for the analysis over time. The early implementation involved the following steps:

  1. Data pre-processing on tryggve with list.sh and format.sh, followed by meta-analysis with METAL as specified in metal.sh; the results were cross-checked with QCGWAS.sh and by additional investigation.
  2. The main analysis with analysis.sh, which contains code for Manhattan/Q-Q/forest/LocusZoom plots, clumping with PLINK and conditional analysis with GCTA. The clumping results were classified into cis and trans signals. As the meta-analysis stabilised, especially with the INTERVAL reference, the analysis was carried out intensively and locally on cardio and csd3. The cis/trans classification was done with cis.vs.trans.classification.R and validated by cistrans.sh.
  3. Prototyping analysis on cardio with INTERVAL data, such as INTERVAL.sh and cardio.sh, as well as individual-level data analysis for the KORA study. Most analyses were done locally on CSD3.
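The meta-analysis in step 1 can be illustrated with a minimal sketch. METAL offers both a sample-size-based and an inverse-variance-based scheme; the sketch below assumes the inverse-variance (fixed-effect) scheme, which may or may not be the one configured in metal.sh. It is written in Python for illustration only and is not the repository's own code; the function name `ivw_meta` and its inputs are hypothetical.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis for one variant.

    betas: per-study effect estimates (same effect allele assumed throughout)
    ses:   per-study standard errors
    Returns (pooled beta, pooled SE, z-score, two-sided p-value).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


if __name__ == "__main__":
    # Two hypothetical cohorts reporting the same variant.
    print(ivw_meta([0.2, 0.4], [0.1, 0.1]))
```

For very strong associations the p-value computed this way underflows to zero in double precision, which is one reason plots usually work with a capped -log10(P).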

The most recent implementations are documented in the Supplementary notes (rsid). Over time, many functions have become part of two R packages, gap (CRAN, GitHub, vignette) and pQTLtools (web page).

A benchmark

As revealed by PhenoScanner [2], osteoprotegerin (OPG) serves as a positive control [3], with both a cis-pQTL and a trans-pQTL, the cis-pQTL showing the stronger association; see OPG.pdf, a stacked image containing forest + LocusZoom and Manhattan + Q-Q plots.
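The cis/trans distinction used for OPG and throughout the analysis can be sketched as a simple positional rule. This is an illustrative Python sketch, not a port of cis.vs.trans.classification.R: the 1 Mb flanking window is an assumption, the function name `classify_pqtl` is hypothetical, and the coordinates in the example are made up rather than real gene positions.

```python
def classify_pqtl(snp_chr, snp_pos, gene_chr, gene_start, gene_end, flank=1_000_000):
    """Label a pQTL as 'cis' or 'trans' relative to the protein-encoding gene.

    A variant is 'cis' if it lies on the same chromosome as the gene and
    within `flank` base pairs of the gene boundaries (assumed 1 Mb here);
    otherwise it is 'trans'.
    """
    if str(snp_chr) != str(gene_chr):
        return "trans"
    if gene_start - flank <= snp_pos <= gene_end + flank:
        return "cis"
    return "trans"


if __name__ == "__main__":
    # Hypothetical gene on chromosome 8 spanning 2,000,000-2,050,000.
    print(classify_pqtl("8", 1_500_000, "8", 2_000_000, 2_050_000))   # same chromosome, inside window
    print(classify_pqtl("17", 1_500_000, "8", 2_000_000, 2_050_000))  # different chromosome
```

Comparing chromosomes as strings avoids mismatches between integer and character codings such as 8 and "8", though a real pipeline would also need to harmonise labels like "chr8".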

Summary statistics

They will be available from

The diagram is based on circos, in the directory of the same name, and highlights pQTLs [causal genes]; the significance of each association is shown in the inner scatter plot, whose ceiling for -log10(P) is set to 150.
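The ceiling on -log10(P) amounts to truncating the plotted values, which can be sketched as follows. This is a hedged Python illustration only; the function name `capped_neg_log10_p` is hypothetical, and treating a p-value of exactly zero (numerical underflow) as sitting at the ceiling is an assumption rather than something stated for the circos script.

```python
import math


def capped_neg_log10_p(p, ceiling=150.0):
    """Return -log10(p), truncated at `ceiling` for plotting."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0:
        # Underflowed p-values are placed at the ceiling (assumption).
        return ceiling
    return min(-math.log10(p), ceiling)


if __name__ == "__main__":
    for p in (5e-8, 1e-200, 0.0):
        print(p, capped_neg_log10_p(p))
```

Capping keeps a handful of extremely strong signals from compressing the scale for all other associations in the scatter plot.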

Footnotes

  1. The SCALLOP consortium. Jing Hua Zhao, David Stacey, Niclas Eriksson, Erin Macdonald-Dunlop, Asa H Hedman, Anette Kalnapenkis, Stefan Enroth, Domenico Cozzetto, Jonathan Digby-Bell, Jonathan Marten, Lasse Folkersen, Christian Herder, Lina Jonsson, Sarah E. Bergen, Christian Gieger, Elise J Needham, Praveen Surendran, Estonia Biobank Research Team, Dirk S Paul, Ozren Polasek, Barbara Thorand, Harald Grallert, Michael Roden, Urmo Vosa, Tonu Esko, Caroline Hayward, Asa Johansson, Ulf Gyllensten, Nicholas Powell, Oskar Hansson, Niklas Mattsson-Carlgren, Peter K Joshi, John Danesh, Leonid Padyukov, Lars Klareskog, Mikael Landen, James F Wilson, Agneta Siegbahn, Lars Wallentin, Anders Malarstig, Adam S Butterworth, James E. Peters. Mapping pQTLs of circulating inflammatory proteins identifies drivers of immune-related disease risk and novel therapeutic targets. medRxiv 2023.03.24.23287680; doi: https://doi.org/10.1101/2023.03.24.23287680.

  2. Kamat, M.A. et al. PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations. Bioinformatics 35, 4851-4853 (2019).

  3. Kwan, J.S. et al. Meta-analysis of genome-wide association studies identifies two loci associated with circulating osteoprotegerin levels. Hum Mol Genet 23, 6684-6693 (2014).