This is a simple pipeline for analysing Illumina array data, using the lumi [1] and limma [2] packages from Bioconductor. To use it, edit the indicated variables at the top of the lumi.mk makefile and run it with 'make -f lumi.mk'. A directory called 'pipeline' will be created to hold the outputs.
This is not a bead-level analysis; it assumes you have the following:
- Sample probe profile, control probe profile and samples table, all usually exported from BeadStudio.
- An Illumina annotation file, e.g. "HumanHT-12_V4_0_R2_15002873_B.txt"
- A tab-delimited experiment file describing your samples, with rows matching the columns of the input data.
- A tab-delimited file defining contrasts.
- A set of .gmt format gene set files for differential gene set analysis.
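Before running the gene set step it can help to confirm that each .gmt file parses as expected. The sketch below assumes the usual GMT layout (one gene set per line: set name, description, then gene identifiers, all tab-separated). The file contents and the `gmt_summary` helper are hypothetical and are not part of the pipeline.

```shell
# Hypothetical example file; substitute one of your own .gmt files.
gmt_file=$(mktemp)
printf 'SET_A\tfirst hypothetical set\tTP53\tBRCA1\tEGFR\n' >  "$gmt_file"
printf 'SET_B\tsecond hypothetical set\tMYC\tKRAS\n'        >> "$gmt_file"

# Print each gene set name with its number of genes.
# Fields 1 and 2 are the name and description, so the gene count is NF - 2.
gmt_summary() {
    awk -F'\t' '{ print $1 "\t" (NF - 2) }' "$1"
}

gmt_summary "$gmt_file"
```

A set reported with zero or very few genes usually points to a file that is space-delimited rather than tab-delimited.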
The experiment file is tab-delimited without a column name for sample IDs, like:
age gender
Sample1 25 M
Sample2 30 F
Sample3 22 F
Sample4 12 M
Sample5 50 M
Sample6 70 F
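Because the header row has no entry for the sample ID column, it should contain exactly one field fewer than each data row. As a minimal sketch, the example above can be written with real tab characters and checked like this. The file location is hypothetical; the real path is whatever you set in lumi.mk.

```shell
# Hypothetical location for the experiment file.
workdir=$(mktemp -d)
exp_file="$workdir/experiment.tsv"

# printf expands \t to a literal tab, so the columns are truly tab-delimited.
# The header names only the variables, not the sample ID column.
printf 'age\tgender\n' > "$exp_file"
# printf reuses the format string for each group of three arguments.
printf '%s\t%s\t%s\n' \
    Sample1 25 M \
    Sample2 30 F \
    Sample3 22 F \
    Sample4 12 M \
    Sample5 50 M \
    Sample6 70 F >> "$exp_file"

# Collect the line numbers of data rows whose field count is not header + 1.
bad_rows=$(awk -F'\t' 'NR == 1 { n = NF; next } NF != n + 1 { print NR }' "$exp_file")

if [ -z "$bad_rows" ]; then
    echo "experiment file looks consistent"
else
    echo "unexpected field count on line(s): $bad_rows" >&2
fi
```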
The contrasts file defines contrasts in terms of the variables found in the experiment, like:
variable group1 group2
gender F M
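A contrast is only usable if its variable is a column of the experiment file and both groups occur in that column. The following sketch checks this, assuming group1 and group2 are values of the named variable. The `check_contrasts` helper and the file names are hypothetical and are not part of the pipeline.

```shell
# Hypothetical working copies of the two input files.
workdir=$(mktemp -d)
exp_file="$workdir/experiment.tsv"
con_file="$workdir/contrasts.tsv"

printf 'age\tgender\nSample1\t25\tM\nSample2\t30\tF\nSample3\t22\tF\n' > "$exp_file"
printf 'variable\tgroup1\tgroup2\ngender\tF\tM\n' > "$con_file"

# Usage: check_contrasts EXPERIMENT_FILE CONTRASTS_FILE
# Prints one line per problem found; prints nothing if every contrast is valid.
check_contrasts() {
    awk -F'\t' '
        NR == FNR {
            if (FNR == 1) {
                # Header lacks the sample ID column, so data columns are shifted by one.
                for (i = 1; i <= NF; i++) col[$i] = i + 1
            } else {
                # Record every value observed for every variable.
                for (v in col) seen[v, $(col[v])] = 1
            }
            next
        }
        FNR == 1 { next }   # skip the contrasts header
        !($1 in col)        { print "unknown variable: " $1; next }
        !(($1, $2) in seen) { print "unknown level: " $1 "=" $2 }
        !(($1, $3) in seen) { print "unknown level: " $1 "=" $3 }
    ' "$1" "$2"
}

problems=$(check_contrasts "$exp_file" "$con_file")
if [ -z "$problems" ]; then
    echo "all contrasts match the experiment file"
else
    echo "$problems" >&2
fi
```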
lumi.mk is a makefile which can be used to run this pipeline. It is executed like:
make -f lumi.mk <TARGET>
Where <TARGET> represents a given target in the makefile.
You can see what the makefile will do before actually running it with:
make -n -f lumi.mk <TARGET>
Makefile targets are:
Run all of the following. The default.
Split the Illumina annotation file into main and control probes.
Run readIllumina.R to make a valid LumiBatch object from the inputs (serialised to .RDS).
Run lumiExpresso.R to call lumiExpresso(), which performs background correction, normalisation and variance stabilisation (look for the LUMI parameters in the makefile to tweak options).
Use extractMatrix.R to derive CSV-formatted matrices we can use later for exploratory purposes.
Run arrayLimma.R to look at the specified contrasts using limma and produce matrices of uncorrected and corrected p values.
Employ limma's mroast() method to perform differential gene set analysis.
Using makeShiny.R, take the text-format outputs and make a data structure for use with shinyngs. This will be serialised to data.rds, and can then be loaded for visualisation:
library(shinyngs)  # provides prepareApp()
eselist <- readRDS('data.rds')
app <- prepareApp('illuminaaarray', eselist)
shiny::shinyApp(ui = app$ui, server = app$server)
- [1] Du P, Kibbe WA and Lin SM (2008). “lumi: a pipeline for processing Illumina microarray.” Bioinformatics.
Du P, Zhang X, Huang CC, Jafari N, Kibbe WA, Hou L and Lin SM (2010). “Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis.” BMC Bioinformatics.
Lin SM, Du P and Kibbe WA (2008). “Model-based Variance-stabilizing Transformation for Illumina Microarray Data.” Nucleic Acids Res.
Du P, Kibbe WA and Lin SM (2007). “nuID: A universal naming schema of oligonucleotides for Illumina, Affymetrix, and other microarrays.” Biology Direct.
- [2] Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W and Smyth GK (2015). “limma powers differential expression analyses for RNA-sequencing and microarray studies.” Nucleic Acids Research, 43(7), pp. e47.
- [3] Huber W, Carey VJ, Gentleman R, Anders S, Carlson M, Carvalho BS, Bravo HC, Davis S, Gatto L, Girke T, Gottardo R, Hahne F, Hansen KD, Irizarry RA, Lawrence M, Love MI, MacDonald J, Obenchain V, Oleś AK, Pagès H, Reyes A, Shannon P, Smyth GK, Tenenbaum D, Waldron L and Morgan M (2015). “Orchestrating high-throughput genomic analysis with Bioconductor.” Nature Methods, 12(2), pp. 115–121. http://www.nature.com/nmeth/journal/v12/n2/full/nmeth.3252.html