Analyzing Illumina Infinium Methylation array data
Last updated
This guide provides instructions for analyzing Illumina Infinium Methylation array data.
This tutorial uses the Infinium Methylation Screening Array Demo Data Set; use the same data set to follow along exactly with the analyses pipeline.
If you are new to Partek Flow, please see the Quick start guide for information about the Partek Flow user interface.
We recommend uploading the microarray data to a folder on your Partek Flow server before importing into a project. Data files can be transferred to your server from the Home page by clicking the Transfer file button. To change the Upload directory, click the Browse button and either select an existing directory or create a new one. Please click here for more information on transferring files to the server.
To create a new project on the Home page click the +Add data button, enter a project name, and click Create project.
Select Microarray, Methylation and Illumina methylation idat as the file format for import then click Next.
Navigate to the idat files that have been uploaded to the server. For this tutorial, there are two paired idat files per sample.
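Because each sample needs both of its paired idat files, it can help to check the upload folder before importing. The sketch below is a minimal example, assuming the common Illumina naming convention in which the two files of a pair end in `_Grn.idat` and `_Red.idat`; the file names and the `find_incomplete_pairs` helper are hypothetical and are not part of Partek Flow.

```python
from collections import defaultdict


def find_incomplete_pairs(filenames):
    """Return the sample prefixes that are missing one of the two idat files."""
    channels = defaultdict(set)
    for name in filenames:
        # Assumed convention: <prefix>_Grn.idat and <prefix>_Red.idat
        for suffix, channel in (("_Grn.idat", "Grn"), ("_Red.idat", "Red")):
            if name.endswith(suffix):
                channels[name[: -len(suffix)]].add(channel)
    return sorted(
        prefix for prefix, found in channels.items() if found != {"Grn", "Red"}
    )


# Hypothetical file names, for illustration only.
files = [
    "SampleA_R01C01_Grn.idat",
    "SampleA_R01C01_Red.idat",
    "SampleB_R02C01_Grn.idat",
]
incomplete = find_incomplete_pairs(files)
print(incomplete)
```

Any prefix reported here would need its missing file transferred before the import.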
If you have not already transferred the files to the server you can choose to do this within the import task by clicking the Transfer files to the server button.
This will bring you to the Transfer files page. Click the Transfer files button, add the files for transfer then click Upload. Do not terminate the browser or let your computer go to sleep during transfer. A time estimate for upload is provided but may change.
When the transfer completes, the upload window will close. In the Transfer files page, the Status of the transferred files is Complete.
Now, the selected files are on the server in the folder specified during transfer.
In the project import task, navigate to the files on your server. In the example below, only these files were saved to this folder, so check the top box to select all, then click Finish.
This starts importing the selected data into the project. The transparent task bar fills in as the import progresses.
When the import completes, the Microarray methylation data node appears in the Analyses tab. Hovering over this node shows that it contains 60 samples (572.28 MB of data).
Add sample metadata to the project by navigating to the Metadata tab. Select the Assign values from file button as an efficient way to assign sample attributes using a tab delimited text file.
If the samples metadata file is not already on the server, click Transfer files to the server.
Select the file then click Next.
The tab delimited file should contain a table with the following:
The first column of the table lists the sample names (the sample names in the file must be identical to the ones listed in the project Sample name column in the Metadata tab)
The first row lists the attribute names (e.g. Treatment, Exposure)
List any corresponding attributes for each sample in succeeding columns
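The layout described above can be sketched in a few lines of Python. This is only an illustration: the sample names, the attribute values, and the `Sample name` label in the first header cell are assumptions, and both helper functions are hypothetical rather than part of Partek Flow.

```python
import csv
import io


def build_metadata_tsv(attribute_names, rows):
    """Return tab-delimited text: attribute names in the first row,
    sample names in the first column, attribute values after them."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sample name", *attribute_names])  # header label is assumed
    for sample_name, values in rows.items():
        writer.writerow([sample_name, *values])
    return buffer.getvalue()


def missing_samples(tsv_text, project_sample_names):
    """Return project sample names that do not appear in the file's first column."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader)  # skip the header row
    listed = {row[0] for row in reader if row}
    return sorted(set(project_sample_names) - listed)


# Hypothetical samples and attribute values, for illustration only.
tsv = build_metadata_tsv(
    ["Treatment", "Exposure"],
    {"S1": ["Control", "Low"], "S2": ["Treated", "High"]},
)
print(tsv)
print(missing_samples(tsv, ["S1", "S2", "S3"]))
```

The second helper reflects the requirement that sample names in the file match the project exactly; any name it reports would not receive attributes.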
Make any desired modifications and click Import.
This adds the defined attribute information to the Metadata tab. Manage and Assign values buttons can be used to further modify sample attributes.
Click the left Analyses tab to navigate back to the analyses pipeline.
Single-click the Microarray methylation data node to run the first task using the context sensitive menu on the right. No tasks have been performed on this data so there is still an option to Add data to the project; once analysis tasks are performed from this data node, data can no longer be added to the project.
Using the task menu on the right under Methylation analysis select the Generate beta value task.
Choose Infinium Methylation Screening Array as the Chip name. If the Chip name is not listed, use the dropdown to select New Chip, then add the Illumina manifest file.
Keep the default settings and click Finish.
This task output is the Methylation beta data node. Methylation Beta-values are continuous variables between 0 and 1.
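For orientation, a Beta-value is conventionally defined as the methylated signal divided by the total signal plus a small stabilizing offset. The sketch below uses that conventional definition with an offset of 100; whether the Generate beta value task uses exactly this formula or offset is an assumption, and the `beta_value` function is a hypothetical helper.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Conventional Beta-value: M / (M + U + offset).

    Negative intensities are floored at zero so the result stays in [0, 1).
    The offset of 100 is an assumed, commonly quoted default.
    """
    meth = max(methylated, 0.0)
    unmeth = max(unmethylated, 0.0)
    return meth / (meth + unmeth + offset)


# Hypothetical probe intensities, for illustration only.
print(beta_value(4500.0, 500.0))
print(beta_value(200.0, 5000.0))
```

The offset keeps the ratio stable when both intensities are low, which is why the result never quite reaches 1.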
Single-click the Methylation beta data node and choose the PCA task under Exploratory analysis in the task menu. Keep the default settings and click Finish.
In the Analyses tab, double-click the PCA data node (circle), or single-click the data node and select Task report under Task results in the task menu, to view the PCA results in the Data Viewer.
Single-click the Methylation beta data node and perform the Detect differential methylation task under Methylation analysis in the task menu.
This task converts the Beta-values to M-values and uses these to perform ANOVA differential methylation analysis. Please click here for more information on the ANOVA model.
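The Beta-to-M conversion is usually written as the logit transform M = log2(Beta / (1 - Beta)). A minimal sketch follows; the clipping constant `epsilon` is an assumption added to avoid infinite values at exactly 0 or 1, and how Partek Flow handles those edge cases is not stated here.

```python
import math


def beta_to_m(beta, epsilon=1e-6):
    """Convert a Beta-value to an M-value: log2(beta / (1 - beta)).

    Beta is clipped to [epsilon, 1 - epsilon] (an assumed safeguard)
    so that values of exactly 0 or 1 do not produce infinities.
    """
    clipped = min(max(beta, epsilon), 1.0 - epsilon)
    return math.log2(clipped / (1.0 - clipped))


for beta in (0.1, 0.5, 0.8):
    print(beta, beta_to_m(beta))
```

A Beta-value of 0.5 maps to an M-value of 0, and values near 0 or 1 are stretched out, which is the usual motivation for testing on the M-value scale.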
Follow along with the task to make one-way or two-way ANOVA comparisons. The configured ANOVA model is performed on both Beta-value and M-value matrices. Click Finish.
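To make the one-way case concrete, the sketch below computes the textbook one-way ANOVA F statistic for a single probe from per-group values. It illustrates the general technique only; the exact model fitted by Partek Flow is not described here, and the `one_way_anova_f` function and the example values are hypothetical.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    all_values = [value for group in groups for value in group]
    grand_mean = fmean(all_values)
    group_means = [fmean(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of the values around their own group mean.
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical M-values for one probe in two groups.
control = [1.0, 2.0, 3.0]
treated = [2.0, 3.0, 4.0]
print(one_way_anova_f([control, treated]))
```

A larger F statistic means the group means differ more relative to the variation within groups; turning F into a P-value additionally requires the F distribution, which is omitted from this sketch.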
This outputs the Detect differential methylation task report list. Open the Task report from the task menu or double-click the data node.
The outputs of this task include significance as P-value and FDR step up, both of which are computed from the M-values. The LSMeans of the groups and the Difference are computed from the Beta-values.
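As an illustration of what a step-up FDR adjustment does, the sketch below implements the Benjamini-Hochberg procedure. That the FDR step up column corresponds to Benjamini-Hochberg is an assumption made for this example, and the `fdr_step_up` function and P-values are hypothetical.

```python
def fdr_step_up(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, keeping the
    # adjusted values monotone (this is the "step-up" part).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical P-values for four probes.
print(fdr_step_up([0.01, 0.04, 0.03, 0.005]))
```

Each adjusted value is at least as large as the raw P-value, so filtering on the adjusted column is the stricter choice when many probes are tested.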
Click the Optional columns button to add more column data including annotation from the Illumina manifest file.
Use the left filter panel to filter the results then click Generate filtered node.
The filtered node is now available in the Analyses pipeline.
Select the generated Filtered feature list data node and use the Exploratory analysis task menu dropdown to perform the Hierarchical clustering / heatmap task.
In the Hierarchical clustering / heatmap task settings, change Sample order to Assign order, select the attribute, then click Finish. This will order the heatmap rows based on the attribute order assigned.
Completion of this task will output the Hierarchical clustering / heatmap task results. Double-click this node or select Task report under Task results from the task menu to open the results in the Data viewer.
The heatmap visualization can be altered within the Data viewer using both the left menu and the in-plot controls.
Please click here for more information on using the Data viewer to modify visualizations.
Click Save as in the left menu to save the Data viewer session so that you can return to it later and make changes.
Select the Filtered feature list data node within the Analyses tab and choose the Gene set enrichment task under Biological interpretation in the task menu.
Use the Gene set enrichment task settings to change default parameters.
Choose the KEGG database. Click the dropdown to change the selection; select New library to add the most recent library available.
Check that Select feature identifier is set to Gene Symbol or Gene IDs so that genes, rather than probe IDs, are used for this task.
After optimizing the task settings, click Finish.
The output of the Gene set enrichment task node is the Pathway enrichment data node.
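Enrichment P-values of this kind are often computed with a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test), which asks how surprising the overlap between the filtered gene list and a pathway is. The sketch below shows that general technique; the statistical test actually used by the Gene set enrichment task is not stated here, and the function name and counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_in_set, n_selected, n_overlap):
    """Probability of seeing at least n_overlap gene-set members when
    n_selected genes are drawn at random from n_background genes,
    of which n_in_set belong to the gene set."""
    total = comb(n_background, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 4 in the pathway,
# 3 in the filtered list, 2 of which are in the pathway.
print(enrichment_p_value(10, 4, 3, 2))
```

A small P-value indicates that the pathway is over-represented in the filtered list relative to chance.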
Filter the KEGG pathway gene sets by selecting the Pathway enrichment data node then clicking the Filter gene sets task from the task menu.
Modify the Filter by parameters to include P-value < 0.050 then click Finish.
Completion of this task will output a Filter gene sets task node and a Filtered list data node in the Analyses tab. Open the filtered gene sets by selecting the Filtered list data node and clicking Task report in the task menu on the right.
Because we filtered the gene sets to fewer than 100 rows, click the button to View plots in the Data Viewer. The filtering step can also be performed within the Pathway enrichment report.
This opens the filtered list report in the Data viewer for further modification.
Please click here for more information on using the Data viewer to modify visualizations.
Save this Data viewer session under a meaningful name, using the Save as button in the left menu, so that you can revisit it for future analysis.
This saved session is accessible by selecting the project Data viewer tab or by clicking Data Viewer within the breadcrumb (shown above the Data viewer canvas).
Please click here for more information on the Gene set enrichment task, including the interactive KEGG pathway maps.
This completes the example analyses pipeline.
Upon project creation you will land in the Analyses tab, which prompts you to add sample data to the project. Click the blue Add data button.
When available, hover over the tooltips or open the video help for guidance when choosing task settings.
To save the full individual image within the Data viewer to your machine, click the in-plot Export image icon in the top right corner of the image, choose All data then select the format, size, and resolution and click Save.