# Detecting differential expression in RNA-Seq data

During import, you created a categorical attribute called *Tissue* and assigned the four samples to either the *muscle* or the *not muscle* group. This step created replicates within each group. The grouping is somewhat artificial and is used in this tutorial only to illustrate ANOVA with a small data set. Replicates are a prerequisite for differential expression analysis using ANOVA.

* Select **Differential Expression Analysis** from the *Analyze Known Genes* section of the *RNA-Seq* workflow

The *Differential Expression Analysis* dialog offers the choice of analyzing at the Gene, Transcript, or Exon level.

* Select **Gene-level**
* Specify the **1/gene\_rpkm (RNA-Seq\_results.gene.rpkm)** spreadsheet from the *Spreadsheet* drop-down menu (Figure 1)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-675de43a53ea5c205189aae68ccd46bc066b51bb%2Ffig24.png?alt=media)

Figure 1. Choosing the type of differential expression analysis

* Select **OK** to open the *ANOVA* dialog

Available factors are listed in the *Experimental Factor(s)* panel on the left-hand side of the dialog.

* Select **Tissue**, then select **Add Factor >** to move **Tissue** to the *ANOVA Factor(s)* panel on the right-hand side of the dialog (Figure 2)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-3c1474d862a1fb4f3afe661d436ede4761250c43%2F2017-08-07%2015_08_14-ANOVA%20of%20Spreadsheet%201_gene_rpkm.png?alt=media)

Figure 2. The ANOVA dialog

If the ANOVA were performed now (without contrasts), a p-value for differential expression would be calculated, but it would only indicate whether there are differences within the factor *Tissue*. It would not tell you which groups differ or give any information on the magnitude of the difference between groups (fold-change or ratio). To get this more specific information, you need to define linear contrasts.
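To make this distinction concrete, here is a minimal Python sketch. It is not Partek's implementation. It computes a one-way ANOVA F statistic for a single gene and, separately, a ratio of group means. The RPKM values and the function name `one_way_f` are hypothetical, and the ratio is a plain ratio of arithmetic means, which may not match exactly how the software derives it.

```python
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of replicates around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical RPKM values for one gene, two replicates per group.
muscle = [10.0, 14.0]
not_muscle = [2.0, 4.0]

f_forward = one_way_f([muscle, not_muscle])
f_swapped = one_way_f([not_muscle, muscle])

# A contrast-style summary: direction and magnitude of the difference.
ratio = mean(muscle) / mean(not_muscle)

print("F (muscle, not muscle):", f_forward)
print("F (not muscle, muscle):", f_swapped)
print("ratio muscle / not muscle:", ratio)
```

Swapping the order of the groups leaves the F statistic unchanged, so the overall test alone cannot say which group is higher or by how much. The ratio carries that information, and a contrast is what reports it.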

* Select **Contrasts...** to open the *Configure* dialog
* For *Select Factor/Interaction*, **Tissue** is the only factor available because it was the only factor included in the ANOVA model in the previous step; if multiple factors had been included, they could be selected from the *Select Factor/Interaction* drop-down menu. The levels of this factor are listed in the *Candidate Level(s)* panel on the left side of the dialog
* For this data set, verify that **No** is selected for *Data is already log transformed?*
* Select **muscle** in the *Candidate Level(s)* panel and move it to the *Group 1* panel (renamed *muscle*) by selecting **Add Contrast Level >** in the top half of the dialog. *Label 1* changes to the subgroup name automatically, but you can also specify the label manually
* Select **not muscle** from the *Candidate Level(s)* panel and move it to the *Group 2* panel (renamed *not muscle*)
* The **Add Contrast** button is now enabled; select it to add the contrast (Figure 3)
* Select **OK** to return to the *ANOVA* dialog

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-09919b26bd10cb268baea94da814bbb16c86ea19%2Ffig26.png?alt=media)

Figure 3. Defining linear contrasts

* Select **OK** to perform the ANOVA as configured (Figure 4)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-3ff779d5ba72e0e4ec190ca02b5f7e10c9e1a130%2F2017-08-07%2015_10_59-ANOVA%20of%20Spreadsheet%201_gene_rpkm.png?alt=media)

Figure 4. Fully configured ANOVA

Once the ANOVA has been performed on each gene in the data set, an ANOVA child spreadsheet *ANOVA-1way (ANOVAResults)* will appear under the *gene\_rpkm* spreadsheet (Figure 5). The format of the ANOVA spreadsheet is similar for all workflows. Mouse over each column title for a description of the column contents.

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-61a2b90bbdc466b432af54241918261a029c291f%2F2017-08-07%2015_58_10-Partek%20Genomics%20Suite%20-%201_gene_rpkm_ANOVA-1way%20\(ANOVAResults\).png?alt=media)

Figure 5. Viewing ANOVA results

In this tutorial, the overall p-value for the factor (column 4) is the same as the p-value for the linear contrast (column 5) because there are only two levels within *Tissue*. If there were more than two groups, the overall p-value and the linear contrast p-values would most likely differ. Several genes with a low p-value show a ? symbol in the ratio and fold-change columns (6 and 7). For these genes, one of the groups has zero reads, so the ratio and fold-change between groups cannot be calculated.
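The undefined ratio can be illustrated with a short Python sketch, under stated assumptions. The function name `ratio_and_fold_change` and the RPKM values are hypothetical. The sketch assumes a signed fold-change convention (a ratio below 1 is reported as the negative reciprocal) and a plain ratio of arithmetic group means, neither of which is confirmed here as the software's exact calculation.

```python
from statistics import mean


def ratio_and_fold_change(group1, group2):
    """Return (ratio, fold_change) for group1 vs. group2.

    Returns (None, None) when either group mean is zero, mirroring the
    "?" placeholder: the ratio or its reciprocal would divide by zero.
    """
    m1, m2 = mean(group1), mean(group2)
    if m1 == 0 or m2 == 0:
        return None, None
    ratio = m1 / m2
    # Assumed convention: ratios below 1 become negative fold-changes.
    fold_change = ratio if ratio >= 1 else -1 / ratio
    return ratio, fold_change


# Hypothetical RPKM values, two replicates per group.
print(ratio_and_fold_change([10.0, 14.0], [2.0, 4.0]))   # higher in group 1
print(ratio_and_fold_change([2.0, 4.0], [10.0, 14.0]))   # lower in group 1
print(ratio_and_fold_change([0.0, 0.0], [5.0, 7.0]))     # zero reads in group 1
```

A gene with zero reads in one group can still have a very low p-value because the group means clearly differ. Only the ratio-based summaries become undefined.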

To use ANOVA with more complicated experimental designs, including multiple factors and linear contrasts, please refer to [Identifying differentially expressed genes using ANOVA](https://help.partek.illumina.com/partek-genomics-suite/tutorials/gene-expression-analysis/identifying-differentially-expressed-genes-using-anova) in the Gene Expression Analysis tutorial.

## Additional Assistance

If you need additional assistance, please visit [our support page](http://www.partek.com/support) to submit a help ticket or find phone numbers for regional support.
