
# Gene Set Enrichment

* [What is Gene set enrichment?](#what-is-gene-set-enrichment)
* [Running Gene set Enrichment](#running-gene-set-enrichment)
* [Task report](#task-report)
* [Interactive KEGG pathway maps](#interactive-kegg-pathway-maps)
  * [Coloring the map](#coloring-the-map)
  * [Feature details](#feature-details)
* [Visualizing gene set enrichment results](#visualizing-gene-set-enrichment-results)
* [References](#references)

***

## What is Gene set enrichment?

Enrichment analysis is a technique commonly used to add biological context to a list of genes, such as a list of significant genes filtered from a differential analysis report. The procedure assigns genes to groups (gene sets) and then uses Fisher's exact test to find groups that are overrepresented in the filtered gene list.
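As a rough illustration of the test described above, the following Python sketch computes a Fisher's exact p-value for over-representation from the four cell counts of a 2x2 table. The function and argument names are hypothetical. The one-sided alternative is also an assumption: this page does not state which alternative Partek Flow uses, so the values need not match the task report.

```python
from math import comb


def fisher_overrepresentation_p(in_set_in_list, in_set_not_list,
                                not_set_in_list, not_set_not_list):
    """One-sided Fisher's exact p-value: P(overlap >= observed overlap).

    Assumption: a one-sided (over-representation) test is used here for
    illustration; it is not confirmed to be what Partek Flow computes.
    """
    a, b, c, d = (in_set_in_list, in_set_not_list,
                  not_set_in_list, not_set_not_list)
    set_size = a + b      # genes in the gene set
    list_size = a + c     # genes in the filtered list
    total = a + b + c + d  # all background genes

    # Sum hypergeometric probabilities for overlaps at least as large as observed.
    upper = min(set_size, list_size)
    favorable = sum(
        comb(set_size, k) * comb(total - set_size, list_size - k)
        for k in range(a, upper + 1)
    )
    return favorable / comb(total, list_size)


# Made-up counts: 8 of the 20 genes in a set appear in a 100-gene list
# drawn from a 1000-gene background.
p = fisher_overrepresentation_p(8, 12, 92, 888)
print(p)
```

Only the Python standard library is used, so the sketch can be run without installing a statistics package.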

## Running Gene set Enrichment

The Gene set enrichment task can be invoked on a differential analysis output (or filtered differential analysis output) data node or on a filtered count matrix data node. Because the data node that includes all features serves as the background, always invoke this task on a data node containing a subset of the features to get a meaningful result. Only gene names are used in the computation.

* Click a **Feature list** data node
* Click the **Biological interpretation** section of the toolbox
* Click **Gene set enrichment**
* There are two options for *Database*. **KEGG database** requires a special license (Figure 1).

<figure><img src="/files/tmBFaJHwRws7KPXBoofe" alt=""><figcaption><p>Figure 1. Selecting KEGG database</p></figcaption></figure>

* **Gene set database** is a user-defined database; see the [Adding a Gene Set](/partek-flow/user-manual/settings/components/library-file-management/adding-a-gene-set.md) chapter for details. The gene sets available for the current *Assembly* are listed in the *Gene set database* drop-down list (Figure 2). The assembly is selected automatically, if possible. If the assembly cannot be detected, you can specify it using the drop-down.

Partek distributes Gene Ontology (GO) gene sets for the human and mouse genomes. GO is a bioinformatics initiative to unify the representation of gene and gene product attributes across species \[1, 2].

<figure><img src="/files/q5om3XRMTKM2VGhXjT87" alt=""><figcaption><p>Figure 2. Selecting user defined gene set database</p></figcaption></figure>

* **Select feature identifier** (optional) can be used to specify the feature format (e.g. Gene name, Gene ID, Feature ID).
* **Specify the background gene list** (optional) can be used for a feature list. Select the list using the drop-down. [Click here for more information on List management](/partek-flow/user-manual/settings/components/lists.md).

The background gene list is used as the list of possible genes. By default, these are the genes included in the selected gene set database. If your assay limits the genes that could be detected, you may want to specify a background list.
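To see why the choice of background matters, the sketch below computes the overlap expected by chance between a gene list and a gene set, which is the mean of the hypergeometric distribution underlying Fisher's exact test. The function name and all numbers are hypothetical and chosen only for illustration.

```python
def expected_overlap(list_size, set_size, background_size):
    """Overlap expected by chance when list_size genes are drawn at random
    from a background that contains set_size genes of the gene set."""
    return list_size * set_size / background_size


# Same 100-gene list and 50-gene set, two different (made-up) backgrounds.
whole_database = expected_overlap(100, 50, 20000)  # 0.25 genes expected
targeted_assay = expected_overlap(100, 50, 2000)   # 2.5 genes expected
print(whole_database, targeted_assay)
```

With the smaller background, ten times more overlap is expected by chance, so the same observed overlap is weaker evidence of enrichment.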

* Click **Finish** to run

The result is stored under an *Enrichment task* node. To open it, **double-click** the node or select the respective **Task report** from the context-sensitive menu.

## Task report

Figure 3 shows an example *Gene set enrichment* task report using the GO database. The table contains one gene set per row (*Gene set* column; the column entries are hyperlinks when using the distributed GO gene sets), with the category name in the *Description* column. The categories are ranked by the *Enrichment score*, which is the negative natural logarithm of the enrichment p-value (*P-value* column) derived from Fisher's exact test on the underlying contingency table. The higher the enrichment score, the more overrepresented the GO category is within the input list of significant genes. The columns can be searched by typing a search term in the respective box and hitting **Enter**, or sorted by selecting the **double arrow** icon ( ![](/files/I99hFKc5ONQpq7if86zc) ).

<figure><img src="/files/zXORSZ6yeCGfMnZhwpt7" alt=""><figcaption><p>Figure 3. GO enrichment report (truncated). Gene set column contains Gene Ontology identifiers (hyperlinks). Category labels are in the Description column. Enrichment score: negative natural logarithm of the enrichment P-value derived from the Fisher's exact test. Genes in list: number of genes that are present both in the list of significant genes and the gene set (GO category). Genes not in list: number of genes that are present in the gene set, but are not present in the list of significant genes. The column on the right contains links to gene breakdown chart and extra details</p></figcaption></figure>
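The relationship between the *P-value* and *Enrichment score* columns can be sketched in a few lines of Python. The function names are hypothetical; only the definition (negative natural logarithm of the p-value) comes from this page.

```python
import math


def enrichment_score(p_value):
    """Enrichment score as defined above: -ln(p-value)."""
    return -math.log(p_value)


def p_value_from_score(score):
    """Inverse mapping: recover the p-value from an enrichment score."""
    return math.exp(-score)


print(round(enrichment_score(0.05), 2))   # about 3.0
print(round(enrichment_score(0.001), 2))  # about 6.91
```

Because the logarithm is natural rather than base 10, a score of about 3 corresponds to a p-value of about 0.05, which can help when choosing a score cut-off.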

The contingency table (Figure 4) can be displayed by selecting the **View gene breakdown chart** icon on the right (![](/files/6QfN4mf9ETPpQkYll8I8)). The term "list" refers to the list of significant genes, while the term "set" refers to the respective GO category. The first row of the contingency table is also seen in the report, namely the *Genes in list* and *Genes not in list* columns.

<figure><img src="/files/NVegvlgd9LQGiU0elDjH" alt=""><figcaption><p>Figure 4. Contingency table used to calculate the enrichment p-value. List refers to the list of significant genes, set refers to the gene ontology category</p></figcaption></figure>
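The layout of the contingency table can be reproduced from three collections of gene names, as in the following sketch. The function name and the toy gene names are hypothetical, and dropping genes that fall outside the background is an assumption, since this page does not describe how such genes are handled.

```python
def contingency_table(gene_list, gene_set, background):
    """Build the 2x2 table: rows are in set / not in set,
    columns are in list / not in list."""
    background = set(background)
    # Assumption: genes outside the background are ignored.
    in_list = set(gene_list) & background
    in_set = set(gene_set) & background

    in_set_in_list = len(in_set & in_list)        # "Genes in list" column
    in_set_not_list = len(in_set - in_list)       # "Genes not in list" column
    not_set_in_list = len(in_list - in_set)
    not_set_not_list = len(background - in_set - in_list)
    return [[in_set_in_list, in_set_not_list],
            [not_set_in_list, not_set_not_list]]


table = contingency_table(
    gene_list={"A", "B", "C"},
    gene_set={"B", "C", "D"},
    background={"A", "B", "C", "D", "E", "F"},
)
print(table)  # [[2, 1], [1, 2]]
```

The first row of the returned table corresponds to the *Genes in list* and *Genes not in list* columns of the report.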

The **View extra details** (![](/files/Ye51zl9mEITABZQ0YB3Z)) button provides additional information on the GO category (Figure 5). In addition to the details already given in the report, a full list of *Genes in list* and *Genes not in list* can be inspected and downloaded (**Download data**) to the local computer as a text file. Use the arrow to expand these sections.

<figure><img src="/files/nSzXn0bHGVpPd0S6EOrn" alt=""><figcaption><p>Figure 5. Gene ontology enrichment extra details</p></figcaption></figure>

As previously mentioned, if you are using the GO gene sets distributed by Partek, the GO identifiers in the first column are hyperlinks to the Gene Ontology web-site entries (an example shown in Figure 6).

<figure><img src="/files/b9mXJGlNZvjQzUgVdUgI" alt=""><figcaption><p>Figure 6. Selecting a GO category in the table report opens up a browser and displays additional information on that category via GO web-page</p></figcaption></figure>

## Interactive KEGG pathway maps

When the KEGG database is used, clicking a pathway ID in the *Gene set* column of the enrichment task report displays a KEGG pathway gene network map (Figure 7).

<figure><img src="/files/htdYn0XVv2OmRrPslqNq" alt=""><figcaption><p>Figure 7. KEGG pathway map</p></figcaption></figure>

Each rectangle on the map represents a gene product in the pathway. Gene products are mostly proteins coded by a gene or group of genes, but they could be RNA too. Related pathways are shown as large rounded rectangles. Chemical compounds, DNA or other molecules are shown as circles.

### Coloring the map

The pathway map is colored by the first fold-change column in the input *Feature list* data node. The control panel on the left can be used to configure the colors of the pathway map. In all options, rectangles colored white do not have gene information. Options for coloring include:

* Fixed color: all genes are colored black.
* Genes in list: all genes in the list are colored, default color is yellow, but this can be configured. Genes not in the list are black.
* Statistics in the gene list: e.g. FDR, p-value, fold change. Colors can be customized by clicking the color square.

### Feature details

Mousing over a rectangle shows the genes represented by the rectangle in a tooltip (Figure 8). Each gene is listed on its own row, together with all of its aliases in the KEGG database. Genes that are in the list and are used to color the rectangle are shown in bold.

<figure><img src="/files/HCW5RTTcH7Gy7ETqq3Ll" alt=""><figcaption><p>Figure 8. Checking genes represented by a rectangle</p></figcaption></figure>

On KEGG pathway maps that include chemical compounds, the chemical structure is shown in the tooltip on mouse-over (Figure 9).

<figure><img src="/files/kGDdhsWtKkrA5XWT8PRK" alt=""><figcaption><p>Figure 9. Viewing chemical compound</p></figcaption></figure>

Clicking a rectangle opens the page for that gene or group of genes on the KEGG website in a new tab in your web browser.

Click the Save image ![](/files/91YoOwWP9lJ13EMZfGkJ) icon to download a PNG file showing the configured KEGG pathway map to your local computer.

## Visualizing gene set enrichment results

If the gene set enrichment table has fewer than 100 results (rows), the categories can be visualized in the *Data Viewer*. Otherwise, a notification is displayed in the top left corner (Figure 10).

<figure><img src="/files/kXRPjSUddj5GWtzov6bX" alt=""><figcaption><p>Figure 10. If the table has more than 100 rows, visualization of results is not possible</p></figcaption></figure>

If needed, reduce the number of results, for instance by using a cut-off based on the enrichment score. Type the cut-off value in the text box beneath the *Enrichment score* and hit **Enter** (an example is shown in Figure 11). Once the number of results falls below 100, a link to the *Data Viewer* is displayed (Figure 11). Click the **View plots in Data Viewer** link to open a new *Data Viewer* session.

<figure><img src="/files/RWKYmNYV3Zuf9omqUOtE" alt=""><figcaption><p>Figure 11. Use the View plots in Data Viewer link to visualize the enrichment results. The link is not visible if the table contains more than 100 rows</p></figcaption></figure>
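The cut-off filtering described above can be mimicked on an exported copy of the table, for example to find a cut-off before applying it in the report. This is a sketch only: the row structure, the field name `enrichment_score`, and the helper names are hypothetical, and treating the cut-off as inclusive is an assumption about the text box behavior.

```python
def filter_by_enrichment_score(rows, min_score):
    """Keep rows whose enrichment score is at least min_score
    (inclusive comparison is an assumption)."""
    return [row for row in rows if row["enrichment_score"] >= min_score]


def can_open_data_viewer(rows, limit=100):
    """Per this page, plots are available when the table has fewer than 100 rows."""
    return len(rows) < limit


# Made-up table with 150 gene sets scored 0.0 .. 149.0.
rows = [{"gene_set": f"set_{i}", "enrichment_score": float(i)} for i in range(150)]
kept = filter_by_enrichment_score(rows, 60.0)

print(len(rows), can_open_data_viewer(rows))  # 150 False
print(len(kept), can_open_data_viewer(kept))  # 90 True
```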

Two plots are loaded into the Data Viewer (Figure 12). Both plots show the enrichment score (*Enrichment score* column of the gene ontology table) on the horizontal axis and the gene ontology categories (i.e. the ones present in the gene enrichment table) on the vertical axis. In addition, the plot on the left uses a color range to depict the enrichment *P-value* (green = low, red = high P-value).

The same functionality is available for pathway enrichment results.

<figure><img src="/files/ARSEa1Mj5lciPCtMq24N" alt=""><figcaption><p>Figure 12. Visualizing gene ontology results. Vertical axis shows the gene ontology categories present in the underlying gene ontology table</p></figcaption></figure>

## References

1. Ashburner M, Ball CA, Blake JA et al. Gene Ontology: tool for the unification of biology. *Nat Genetics.* 2000; 25:25-29.
2. The Gene Ontology Consortium. Gene Ontology Consortium: going forward. *Nucleic Acids Res*. 2015; 43:D1049-1056.

Recommended citations from the Geneontology.org website.

## Additional Assistance

If you need additional assistance, please visit [our support page](http://www.partek.com/support) to submit a help ticket or find phone numbers for regional support.

