# Importing the data set

The original experiment is listed on the Gene Expression Omnibus as GSE848. This tutorial uses only a subset of that experiment, which should be downloaded from the Partek website tutorial page, [Gene Expression Analysis with Batch Effects](http://s3.amazonaws.com/partekmedia/tutorials/microarray/Breast-Cancer-GE.zip).

* Download the zipped project folder, *Breast\_Cancer-GE.zip*
* Unzip the project folder to *C:/Partek Training Data/* or a directory of your choosing

Choose a directory that is easy to find later. Unzipping adds the *Breast\_Cancer-GE* project folder and a zipped annotation file to the selected directory.
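If you prefer to script the download and extraction rather than do it by hand, the two steps above can be sketched in Python as follows. This is a minimal sketch, not part of the Partek workflow: the function names `download_archive` and `extract_archive` are hypothetical helpers, the URL is the one linked above, and the destination path in the usage comment is only the example directory from this tutorial.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the tutorial link above.
TUTORIAL_URL = (
    "http://s3.amazonaws.com/partekmedia/tutorials/microarray/Breast-Cancer-GE.zip"
)


def download_archive(url, zip_path):
    """Download the archive at `url` to `zip_path` and return the path."""
    zip_path = Path(zip_path)
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(zip_path))
    return zip_path


def extract_archive(zip_path, dest_dir):
    """Extract every member of `zip_path` into `dest_dir`; return member names."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Example usage (hypothetical local paths; adjust to your own directory):
# target = Path("C:/Partek Training Data")
# archive_path = download_archive(TUTORIAL_URL, target / "Breast-Cancer-GE.zip")
# print(extract_archive(archive_path, target))
```

The sketch handles only the outer `.zip` archive. The included annotation file is a `.rar` archive, which the Python standard library cannot unpack, so it still needs a separate extraction tool.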

* Unzip the included annotation file, *HG\_U95Av2.na32.annot.rar*
* Move the annotation file, *HG\_U95Av2.na32.annot*, to the microarray libraries folder

By default, the microarray libraries folder will be located at *C:/Microarray Libraries*, but the location may vary depending on your operating system and configuration.

* Open Partek Genomics Suite
* Select (![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-7cf0ec06b0f22664b4179fdfdaf684871c1b6c97%2Fimage2017-8-23%2014_23_33.png?alt=media)) from the main command bar
* Navigate to the tutorial folder, *Breast\_Cancer-GE*
* Select *Breast\_Cancer.txt*
* Select **Open** (Figure 1)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-c4a2d0a5c6cab71d26ed1140af83d99d6b7f3ccf%2F2017-08-23%2013_48_35-Open%20Datafile.png?alt=media)

Figure 1. Opening a data file. The red Partek Genomics Suite icon is shown next to the data file (FMT file format).

The spreadsheet will open as *1 (Breast\_Cancer.txt)* (Figure 2).

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-412df62d9a09fe173911d859ee87d57889c52256%2F2017-08-23%2014_26_37-Partek%20Genomics%20Suite%20-%201%20\(Breast_Cancer.txt\).png?alt=media)

Figure 2. Breast\_Cancer.txt data file

The summary at the bottom of the spreadsheet shows that it contains 18 rows and 12,631 columns. The first column, *Filename*, lists the GEO GSM number, which also serves as an identifier for the microarray. *Treatment*, *Time*, and *Batch* are in columns 2, 3, and 4, respectively. Column 6 marks the beginning of the probesets. The data is log2-transformed.
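The dimensions reported in the summary can also be checked outside Partek Genomics Suite. The sketch below assumes that *Breast\_Cancer.txt* is plain tab-delimited text with a single header row followed by one row per sample. That layout is an assumption about the file, not something this tutorial states, and `summarize_spreadsheet` is a hypothetical helper name.

```python
import csv


def summarize_spreadsheet(path, delimiter="\t"):
    """Return (sample_rows, columns, first_four_header_names) for a text table.

    Assumes one header row followed by one row per sample; blank lines
    are not counted as rows.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader)
        n_rows = sum(1 for row in reader if row)
    return n_rows, len(header), header[:4]


# Example usage (hypothetical path; adjust to where you unzipped the project):
# rows, cols, first = summarize_spreadsheet(
#     "C:/Partek Training Data/Breast_Cancer-GE/Breast_Cancer.txt"
# )
# print(rows, cols, first)
```

If the layout assumption holds, the counts should match the 18 rows and 12,631 columns shown in the spreadsheet summary, and the first four header names should be *Filename*, *Treatment*, *Time*, and *Batch*.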

## Additional Assistance

If you need additional assistance, please visit [our support page](http://www.partek.com/support) to submit a help ticket or find phone numbers for regional support.
