
Single Node Installation

This document describes how to set up and configure a single-node Partek Flow license.

  • Docker Compose Deployment

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Installation Guide

Partek Flow is a web-based application for genomic data analysis and visualization. It can be installed on a desktop computer, computer cluster or cloud. Users can then access Partek Flow from any browser-enabled device, such as a personal computer, tablet or smartphone.

Read on to learn about the following installation topics:

  • Minimum System Requirements

  • Single Cell Toolkit System Requirements

  • Single Node Installation

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Minimum System Requirements

Web Browser Requirements

Regardless of whether Partek Flow is installed on a server or the cloud, users will be interacting with the software using a web browser. We support the latest Google Chrome, Mozilla Firefox, Microsoft Edge and Apple Safari browsers. While we make an effort to ensure that Partek Flow is robust, note that some browser plugins may affect the way the software is viewed on your browser.


Hardware Requirements (Single-node Linux)


    If you are installing Partek Flow on your own single-node server, we require the following for successful installation:

    • Linux: Ubuntu® 18.04, Redhat® 8, CentOS® 8 or later versions of these distributions

    • 64-bit 2 GHz quad-core processor¹

    • 48 GB of RAM²

    • > 2TB of storage available for data

    • > 100GB on the root partition

    • A broadband internet connection

    We support Docker-based installations. Please contact [email protected] for more information.

    ¹ Note that some analyses have higher system requirements. For example, to run the STAR aligner on a reference genome of size ~3 GB (such as human, mouse or rat), 16 cores are required.

    ² Input sample file size can also impact memory usage, which is particularly the case for TopHat alignments.

    Increasing hardware resources (cores, RAM, disk space, and speed) will allow for faster processing of more samples.
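    As a quick sanity check before installing, you can verify the core count, memory, and free disk space of a candidate server from a terminal (standard Linux utilities; the paths shown are only examples):

    nproc             # number of CPU cores (4 or more recommended)
    free -g           # total RAM in GB (48 GB or more recommended)
    df -h / /home     # free space on the root partition and the data partition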

    If you are licensed for the Single Cell Toolkit, please see Single Cell Toolkit System Requirements for amended hardware requirements.

    Hardware Requirements (Cluster or Cloud)

    Please contact Partek Technical Support if you would like to install Partek Flow on your own HPC or cloud account. We will assist in assessing your hardware needs and can make recommendations regarding provisioning sufficient resources to run the software.

    Storage Recommendations

    Proper storage planning is necessary to avoid future frustration and costly maintenance. Here are several DO's and DO NOT's:

    DO:

    • Plan for at least 3 to 5 times more storage than you think is necessary. Investing extra funds in storage and storage performance is always worth it.

    • Keep all Flow data on a single partition that is expandable, such as RAID or LVM.

    • Back up your data, especially the Partek Flow database.

    DO NOT:

    • Store data on 'removable' USB drives. Partek Flow will not be able to see these drives.

    • Store data across multiple partitions or folder locations. This will increase the maintenance burden substantially.

    • Use non-Linux file systems like NTFS.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.


    Uninstalling Partek Flow

    Linux

    Open a terminal window and enter the following command.

    Debian/Ubuntu:

    $ sudo apt-get remove partekflow

    RedHat/Fedora/CentOS:

    $ sudo yum remove partekflow

    The uninstall removes binaries only (/opt/partek_flow). The logs, database (partek_db) and files in the /home/flow/.partekflow folder will remain unaffected.

    MacOS
    1. Stop and quit Partek Flow using the Partek Flow app in the menu.

    2. Using Finder, delete the Partek Flow application from the Applications folder.

    Figure 1. Control of Partek Flow through the menu bar

    This process does not delete data or the library files. Users who wish to delete those can delete them using Finder or terminal. The default location of project output files and library files is the /FlowData directory under the user's home folder. However, the actual location may vary depending on your System or Project specific settings.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.


    Updating Partek Flow

    Before performing updates, we recommend Backing Up the Database.

    For a Tomcat build update, download the latest version:

    wget --content-disposition http://packages.partek.com/linux/flow
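    The update itself then follows the same pattern as the cluster update described later in this guide. A minimal sketch, assuming Partek Flow was unpacked in the flow user's home directory, is:

    ~/partek_flow/stop_flow.sh                                # stop the running server
    tar -czvf partek-db-bkp-$(date +%F).tgz ~/.partekflow     # back up the database folder
    mv partek_flow partek_flow_prev                           # keep the old install
    unzip PartekFlow*.zip                                     # unpack the newly downloaded build
    ~/partek_flow/start_flow.sh                               # start the updated server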

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.


    Java KeyStore and Certificates

    • Java Keystore

    • Adding a certificate to the KeyStore

    Java Keystore

    JKS, or Java KeyStore, is used in Flow for some very specific scenarios that involve encryption and require asymmetric keys.

    Partek Flow ships with its own Java KeyStore; the file is found at .../partek_flow/distrib/flowkeystore, where you may want to add your public and private certificates.

    Adding a certificate to the KeyStore

    If you already have a certificate please skip to the next step.

    Create a certificate

    [~] openssl genrsa -out flow.key 2048
    [~] openssl ecparam -genkey -name secp384r1 -out flow.key
    [~] openssl req -new -x509 -sha256 -key flow.key -out flow.crt -days 3650

    Please place the key in a secure folder (it is advisable to place it in Flow's home directory, e.g. /home/flow/keys).

    The commands above are meant to be run in a terminal. There are other ways to create a certificate, but they are not covered here.

    If you wish to understand the flags used above please refer to the OpenSSL documentation.

    Import a certificate into flowkeystore

    For this step you will have to find where the cacerts file is located; it is under the Java installation. If you do not know how to find it, contact us and we can help.

    In this example the cacerts file is located at /usr/lib/jvm/java-11-openjdk-amd64/lib/security/cacerts:

    [~] keytool -import -file /home/flow/.partekflow/keys/flow.key -alias someName -keystore /usr/lib/jvm/java-11-openjdk-amd64/lib/security/cacerts -storepass changeit -noprompt
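    To confirm that the certificate was added, you can list the keystore contents and look for the alias you used (the path and default password here are the same as in the example above):

    [~] keytool -list -keystore /usr/lib/jvm/java-11-openjdk-amd64/lib/security/cacerts -storepass changeit | grep -i somename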

    Tell the JVM where to find the key

    We need to tell Partek Flow where the key is located. To do this, we will edit a file that contains some of the Flow settings.

    The file is usually located at /etc/partekflow.conf. If you do not have this file, we would advise using the .bashrc file of the system user that runs Partek Flow.

    At the end of that file please add:

    export CATALINA_OPTS="$CATALINA_OPTS -Djavax.net.ssl.trustStore=${HOME}/keys"

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Single Cell Toolkit System Requirements

    • Up to 100,000 cells per analysis

    • More than 100,000 cells per analysis

    Because of the large size of single cell RNA-Seq data sets and the computationally-intensive tools used in single cell analysis, we have amended our system requirements and recommendations for installations of Partek Flow with the Single Cell toolkit.

    Up to 100,000 cells per analysis

    Required

    • Linux: Ubuntu® 18.04, Redhat® 8, CentOS® 8, or newer

    • CPU: 64-bit 2 GHz quad-core processor

    • Memory: 64 GB of RAM

    • Local scratch space*: 1 TB with cached or native speeds of 2 GB/s or higher

    • Storage: > 2 TB available for data and > 100 GB on the root partition

    Recommended

    • Linux: Ubuntu® 18.04, Redhat® 8, CentOS® 8, or newer

    • CPU: 64-bit 2 GHz quad-core processor

    • Memory: 128 GB of RAM

    • Local scratch space*: 2 TB with cached or native speeds of 2 GB/s or higher

    • Storage: > 2 TB available for data and > 100 GB on the root partition

    More than 100,000 cells per analysis

    Required

    • Linux: Ubuntu® 18.04, Redhat® 8, CentOS® 8, or newer

    • CPU: 64-bit 2 GHz quad-core processor

    • Memory: 256 GB of RAM

    • Local scratch space*: 2 TB with cached or native speeds of 2 GB/s or higher

    • Storage: > 4 TB available for data

    Recommended

    • Linux: Ubuntu® 18.04, Redhat® 8, CentOS® 8, or newer

    • CPU: 64-bit 2 GHz quad-core processor

    • Memory: 512 GB of RAM

    • Local scratch space*: 10 TB with cached or native speeds of 2 GB/s or higher

    • Storage: 10 TB available for data

    For fastest performance:

    • Newer generation CPU cores with AVX2 or AVX-512 are recommended.

    • Performance scales proportionally to the number of CPU cores available.

    • Hyper-threaded cores (threads) scale performance for most operations other than principal component analysis.

    *Contact Partek support for recommended setup of local scratch storage

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.



    Creating Restricted User Folders within the Partek Flow server

    Partek Flow provides the infrastructure to isolate data from different users within the same server. This guide will provide general instructions on how to create this environment within Partek Flow. This can be modified to accommodate existing file systems already accessible to the server.

    Go to Settings > Directory permissions and restrict parent folder access (typically /home/flow) to Administrator accounts only

    Figure 1. Setting directory permission for administrators

    Click the Permit access to a new directory button and navigate to the folder with your library files (typically /home/flow/FlowData/library_files). Select the All users (automatically updated) checkbox to permit all users (including those that will be added in the future) to see the library files associated with the Partek Flow server

    Figure 2. Allow all users permission to see the library files

    Then go to System preferences > Filesystem and storage and set the Default project output directory to "Sample file directory"

    Figure 3. Set default project output directory

    Create your first user and select the Private directory checkbox. Specify where the private directory for that user is located

    Figure 4. Adding a user with a private directory

    If needed, you can create a user directory by clicking Browse > Create new folder

    Figure 5. Create a new private user folder

    This automatically sets browsing permissions for that private directory to that user

    Figure 6. Private directories automatically get restricted permissions

    When a user creates a project, the default project output directory is now within their own restricted folder

    Figure 7. Project output directory will now be within private directory

    More importantly, other users cannot see them

    Figure 8. Other users' directories are not visible

    Add additional users as needed

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Docker and Docker-compose

    • Docker

    • Useful commands

    • Docker-compose

    Docker

    Docker can be used alongside Partek Flow to deploy an easy-to-maintain environment that avoids dependency issues and can easily be relocated to a different server if needed.

    One can follow the Docker documentation to install Docker.

    Useful commands

    docker ps

    This command will output the details of the currently running containers including port forwarding, container name/id, and uptime.

    docker exec

    This command will allow us to enter the running container's environment to troubleshoot any issues we might have with the container.
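    For example, to open a shell inside the headnode container (the container name depends on how you started it; flowheadnode is assumed here from the compose file below):

    docker exec -it flowheadnode bash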

    Docker-compose

    “Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration.”

    Partek support will work with the customer to customize a docker-compose file that will have all the configuration necessary to run Partek Flow on any machine that meets our minimum system requirements.

    Below is an example docker-compose.yml file which can be used to deploy Partek Flow:

    services:
      flowheadnode:
        restart: unless-stopped
        image: public.ecr.aws/partek-flow/rtw:latest
        hostname: flowheadnode
        environment:
          # Set license location. Can be a server ex. @lic-test.example.com or /home/flow/.partekflow/license/Partek.lic
          - PARTEKLM_LICENSE_FILE=/home/flow/.partekflow/license/Partek.lic
        ports:
          - "8080:8080"
        # The MAC must match what is in the license file or defined on the license server. Please change this.
        networks:
          flexlm:
            mac_address: aa:bb:cc:dd:ee:ff
        # The internal path must be /home/flow
        volumes:
          # This uses an external path for the Flow data. The 'flow' user inside the container must be able to read and write to this directory.
          - /home/flow:/home/flow
    networks:
      flexlm:

    These are some of the important tags shown above:

    • restart: whether you want the container to be restarted automatically upon failure or system restart.

    • image: the container image tag. It is recommended to run the latest version of Partek Flow. If you need a specific version of Partek Flow, please see the release notes page.

    • environment: here we set up any environment variables to be run along the container.

    • port: the exposed port used to access Partek Flow via a web browser.

    • mac_address: this needs to match your license file.

    • volumes: in this section we specify the folder on the server to be shared with the container, this way we can better persist and access the files we create and use in the container.
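    Once the compose file is in place, the stack can typically be brought up and inspected with the standard Docker Compose commands (run from the directory containing docker-compose.yml):

    docker compose up -d      # start Partek Flow in the background
    docker compose logs -f    # follow the container logs
    docker compose down       # stop and remove the containers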

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Multi-Node Cluster Installation

    Partek Flow is a genomics data analysis and visualization software product designed to run on compute clusters. The following instructions assume the most basic setup of Partek Flow and must only be attempted by system administrators who are familiar with Linux-based commands. These instructions are not intended to be comprehensive. Cluster environments are highly variable, so there are no 'one size fits all' instructions. The installation procedure on a computer cluster is highly dependent on the type of computer cluster and the environment in which it is located. We aim to support a large array of Linux distributions and configurations. In all cases, Partek Technical Support will be available to assist with cluster installation and maintenance to ensure compatibility with any cluster environment. Please consult with Partek Licensing Support ([email protected]) for additional information.

    Prior to installation, make sure you have the license key related to the host-ID of the compute cluster the software will be installed in. Contact [email protected] for key generation.


  • Installation on a Computer Cluster

  • Integration with your queueing system

  • Bringing up workers

  • Shutting down workers

  • Updating Partek Flow

    Installation on a Computer Cluster

    Make a standard Linux user account that will run the Partek Flow server and all associated processes. It is assumed this account is synced between the cluster head node and all compute nodes. For this guide, we name the account flow.

    1. Log into the flow account and cd to the flow home directory

    cd /home/flow

    1. Download Partek Flow and the remote worker package

    wget --content-disposition http://packages.partek.com/linux/flow
    wget --content-disposition http://packages.partek.com/linux/flow-worker

    1. Unzip these files into the flow home directory /home/flow. This yields two directories: partek_flow and PartekFlowRemoteWorker

    2. Partek Flow can generate large amounts of data, so it needs to be configured to store the bulk of this data in the largest shared data store available. For this guide we assume that the directory is located at /shared. Adjust this path accordingly.

    3. It is required that the Partek Flow server (which runs on the head node) and remote workers (which run on the compute nodes) see identical file system paths for any directory Partek Flow has read or write access to. Thus /shared and /home/flow must be mounted on the Flow server and all compute nodes. Create the directory /shared/FlowData and allow the flow Linux account write access to it.

    4. It is assumed the head node is attached to at least two separate networks: (1) a public network that allows users to log in to the head node and (2) a private backend network that is used for communication between compute nodes and the head node. Clients connect to the Flow web server on port 8080 so adjust the firewall to allow inbound connections to 8080 over the public network of the head node. Partek Flow will connect to remote workers over your private network on port 2552 and 8443, so make sure those ports are open to the private network on the flow server and workers.

    5. Partek Flow needs to be informed of what private network to use for communication between the server and workers. It is possible that there are several private networks available (gigabit, infiniband, etc.) so select one to use. We recommend using the fastest network available. For this guide, let's assume that private network is 10.1.0.0/16. Locate the headnode hostname that resolves to an address on the 10.1.0.0/16 network. This must resolve to the same address on all compute nodes.

    6. For example:

    host head-node.local yields 10.1.1.200

    Open /home/flow/.bashrc and add this as the last line:

    export CATALINA_OPTS="$CATALINA_OPTS -Djava.awt.headless=true
    -DflowDispatcher.flow.command.hostname=head-node.local
    -DflowDispatcher.akka.remote.netty.tcp.hostname=head-node.local"

    Source .bashrc so the environment variable CATALINA_OPTS is accessible.
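    For example:

    source ~/.bashrc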

    NOTE: If workers are unable to connect (below), then replace all hostnames with their respective IPs.

    1. Start Partek Flow

    ~/partek_flow/start_flow.sh

    1. You can monitor progress by tailing the log file partek_flow/logs/catalina.out. After a few minutes, the server should be up.

    2. Make sure the correct ports are bound

    netstat -tulpn

    1. You should see 10.1.1.200:2552 and :::8080 as LISTENing. Inspect catalina.out for additional error messages.

    2. Open a browser and go to http://localhost:8080 on the head node to configure the Partek Flow server.

    3. Enter the license key provided (Figure 1)

    Figure 1. Setting up the Partek Flow license during installation

    1. If there appears to be an issue with the license or there is a message about 'no workers attached', then restart Partek Flow. It may take 30 sec for the process to shut down. Make sure the process is terminated before starting the server back up:

    ~/partek_flow/stop_flow.sh

    Then run:

    ~/partek_flow/start_flow.sh

    1. You will now be prompted to setup the Partek Flow admin user (Figure 2). Specify the username (admin), password and email address for the administrator account and click Next

    Figure 2. Setting up the Partek Flow 'admin' account during installation

    1. Select a directory folder to store the library files that will be downloaded or generated by Partek Flow (Figure 3). All Partek Flow users share library files and the size of the library folder can grow significantly. We recommend at least 100GB of free space should be allocated for library files. The free space in the selected library file directory is shown. Click Next to proceed. You can change this directory after installation by changing system preferences. For more information, see Library file management.

    Figure 3. Selecting the library file directory

    1. To set up the Partek Flow data paths, click on Settings located on the top-right of the Flow server webpage. On the left, click on Directory permissions then Permit access to a new directory. Add /shared/PartekFlow and allow all users access.

    2. Next click on System preferences on the left menu and change data download directory and default project output directory to /shared/PartekFlow/downloads and /shared/PartekFlow/project_output respectively

    Note: If you do not see the /shared folder listed, click on the Refresh folder list link that is toward the bottom of the download directory dialog

    1. Since you do not want to run any work on the head node, go to Settings>System preferences>Task queue and job processing and uncheck Start internal worker at Partek Flow server startup.

    2. Restart the Flow server:

    ~/partek_flow/stop_flow.sh

    After 30 seconds, run:

    ~/partek_flow/start_flow.sh

    This is needed to disable the internal worker.

    1. Test that remote workers can connect to the Flow server

    2. Log in as the flow user to one of your compute nodes. Assume the hostname is compute-0. Since your home directory is exported to all compute nodes, you should be able to go to /home/flow/PartekFlowRemoteWorker/

    3. To start the remote worker:

    ./partekFlowRemoteWorker.sh head-node.local compute-0

    1. These two addresses should both be in the 10.1.0.0/16 address space. The remote worker will output to stdout when you run it. Scan for any errors. You should see the message woot! I'm online.

    2. A successfully connected worker will show up on the Resource management page on the Partek Flow server. This can be reached from the main homepage or by clicking Resource management from the Settings page. Once you have confirmed the worker can connect, kill the remote worker (CTRL-C) from the terminal in which you started it.

    3. Once everything is working, return to library file management and add the genomes/indices required by your research team. If Partek hosts these genomes/indices, they will automatically be downloaded by Partek Flow.

    Integration with your queueing system

    1. In effect, all you are doing is submitting the following command as a batch job to bring up remote workers:

    /home/flow/PartekFlowRemoteWorker/partekFlowRemoteWorker.sh head-node.local compute-0

    1. The second parameter for this script can be obtained automatically via:

    $(hostname -s)

    Bringing up workers

    Bring up workers by running the command below. You only need to run one worker per node:

    /home/flow/PartekFlowRemoteWorker/partekFlowRemoteWorker.sh head-node.local compute-0
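    As a sketch of the queueing-system integration described above, assuming your cluster uses SLURM (adjust the directives and paths for your own scheduler and site), a minimal batch script that brings up one worker on the allocated node could look like:

    #!/bin/bash
    #SBATCH --job-name=flow-worker
    #SBATCH --nodes=1
    # Start one Partek Flow remote worker on this node; the second argument is the node's own short hostname
    /home/flow/PartekFlowRemoteWorker/partekFlowRemoteWorker.sh head-node.local $(hostname -s)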

    Shutting down workers

    Go to the Resource management page and click on the Stop button (red square) next to the worker you wish to shut down. The worker will shut down gracefully; that is, it will wait for currently running work on that node to finish, then it will shut down.

    Updating Partek Flow

    For the cluster update, you will get links to .zip files for Partek Flow and the remote Flow worker from Partek support. All of the following actions should be performed as the Linux user that runs Flow. Do NOT run Flow as root.

    1. Go to the Flow installation directory. This is usually the home directory of the Linux user that runs Flow and it should contain a directory named "partek_flow". The location of the Flow install can also be obtained by running ps aux | grep flow and examining the path of the running Flow executable.

    2. Shut down Flow:

    ./partek_flow/stop_flow.sh

    1. Download the new version of Flow and the Flow worker:

    wget --content-disposition http://packages.partek.com/linux/flow-release
    wget --content-disposition http://packages.partek.com/linux/flow-worker-release

    1. Make sure Flow has exited:

    ps aux | grep flow

    The flow process should no longer be listed.

    1. Unpack the new version of Flow install and backup the old install:

    mv partek_flow partek_flow_prev
    mv PartekFlowRemoteWorker PartekFlowRemoteWorker_prev

    1. Backup the Flow database folder. This should be located in the home directory of the user that runs Flow.

    tar -czvf partek-db-bkp-date.tgz ~/.partekflow

    1. Start the updated version of Flow:

    ./partek_flow/start_flow.sh
    tail -f partek_flow/logs/catalina.out

    (make sure there is nothing of concern in this file when starting up Flow. You can stop the file tailing by typing: CTRL-C)

    You may also want to examine the main Flow log for errors:

    ~/.partekflow/logs/flow.log
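    For example, recent errors can be pulled out of that log with a simple grep:

    grep -i error ~/.partekflow/logs/flow.log | tail -n 50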

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.


    Dependencies

    • CNVkit

    • DESeq2

    • HTSeq

    • MACS3

    • Python

    • R

    • Variant Effect Predictor

    • DECoN

    Flow ships with tasks that do not have all of their dependencies included. On startup Flow will attempt to install the dependencies, but not every system is equipped to install them.

    In the case of any difficulties, it is highly recommended to use a Docker deployment instead (cluster installations may require Singularity instead, which is still somewhat a work in progress).

    CNVkit

    Requires Python 2.7 or later.

    On startup Flow will attempt to install additional python packages using the command

    pip install --user cnvkit==0.9.5

    Requires R 3.2.3 or later.

    On startup Flow will attempt to install additional R packages.

    There are cascading dependencies, but you can view the core libraries in partek_flow/bin/cnvkit-0.8.5/install.R

    If these packages can't be built locally, it may be possible for the user to download them from us (see below).

    DESeq2

    Requires R 3.0 or later.

    On startup Flow will attempt to install additional R packages.

    There are cascading dependencies, but you can view the core libraries in partek_flow/bin/deseq_two-3.5/install.R

    If these packages can't be built locally, it may be possible for the user to download them from us (see below).

    RcppArmadillo may also have dependencies on multi-threading shared objects that may not be on the LD_LIBRARY_PATH

    The recommendation is to copy those .so files to a folder and make sure it is available from the LD_LIBRARY_PATH when the server/worker starts.

    Additional dynamic libraries (such as libxml2.so) may be missing and we can provide a copy appropriate for the target OS.

    HTSeq

    Requires Python 2.7 or 3.4 or above

    On startup Flow attempts to install using pip

    MACS3

    Requires python 3.0 or above

    pip install --user numpy==1.19.5 Cython==0.29.30 cykhash==2.0.0 macs3==3.0.0a7

    Python

    If there are any conflicts with preinstalled python packages, Flow should be configured to run with its own virtual environment:

    pip install virtualenv
    virtualenv ~/.partekflow/.local
    source ~/.partekflow/.local/bin/activate
    pip install HTSeq==0.11.0
    pip install cnvkit==0.9.5

    or

    wget customer.partek.com/python-dependencies.zip
    unzip -d ~/.partekflow/ python-dependencies.zip

    R

    R can usually be installed from the package manager. If the user installs Flow via apt or yum it should already be installed.

    For older operating systems R is not available and will need to be installed from source.

    Currently, we offer a set of R packages compatible with some versions of R (3.2.3, 3.4.0, 3.4.3)

    Extract this file in the home directory. (Make .R a symlink if the home directory doesn't have enough free space)

    These packages include the dependencies for both CNVkit and DESeq2

    When running R diagnostic commands outside flow, it can simplify things if the environment includes a reference to the ~/.R folder:

    export R_LIBS_USER=$HOME/.R

    or load

    .libPaths("~/.R")

    in ~/.Rprofile

    list loaded packages:

    (.packages())

    get the version:

    packageVersion("packageName")
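    For instance, such a check can be run from a terminal in a single call (DESeq2 is used here only as an example package name):

    R_LIBS_USER=$HOME/.R Rscript -e 'packageVersion("DESeq2")'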

    Variant Effect Predictor

    This is a compiled Perl script (so it has no direct dependency on Perl itself); we have had one report (istem.fr) of it failing to run.

    DECoN

    DECoN comes pre-installed in the flow_dna container (registry.partek.com/flow_dna)

    Documentation on installing DECoN is available here: https://github.com/RahmanTeam/DECoN/blob/master/DECoN-v1.0.2.pdf

    DECoN requires R version 3.1.2

    It must be installed under /opt/R-3.1.2 or set the DECON_R environment variable to its folder

    R_HOME=/path/to/R
    wget http://cran.wustl.edu/src/base/R-3/R-3.1.2.tar.gz
    tar xfz R-3.1.2.tar.gz
    cd R-3.1.2
    ./configure --with-x=no && make

    Download DECoN from https://github.com/RahmanTeam/DECoN/archive/refs/tags/v1.0.2.zip

    and install it under /opt/DECoN or set the DECON_PATH environment variable to its folder

    You may need to add

    symlink.system.packages: TRUE

    to Linux/packrat/packrat.opts

    See also:

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.


    Kubernetes

    Below are the yaml documents which describe the bare minimum infrastructure needed for a functional Flow server. It is best to start with a single-node proof of concept deployment. Once that works, the deployment can be extended to multi-node with elastic worker allocation. Each section is explained below.

    The Flow headnode pod

    apiVersion: v1
    kind: Pod
    metadata:
      name: flowheadnode
      namespace: partek-flow
      labels:
        app.kubernetes.io/name: flowheadnode
        deployment: dev
    spec:
      securityContext:
        fsGroup: 1000
      containers:
        - name: flowheadnode
          image: xxxxxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/partek-flow:current-23.0809.22
          resources:
            requests:
              memory: "16Gi"
              cpu: 8
          env:
            - name: PARTEKLM_LICENSE_FILE
              value: "@flexlmserver"
            - name: PARTEK_COMMON_NO_TOTAL_LIMITS
              value: "1"
            - name: CATALINA_OPTS
              value: "-DFLOW_WORKER_MEMORY_MB=1024 -DFLOW_WORKER_CORES=2 -Djavax.net.ssl.trustStore=/etc/flowconfig/cacerts -Xmx14g"
          volumeMounts:
            - name: home-flow
              mountPath: /home/flow
            - name: flowconfig
              readOnly: true
              mountPath: "/etc/flowconfig"
      volumes:
        - name: home-flow
          persistentVolumeClaim:
            claimName: partek-flow-pvc
        - name: flowconfig
          secret:
            secretName: flowconfig

    Pod metadata

    On a kubernetes cluster, all Flow deployments are placed in their own namespace, for example namespace: partek-flow. The label app.kubernetes.io/name: flowheadnode allows binding of a service or used to target other kubernetes infrastructure to this headnode pod. The label deployment: dev allows running multiple Flow instances in this namespace (dev, tst, uat, prd, etc) if needed and allows workers to connect to the correct headnode. For stronger isolation, running each Flow instance in its own namespace is optimal.
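    For example, the namespace can be created up front (the name partek-flow matches the metadata used throughout these documents):

    kubectl create namespace partek-flow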

    Data storage

    The Flow docker image requires: 1) a writable volume mounted to /home/flow; 2) this volume needs to be readable and writable by UID:GID 1000:1000; 3) for a multi-node setup, this volume needs to be cross mounted to all worker pods. In this case, the persistent volume would be backed by some network storage device such as EFS, NFS, or a mounted FileGateway.

    spec:
      securityContext:
        fsGroup: 1000

    This section achieves goal 2).

    The flowconfig volume is used to override behavior for custom Flow builds and custom integrations. It is generally not needed for vanilla deployments.

    The Flow docker image

    Partek Flow is shipped as a single docker image containing all necessary dependencies. The same image is used for worker nodes. Most deployment-related configuration is set as environment variables. Auxiliary images are available for additional supporting infrastructure, such as flexlm and worker allocator images.

    Official Partek Flow images can be found on our release notes page. The image tags assume the format registry.partek.com/rtw:YY.MMMM.build. New rtw images are generally released several times a month. The image in the example above references a private ECR. It is highly recommended that the target image from registry.partek.com be loaded into your ECR. Image pulls will be much faster from AWS; this reduces the time to dynamically allocate workers. It also removes a single point of failure: if registry.partek.com were down it would impact your ability to launch new workers on demand.

    Flow headnode resource request

    Partek Flow uses the head node to handle all interactive data visualization. Additional CPU resources are needed for this; the more the better, and 8 is a good place to start. As for memory, we recommend 8 to 16 GiB. Resource limits are not included here, but are set to large values globally:

    # This allows us to create pods with only a request set, but not a limit set. Further tuning is recommended.
    apiVersion: v1
    kind: LimitRange
    metadata:
      name: partek-flow-limit-range
    spec:
      limits:
        - max:
            memory: 512Gi
            cpu: 64
          default:
            memory: 512Gi
            cpu: 64
          defaultRequest:
            memory: 4Gi
            cpu: 2
          type: Container

    Relevant Flow headnode environment variables

    PARTEKLM_LICENSE_FILE

    Partek Flow uses FlexLM for licensing. Currently we do not offer or have implemented any alternative. Values for this environment variable can be:

    @flexlmserveraddress

    An external flexlm server. We provide a Partek specific container image and detail a kubernetes deployment for this below. This license server can also live outside the kubernetes cluster; the only requirement is that it is network accessible.

    /home/flow/.partekflow/license/Partek.lic

    Use this path exactly. This path is internal to the headnode container and is persisted on a mounted PVC.

    Unfortunately, FlexLM is MAC address based and does not quite fit in with modern containerized deployments. There is no straightforward or native way for kubernetes to set the MAC address upon pod/container creation, so using a license file on the flowheadnode pod (/home/flow/.partekflow/license/Partek.lic ) could be problematic (but not impossible). In further examples below, we provide a custom FlexLM container that can be instantiated as a pod/service. This works by creating a new network interface with the requested MAC address inside the FlexLM pod.

    PARTEK_COMMON_NO_TOTAL_LIMITS

    Please leave this set at "1". Partek Flow need not enforce any limits as that is the responsibility of kubernetes. Setting this to anything else may result in Partek executables hanging.

    CATALINA_OPTS

    This is a hodgepodge of Java/Tomcat options. Parts of interest:

    -DFLOW_WORKER_MEMORY_MB=1024 -DFLOW_WORKER_CORES=2

    It is possible for the Flow headnode to execute jobs locally in addition to dispatching them to remote workers. These two options set resource limits on the Flow internal worker to prevent resource contention with the Flow server. If remote workers are not used and this remains a single-node deployment, meaning ALL jobs will execute on the internal worker, then it is best to remove the CPU limit (-DFLOW_WORKER_CORES) and only set -DFLOW_WORKER_MEMORY_MB equal to the kubernetes memory resource request.

    -Djavax.net.ssl.trustStore=/etc/flowconfig/cacerts

    If Flow connects to a corporate LDAP server for authentication, it will need to trust the LDAP certificates.

    -Xmx14g

    JVM heap size. If the internal worker is not used, set this to be a little less than the kubernetes memory resource request. If the internal worker is in use, and the intent is to stay with a single-node deployment, then set this to be ~25% of the kubernetes memory resource request, but no less than ~4 GiB.

    The Flow headnode service definition

    apiVersion: v1
    kind: Service
    metadata:
      name: flowheadnode
    spec:
      type: ClusterIP
      ports:
        - port: 80
          targetPort: 8080
          protocol: TCP
          name: http
        - port: 2552
          targetPort: 2552
          protocol: TCP
          name: akka
        - port: 8443
          targetPort: 8443
          protocol: TCP
          name: licensing
      selector:
        app.kubernetes.io/name: flowheadnode

    The flowheadnode service is needed 1) so that workers have a DNS name (flowheadnode) to connect to when they start and 2) so that we can attach an ingress route to make the Flow web interface accessible to end users. The app.kubernetes.io/name: flowheadnode selector is what binds this to the flowheadnode pod.

    • 80:8080 - Users interact with Flow entirely over a web browser

    • 2552:2552 - Workers communicate with the Flow server over port 2552

    • 8443:8443 - Partek executed binaries connect back to the Flow server over port 8443 to do license checks

    Ingress to flowheadnode

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: flowheadnode
      annotations:
        kubernetes.io/ingress.class: "nginx"
        nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    spec:
      rules:
        - host: flow.dev-devsvc.domain.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: flowheadnode
                    port:
                      number: 80

    This provides external users HTTPS access to Flow at host: flow.dev-devsvc.domain.com. Your details will vary. This is where we bind to the flowheadnode service.

    The flexlm service pod

    # On a NEW deployment, you need to exec into this pod and add the license file
    # to /usr/local/flexlm/licenses
    # After a license file is present, the flexlm daemon will start automatically

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: flexlmserver-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi     # flex.log is the only thing that slowly grows here
      storageClassName: gp2-ebs-sc
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: flexlmserver
    spec:
      type: ClusterIP
      ports:
        - port: 27000
          targetPort: 27000
          protocol: TCP
          name: flexmain
        - port: 27001
          targetPort: 27001
          protocol: TCP
          name: flexvendor
      selector:
        app.kubernetes.io/name: flexlmserver
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: flexlmserver
      namespace: partek-flow
      labels:
        app.kubernetes.io/name: flexlmserver
    spec:
      containers:
        - name: flexlmserver
          image: public.ecr.aws/partek-flow/kube-flexlm-server
          ports:
            - containerPort: 27000
            - containerPort: 27001
          resources:
            limits:
              memory: "256Mi"
              cpu: 1
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]
          volumeMounts:
            - name: flexlmserver-pvc
              mountPath: /usr/local/flexlm/licenses
      volumes:
        - name: flexlmserver-pvc
          persistentVolumeClaim:
            claimName: flexlmserver-pvc

    The yaml documents above will bring up a complete Partek-specific license server.

    Note that the service name is flexlmserver. The flowheadnode pod connects to this license server via the PARTEKLM_LICENSE_FILE="@flexlmserver" environment variable.

    You should deploy this flexlmserver first, since the flowheadnode will need it available in order to start in a licensed state.

    Partek will send a Partek.lic file licensed to some random MAC address. When this license is (manually) written to /usr/local/flexlm/licenses, the pod will continue execution by creating a new network interface using the MAC address in Partek.lic, then it will start the licensing service. This is why the NET_ADMIN capability is added to this pod.

    The license from Partek must contain VENDOR parteklm PORT=27001 so the vendor port remains at 27001 in order to match the service definition above. Without this, this port is randomly set by FlexLM.

    This image is currently available from public.ecr.aws/partek-flow/kube-flexlm-server but this may change in the future.
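    As a sketch of that first-time licensing step (the manifest file name here is a placeholder; adjust to however you saved the documents above):

    kubectl -n partek-flow apply -f flexlm.yaml                                          # create the flexlm PVC, service, and pod
    kubectl cp Partek.lic partek-flow/flexlmserver:/usr/local/flexlm/licenses/Partek.lic # copy in the license sent by Partek
    kubectl -n partek-flow logs flexlmserver                                             # confirm the licensing service started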


    Single Node Amazon Web Services Deployment

    • Creating a New Elastic Compute Cloud Instance for Partek Flow Software

    • Enabling External Access to the Partek Flow Elastic Compute Cloud Instance

    • Attaching the Amazon Elastic Block Store Volume for Partek Flow Data Storage

    • Installing Partek Flow on a New Elastic Compute Cloud Instance

    • Partek Amazon Web Services Support

    • General Recommendations

    • Amazon Web Services Instance Type Resources and Costs

    • Elastic Block Store Volumes

    Creating a New Elastic Compute Cloud Instance for Partek Flow Software

    Note: This guide assumes that none of the items necessary for the Amazon Elastic Compute Cloud (EC2) instance exist yet, such as an Amazon Virtual Private Cloud (VPC), subnets, and security groups, thus their creation is covered as well.

    Log in to the Amazon Web Services (AWS) management console at https://console.aws.amazon.com

    Click on EC2

    Switch to the region intended to deploy Partek Flow software. This tutorial uses US East (N. Virginia) as an example.

    On the left menu, click on Instances, then click the Launch Instance button. The Choose an Amazon Machine Image (AMI) page will appear.

    Click the Select button next to Ubuntu Server 16.04 LTS (HVM), SSD Volume Type - ami-f4cc1de2. NOTE: Please use the latest Ubuntu AMI. It is likely that the AMI listed here will be out of date.

    Choose an Instance Type, the selection depends on your budget and the size of the Partek Flow deployment. We recommend m4.large for testing or cluster front-end operation, m4.xlarge for standard deployments, and m4.2xlarge for alignment-heavy workloads with a large user-base. See the section AWS instance type resources and costs for assistance with choosing the right instance. In most cases, the instance type and associated resources can be changed after deployment, so one is not locked into the choices made for this step.

    NOTE: New instance types will become available. Please use the latest mX instance type provided as it will likely perform better and be more cost effective than older instance types.

    On the Configure Instance Details page, make the following selections:

    • Set the number of instances to 1. An autoscaling group is not necessary for single-node deployments

    • Purchasing Option: Leave Request Spot Instances unchecked. This is relevant for cost-minimization of Partek Flow cluster deployments.

    • Network: If you do not have a virtual private cloud (VPC) already created for Partek Flow, click Create New VPC. This will open a new browser tab for VPC management.

    Use the following settings for the VPC:

    • Name Tag: Flow-VPC

    • IPv4 CIDR block: 10.0.0.0/16

    • Select No IPv6 CIDR Block

    • Tenancy: Default

  • Click Yes, Create. You may be asked to select a DHCP Option set. If so, then make sure the dynamic host configuration protocol (DHCP) option set has the following properties:

    • Options: domain-name = ec2.internal;domain-name-servers = AmazonProvidedDNS;

    • DNS Resolution: leave the defaults set to yes

    • DNS Hostname: change this to yes as internal DNS resolution may be necessary depending on the Partek Flow deployment

  • Once created, the new Flow-VPC will appear in the list of available VPCs. The VPC needs additional configuration for external access. To continue, right click on Flow-VPC and select Edit DNS Resolution, select Yes, and then Save. Next, right click the Flow-VPC and select Edit DNS Hostnames, select Yes, then Save.

  • Make sure the DHCP option set is set to the one created above. If it is not, right-click on the row containing Flow-VPC and select Edit DHCP Option Sets.

  • Close the VPC Management tab and go back to the EC2 Management Console.

  • Click the refresh arrow next to Create New VPC and select Flow-VPC.

  • Click Create New Subnet and a new browser tab will open with a list of existing subnets. Click Create Subnet and set the following options:

    • Name Tag: Flow-Subnet

    • VPC: Flow-VPC

    • VPC CIDRs: This should be automatically populated with the information from Flow-VPC

    • Availability Zone: It is OK to let Amazon choose for you if you do not have a preference

    • IPv4 CIDR block: 10.0.1.0/24

  • Stay on the VPC Dashboard Tab and on the left navigation menu, click Internet Gateways, then click Create Internet Gateway and use the following options:

    • Name Tag: Flow-IGW

    • Click Yes, Create

  • The new gateway will be displayed as Detached. Right click on the Flow-IGW gateway and select Attach to VPC, then select Flow-VPC and click Yes, Attach.

  • Click on Route Tables on the left navigation menu.

  • If it exists, select the route table already associated with Flow-VPC. If not, make a new route table and associate it with Flow-VPC. Click on the new route table, then click the Routes tab toward the bottom of the page. The route Destination = 10.0.0.0/16 Target = local should already be present. Click Edit, then Click Add another route and set the following parameters:

    • Destination: 0.0.0.0/0

    • Target set to Flow-IGW (the internet gateway that was just created)

  • Click Save

  • Close the VPC Dashboard browser tab and go back to the EC2 Management Console tab. Note that you should still be on Step 3: Configure Instance Details.

    Click the refresh arrow next to Create New Subnet and select Flow-Subnet.

    Auto-assign Public IP: Use subnet setting (Disable)

    Placement Group: No placement group

    IAM role: None.

    Note: For multi-node Partek Flow deployments or instances where you would like Partek to manage AWS resources on your behalf, please see Partek AWS support and set up an IAM role for your Partek Flow EC2 instance. In most cases a specialized IAM role is unnecessary and we only need instance ssh keys.

    Shutdown Behaviour: Stop

    Enable Termination Protection: select Protect against accidental termination

    Monitoring: leave Enable CloudWatch Detailed Monitoring disabled

    EBS-optimized Instance: Make sure Launch as EBS-optimized Instance is enabled. Given the recommended choice of an m4 instance type, EBS optimization should be enabled at no extra cost.

    Tenancy: Shared - Run a shared hardware instance

    Network Interfaces: leave as-is

    Advanced Details: leave as-is

    Click Next: Add Storage. You should be on Step 4: Add Storage

    For the existing root volume, set the following options:

    • Size: 8 GB

    • Volume Type: Magnetic

    • Select Delete on Termination

      • Note: All Partek Flow data is stored on a non-root EBS volume. Since only the OS is on the root volume and not frequently re-booted, a fast root volume is probably not necessary or worth the cost. For more information about EBS volumes and their performance, see the section EBS volumes.

    Click Add New Volume and set the following options:

    • Volume Type: EBS

    • Device: /dev/sdb (take the default)

    • Do not define a snapshot

    • Size (GiB): 500

      • Note: This is the minimum for ST1 volumes, see: EBS volumes

    • Volume Type: Throughput optimized HDD (ST1)

    • Do not delete on terminate or encrypt

    Click Next: Add Tags

    • You do not need to define any tags for this new EC2 instance, but you can if you would like.

    Click Next: Configure Security Group

    • For Assign a Security Group select Create a New Security Group

    • Security Group Name: Flow-SG

    • Description: Security group for Partek Flow server

    • Add the following rules:

  • SSH set Source to My IP (or the address range of your company or institution)

  • Click Add Rule:

  • Set Type to Custom TCP Rule

  • Set Port Range to 8080

  • Set Source to anywhere (0.0.0.0/0, ::/0)

    • Note: It is recommended to restrict Source to just those that need access to Partek Flow.

    Click Review and Launch

    • The AWS console will suggest this server not be booted from a magnetic volume. Since there is not a lot of IO on the root partition and reboots will be rare, choosing Continue with Magnetic will reduce costs. Choosing an SSD volume will not provide substantial benefit, but it is OK if one wishes to use an SSD volume. See the EBS Volumes section for more information.

    Click Launch

    Create a new keypair:

    • Name the keypair Flow-Key

    • Download this keypair, then run chmod 600 Flow-Key.pem (the downloaded key) so it can be used:

    $ chmod 600 Flow-Key.pem

    • Back up this key as one may lose access to the Partek Flow instance without it.

    The new instance will now boot. Use the left navigation bar and click on Instances. Click the pencil icon and assign the instance the name Partek Flow Server

    Enabling External Access to the Partek Flow Elastic Compute Cloud Instance

    The server should be assigned a fixed IP address. To do this, click on Elastic IPs on the left navigation menu from the EC2 Management Console.

    • Click Allocate New Address

    • Assign Scope to VPC

    • Click Allocate

    On the table containing the newly allocated elastic IP, right click and select Associate Address

    • For Instance, select the instance name Flow Test Server

    • For Private IP, select the one private IP available for the Partek Flow EC2 instance, then click Associate

    Note: For the remaining steps, we refer to the elastic ip as elastic.ip

    SSH to the new Flow-Server instance:

    $ ssh -i Flow-Testing.pem ubuntu@elastic.ip

    Attaching the Amazon Elastic Block Store Volume for Partek Flow Data Storage

    Attach, format, and move the ubuntu home directory onto the large ST1 elastic block store (EBS) volume. All Partek Flow data will live in this volume. Consult the AWS EC2 documentation for further information about attaching EBS volumes to your instance.

    Note: Under Volumes in the EC2 management console, inspect Attachment Information. It will likely list the large ST1 EBS volume as attached to /dev/sdb. Replace "s" with "xv" to find the device name to use for this mkfs command.

    $ sudo su
    $ mkfs -t ext4 /dev/xvdb

    Make a note of the newly created UUID for this volume
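    If you did not note it at format time, the UUID can be retrieved at any point with the standard blkid utility:

    $ sudo blkid /dev/xvdb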

    Copy the ubuntu home directory onto the EBS volume using a temporary mount point:

    $ mount -t ext4 /dev/xvdb /mnt/
    $ rsync -avr /home/ /mnt/
    $ umount /mnt/

    Make the EBS volume mount at system boot:

    Add the following to /etc/fstab: UUID=the-UUID-from-the-mkfs-command-above /home ext4 defaults,nofail 0 2

    $ mount -a

    Disconnect the ssh session, then log in again to make sure all is well

    Installing Partek Flow on a New Elastic Compute Cloud Instance

    Note: For additional information about Partek Flow installations, see our generic Installation Guide

    Before beginning, send the media access control (MAC) address of the EC2 instance to [email protected]. The output of ifconfig will suffice. Given this information, Partek employees will create a license for your AWS server. MAC addresses will remain the same after stopping and starting the Partek Flow EC2 instance. If the MAC address does change, let our licensing department know and we can add your license to our floating license server or suggest other workarounds.

    Install required packages for Partek Flow:

    $ sudo apt-get update
    $ sudo apt-get install software-properties-common
    $ sudo add-apt-repository -y ppa:openjdk-r/ppa
    $ sudo apt-get install openjdk-8-jdk python python-pip python-dev zlib1g-dev python-matplotlib r-base python-htseq libxml2-dev perl make gcc g++ zlib1g libbz2-1.0 libstdc++6 libgcc1 libncurses5 libsqlite3-0 libfreetype6 libpng12-0 zip unzip libgomp1 libxrender1 libxtst6 libxi6 debconf
    $ sudo pip install --upgrade pip && pip install --upgrade --upgrade-strategy eager --force-reinstall virtualenv numpy pysam cnvkit

    Install Partek Flow:

    $ cd (we will install Partek Flow to ubuntu's home directory)
    $ wget --content-disposition packages.partek.com/linux/flow-release
    $ unzip PartekFlow*.zip
    $ ./partek_flow/start_flow.sh

    Note: Make sure you are running as the ubuntu user.

    Partek Flow has finished loading when you see INFO: Server startup in xxxxxxx ms in the partek_flow/logs/catalina.out log file. This takes ~30 seconds.
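    For example, you can check for that line with:

    $ grep "Server startup" partek_flow/logs/catalina.out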

    Alternative: Install Flow with Docker. Our base packages are located here: https://hub.docker.com/r/partekinc/flow/tags

    Open Partek Flow with a web browser: http://elastic.ip:8080/

    Enter license key

    Set up the Partek Flow admin account

    Leave the library file directory at its default location and check that the free space listed for this directory is consistent with what was allocated for the ST1 EBS volume.

    Done! Partek Flow is ready to use.

    Partek Amazon Web Services Support

    After the EC2 instance is provisioned, we are happy to assist with setting up Partek Flow or address other issues you encounter with the usage of Partek Flow. The quickest way to receive help is to allow us remote access to your server by sending us Flow-Key.pem and amending the SSH rule for Flow-SG to include access from IP 97.84.41.194 (Partek HQ). We recommend sending us the Flow-Key.pem via secure means. The easiest way to do this is with the following command:

    $ curl -F "[email protected]" https://installfeedback.partek.com/fupload

    We also provide live assistance via GoTo meeting or TeamViewer if you are uncomfortable with us accessing your EC2 instance directly. Before contacting us, please run $ ./partek_flow/flowstatus.sh to send us logs and other information that will assist with your support request.

    General Recommendations

    With newer EC2 instance types, it is possible to change the instance type of an already deployed Partek Flow EC2 server. We recommend doing several rounds of benchmarks with production-sized workloads to evaluate whether the resources allocated to your Partek Flow server are sufficient. You may find that reducing the resources allocated to the Partek Flow server comes with significant cost savings, but can cause UI responsiveness and job run-times to reach unacceptable levels. Once you have found an instance type that works, you may wish to use reserved instance pricing, which is significantly cheaper than on-demand instance pricing. Reserved instances come with 1 or 3-year usage terms. Please see the EC2 Reserved Instance Marketplace to sell or purchase existing reserved instances at reduced rates.

    The network performance of the EC2 instance type becomes an important factor if the primary usage of Partek Flow is for alignment. For this use case, one will have to move copious amounts of data back (input fastq files) and forth (output bam files) between the Partek Flow server and the end users, thus it is important to have what AWS refers to as high network performance, which for most cases is around 1 Gb/s. If the focus is primarily on downstream analysis and visualization (e.g. the primary input files are ADAT) then network performance is less of a concern.

    We recommend HVM virtualization as we have not seen any performance impact from using it, and non-HVM instance types can come with significant deployment barriers.

    Make sure your instance is EBS optimized by default and you are not charged a surcharge for EBS optimization.

    T-class servers, although cheap, may slow responsiveness for the Partek Flow server and generally do not provide sufficient resources.

    We do not recommend placing any data on instance store volumes since all data is lost on those volumes after an instance stops. This is too risky as there are cases where user tasks can take up unexpected amounts of memory forcing a server stop/reboot.

    Amazon Web Services Instance Type Resources and Costs

    The values below were updated April 2017. The latest pricing and EC2 resource offerings can be found at http://www.ec2instances.info.

    Instance Type | Memory | Cores | EBS throughput | Network Performance | Monthly cost
    m4.large | 8.0 GB | 2 vCPUs | 56.25 MB/s M | Medium | $78.840
    m4.xlarge | 16.0 GB | 4 vCPUs | 93.75 MB/s H | High | $156.950
    m4.2xlarge | 32.0 GB | 8 vCPUs | 125 MB/s H | High | $314.630
    r4.large | 15.25 GB | 2 vCPUs | 50 MB/s H (10G int) | High (+10G interface) | $97.09
    r4.xlarge | 30.5 GB | 4 vCPUs | 100 MB/s H | High | $194.180
    r4.2xlarge | 61.0 GB | 8 vCPUs | 200 MB/s H (10G int) | High (+10G interface) | $388.360

    Single server recommendation: m4.xlarge or m4.2xlarge

    Network performance values for US-EAST-1 correspond to: Low ~ 50 Mb/s, Medium ~ 300 Mb/s, High ~ 1 Gb/s.

    Elastic Block Store Volumes

    Choice of a volume type and size:

    This is dependent on the type of workload. For most users, the Partek Flow server tasks are alignment-heavy, so we recommend a throughput optimized HDD (ST1) EBS volume since most aligner operations are sequential in nature. For workloads that focus primarily on downstream analysis, a general purpose SSD volume will suffice, but the costs are greater. For those who focus on alignment or host several users, the storage requirements can be high. ST1 EBS volumes have the following characteristics:

    Max throughput 500 MiB/s

    $0.045 per GB-month of provisioned storage ($22.5 per month for a 500 GB of storage).

    Note that EBS volumes can be grown or their performance characteristics changed. To minimize costs, start with a smaller EBS volume allocation of 0.5 - 2 TB, as most mature Partek Flow installations generate roughly this amount of data. When necessary, the EBS volume and the underlying file system can be grown on-line (making ext4 a good choice). Shrinking is also possible but may require the Partek Flow server to be offline.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
