High Performance Computing

Process research data and perform complex calculations at high speeds
High performance computing (HPC) is all about scale and speed. HPC enables you to perform calculations many times faster than on a standard computer. This not only saves time, it also makes complex problems in scientific research more transparent. This can lead to new opportunities, lower costs and higher-quality research.

With HPC you can solve computational problems for which a single computer is not powerful enough. An HPC cluster runs on Linux, an operating system that is less widely known than, for example, Windows. A cluster consists of many individual computers, called nodes, that are connected to each other via a high-speed network. The nodes in a cluster work in parallel, which increases processing speed and delivers high-performance computing. Whether you work with human subject data or with non-human-subject data, there is always an HPC cluster available that fits your research.

Depending on the cluster, you can gain access to bulk data storage, including back-up facilities, so that your data is stored safely. Furthermore, several analysis pipelines are available and you can share data securely and conveniently.

What do the HPC clusters of the UMCG offer?

  • Solving computational problems for which a single PC is not powerful enough
  • Temporary storage (tmp) to do high performance computing
  • Access to bulk data storage (Permanent storage, prm) including back-up facilities
  • Support for bioinformatics analyses
  • Several analysis pipelines are available (for DNA, RNA, Global Screening Array, microbiome and metagenomics analysis)
  • Scheduler system: calculation jobs are scheduled and run automatically when it is your turn and capacity is available
  • Users can be assigned specific rights or roles
  • Data sharing is possible

Please note: Linux knowledge is required to work with the HPC clusters of the UMCG, because you work on the command line.

Before you can use one of the HPC clusters of the UMCG

  • You must have successfully completed the associated cluster course or have had an intake interview with the Genomics Coordination Center (GCC)
  • You must be approved by the owner of a group to get access to that group and the datasets inside it, or a new group has to be created
  • To use one of the HPC clusters, knowledge of bioinformatics and of working on the command line is required. To gain access to the HPC facilities on the cluster, you must have successfully completed the associated cluster course or have had an intake interview with the GCC. Depending on interest, an introduction course is given about twice a year together with RUG-CIT. Courses are not scheduled at fixed intervals but based on demand: send us an email if you are interested in the next edition.

  • Work on the HPC clusters is organised in groups. You must be approved by the owner of a group to get access to the group and the datasets inside it. Principal Investigators can ask for a new group to be created for their research project(s). Each group has one or more owners, one or more data managers and one or more members.

HPC clusters

  • Gearshift is the High Performance Computing cluster of the UMCG for human subject data. This cluster allows you to analyze datasets containing large and/or complex data and to perform calculations that cannot be done on a regular computer, while complying with the rules and regulations regarding human subject data. The Gearshift cluster works with a scheduler system, which means that calculation jobs are scheduled and run automatically when it is your turn and capacity is available.

    The key features of the Gearshift cluster

    • Linux OS: CentOS 7.x with Spacewalk for package distribution/management.
    • Completely virtualised on an OpenStack cloud
    • Deployment of HPC cluster with Ansible playbooks under version control in a Git repo: league-of-robots
    • Job scheduling: Slurm Workload Manager
    • Account management: local admin users and groups are provisioned using Ansible; regular users and groups are kept in a dedicated LDAP for this cluster and are provisioned either with an Ansible playbook or using information from a federated AAIM.
    • Module system: Lmod
    • Deployment of (Bioinformatics) software using EasyBuild
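To illustrate how the scheduler (Slurm) and the module system (Lmod) listed above fit together, here is a minimal sketch in Python that generates the text of a batch job script. It is an illustration only: the resource values are placeholders, the module name `ExampleTool/1.0` is hypothetical, and the actual limits and available software on Gearshift are not described on this page, so check the cluster documentation before submitting anything.

```python
def make_sbatch_script(job_name, command, cpus=1, mem_gb=4,
                       time_limit="01:00:00", modules=()):
    """Return the text of a Slurm batch script as a string.

    All default values are placeholders, not Gearshift settings.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",          # wall-clock limit, HH:MM:SS
        f"#SBATCH --output={job_name}-%j.out",   # %j expands to the job ID
        "",
        "set -e  # stop at the first failing command",
    ]
    # Load software through the module system before running the command.
    lines += [f"module load {name}" for name in modules]
    lines.append(command)
    return "\n".join(lines) + "\n"


script = make_sbatch_script(
    job_name="demo",
    command='echo "Running on $(hostname)"',
    cpus=2,
    mem_gb=8,
    time_limit="00:10:00",
    modules=["ExampleTool/1.0"],  # hypothetical module name
)
print(script)
```

Saving the printed text as, for example, `demo.sh` and running `sbatch demo.sh` on a login node would hand the job to the scheduler, which starts it when it is your turn and capacity is available.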

    Cluster Components
    Gearshift consists of various types of servers and storage systems. Users can access some of these directly, whereas others cannot be accessed directly.

    Costs
    The costs of using the data storage and the Gearshift HPC cluster are determined by the amount of data you store. If you want to use the data storage, you pay for permanent storage (prm); automatic backup to tape is included in the price. If you want to perform HPC calculations, you also pay for temporary storage (tmp); this includes use of the compute facilities. For both permanent and temporary storage, the fee is 250 euros per terabyte (TB) per year. So if you want to use both storage and compute facilities, you pay 500 euros per TB per year. The minimum is 1 TB for 1 year.
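The pricing rule above can be sketched as a small Python calculation. Two points are assumptions and not stated on this page: that the 1 TB / 1 year minimum applies to each storage type separately, and that amounts above the minimum are billed proportionally. The function names are made up for this example.

```python
RATE_EUR_PER_TB_YEAR = 250  # fee per TB per year, same for prm and tmp
MIN_TB = 1                  # minimum billable amount (assumed per storage type)
MIN_YEARS = 1               # minimum billable period


def storage_cost(tb, years=1):
    """Cost in euros of one storage type; 0 TB means the type is not used."""
    if tb < 0 or years < 0:
        raise ValueError("tb and years must be non-negative")
    if tb == 0:
        return 0
    return max(tb, MIN_TB) * max(years, MIN_YEARS) * RATE_EUR_PER_TB_YEAR


def gearshift_cost(prm_tb, tmp_tb=0, years=1):
    """Total cost in euros: permanent storage plus optional temporary storage."""
    return storage_cost(prm_tb, years) + storage_cost(tmp_tb, years)


print(gearshift_cost(1, 1))           # 1 TB prm + 1 TB tmp for 1 year -> 500
print(gearshift_cost(4, 2, years=3))  # (4 + 2) TB for 3 years -> 4500
```

The first call reproduces the 500 euros per TB per year quoted above for combined storage and compute use.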

    Training
    » Information about beginners' courses

    Provided by
    Genomics Coordination Center

    For whom
    UMCG and UG researchers and/or research initiated by UMCG/UG researchers

  • The Peregrine cluster is a High Performance Computing (HPC) cluster that allows you to analyse datasets containing large and/or complex data. It allows you to perform calculations that cannot be done on a regular computer, while complying with the rules and regulations that apply to non-human-subject data.

    Important: this cluster is offered by the University of Groningen. For more information about this HPC cluster, please contact the service desk of the Center for Information Technology of the University of Groningen.

    What does it offer?

    • Solving computational problems for which a single PC is not powerful enough
    • Three variants with different numbers of nodes (each node has an internal disk of 1 TB)
    • Attached to the cluster is 463 TB of disk space
    • Suited for tasks using more than one machine

    Costs
    Free

    Training
    Participate in the monthly course to learn the basics needed for using the Peregrine cluster
    » More information and request form

    Provided by
    University of Groningen

Contact

Genomics Coordination Center

Dept. of Genetics - UMCG
5th floor of buildings 3211 and 3226
Antonius Deusinglaan 1
9713 AV Groningen
The Netherlands