Parallel Programming

Introduction

Parallelism is the running of multiple tasks concurrently, or the splitting of a monolithic task into a series of elements that can be run concurrently.

Use on HPC Midlands Plus

Where possible you should use parallelism to exploit the full power of the HPC Midlands Plus system.

Where parallel versions of the software (packages) you use exist, you should use them, but benchmark them under a range of conditions (types of parallelism, number of nodes) with realistic test cases to ensure that you get the best value from your time allocation. For example, if a job takes 1 hour on one node but 45 minutes on two, the two-node run returns your results slightly sooner but consumes 1.5 node-hours rather than 1. You would get more total computation done by running two single-node jobs side by side on those two nodes.
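
The arithmetic behind this example can be sketched in a few lines of Python. The function names are illustrative only, and the timings are the example figures quoted above; substitute your own benchmark results.

```python
# Compare the cost of benchmark runs in node-hours.
# Function names are illustrative; the timings are the example figures above.

def node_hours(nodes, wall_hours):
    """Node-hours consumed by a job: nodes held multiplied by wall-clock time."""
    return nodes * wall_hours


def parallel_efficiency(t_one_node, t_n_nodes, nodes):
    """Speedup over the one-node run divided by the number of nodes used."""
    return (t_one_node / t_n_nodes) / nodes


if __name__ == "__main__":
    print("1 node :", node_hours(1, 1.0), "node-hours")
    print("2 nodes:", node_hours(2, 0.75), "node-hours")
    print("efficiency on 2 nodes:", round(parallel_efficiency(1.0, 0.75, 2), 2))
```

With these figures the two-node run costs 1.5 node-hours against 1.0 for the single-node run, a parallel efficiency of roughly 67%.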

Forms of Parallelism

Trivial

This is essentially running many single-core jobs, each executing the same program on different data, spread across multiple cores and/or nodes. It is a form of SIMD – Single Instruction, Multiple Data. Care has to be taken on the HPC Midlands Plus system to ensure that you use all the cores on a node, as you are charged for whole nodes (the number of nodes times the number of cores per node), irrespective of how many cores you actually use.
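
One way to keep every core on a node busy with independent single-core tasks is sketched below using Python's standard multiprocessing module. This is a minimal illustration rather than a method prescribed for HPC Midlands Plus: process_one and the input values are hypothetical stand-ins for your own program and data, and os.cpu_count() is assumed to report the cores available to the job.

```python
import os
from multiprocessing import Pool


def process_one(value):
    """Hypothetical stand-in for one single-core task on one piece of data."""
    return sum(i * i for i in range(value))


def run_all(inputs, workers=None):
    """Apply process_one to every input, one worker process per core by default."""
    workers = workers or os.cpu_count()
    with Pool(processes=workers) as pool:
        # map hands the inputs out to the workers and returns results in input order
        return pool.map(process_one, inputs)


if __name__ == "__main__":
    inputs = [10_000, 20_000, 30_000, 40_000]  # hypothetical data set
    print(run_all(inputs))
```

If there are fewer inputs than cores, some cores sit idle while still being charged for, so it is worth batching enough tasks into each job to fill the node.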

Single Node

Some forms of parallelism work best on a single node:

Multiple Node

This is normally supported via MPI, via

Mixed

Several of the above forms may be mixed to exploit the full performance of the system.

Workflow

In this form of parallelism, different processors may perform different tasks that contribute to the overall job. This is a form of MIMD – Multiple Instruction, Multiple Data. The job scheduler supports this.
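
As a single-node illustration of the idea (not of how the job scheduler itself expresses workflows), the sketch below runs two different tasks in separate processes and then combines their results. All function names and data are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor


def count_words(text):
    """Task A: count the words in a piece of text."""
    return len(text.split())


def longest_word(text):
    """Task B: find the longest word in the same text."""
    return max(text.split(), key=len)


def summarise(n_words, longest):
    """Final step: combine the outputs of the two independent tasks."""
    return f"{n_words} words, longest is '{longest}'"


if __name__ == "__main__":
    text = "different processors perform different tasks"  # hypothetical input
    with ProcessPoolExecutor(max_workers=2) as pool:
        # Two different functions are submitted and may run at the same time.
        future_a = pool.submit(count_words, text)
        future_b = pool.submit(longest_word, text)
        print(summarise(future_a.result(), future_b.result()))
```

The two tasks do not depend on each other, so they can proceed concurrently; only the final combining step has to wait for both.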