GPUs

Overview

Teaching: 35 min
Exercises: 25 min
Questions
  • What are GPUs and how do we access them?

  • How can we use a GPU with Numba?

  • How can we use a GPU in Pandas, Numpy or SciKit Learn?

Objectives
  • Understand the difference between a GPU and a CPU and its performance implications

  • Apply Numba to use a GPU

  • Understand that there are GPU enabled replacements for many popular Python libraries

  • Recall that NVIDIA GPUs can be programmed directly in CUDA, but this is very low-level programming

What are GPUs and why should we use them?

GPUs (Graphics Processing Units) contain large numbers of very simple processing cores. Each core is far less capable than a CPU core, but together they can work through highly parallel workloads, such as array operations and machine learning, much faster than a CPU.

How can you access a GPU if your PC doesn’t have one?

Many laptops and desktops won’t have a very powerful GPU, so instead we’ll want to use HPC or cloud systems to access one. If you don’t have access to any services which offer a GPU then you can use Google Colab (https://colab.research.google.com). This offers a Jupyter notebook interface with free GPUs, although they aren’t very powerful. You can also pay for Google Colab to get access to faster GPUs.

Orchid

JASMIN has a cluster called Orchid which has 16 nodes with 72 NVIDIA A100 GPUs between them. These are accessed via the Slurm batch scheduler. For more experimental work there are some A100 GPUs attached to the JASMIN notebook service. To use these you must be granted access to Orchid and select the GPU option when connecting.

Checking what GPUs are available to us

Systems with NVIDIA GPUs usually have a command called nvidia-smi installed, which will tell us some information about the GPUs attached to the system. We can invoke this either from a Jupyter terminal or in a notebook cell with the ! prefix.

nvidia-smi
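In a notebook cell the same command can be run by prefixing it with !:

!nvidia-smi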

On the JASMIN notebooks service this will return something similar to this.

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.163.01             Driver Version: 550.163.01     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:41:00.0 Off |                   On |
| N/A   34C    P0             88W /  400W |                  N/A   |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          Off |   00000000:C1:00.0 Off |                   On |
| N/A   30C    P0             85W /  400W |                  N/A   |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                            |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  0    5   0   0  |              13MiB /  9856MiB    | 14      0 |  1   0    1    0    0 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  1    5   0   0  |              13MiB /  9856MiB    | 14      0 |  1   0    1    0    0 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

There are two key parts to the output here. The top part shows we have two NVIDIA A100 cards with 40GB of GPU RAM each. However we don’t have exclusive use of these: they have been partitioned into smaller virtual GPUs known as MIGs (Multi-Instance GPUs). Each of these only has 16GB of GPU RAM and we’re restricted to just 10GB of that. This is still more than most desktop GPUs and is sufficient for many tasks. Where we might need more memory we will have to move our code over to a regular Python script running on Orchid’s batch system.
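For reference, submitting such a script as a batch job on Orchid would look something like the sketch below. This is only an illustration: the partition and account names, the resource requests and the script name (my_gpu_script.py) are assumptions here, so check the JASMIN Orchid documentation for the exact values to use.

#!/bin/bash
#SBATCH --partition=orchid   # assumed partition name - check the JASMIN Orchid docs
#SBATCH --account=orchid     # assumed account name granted with Orchid access
#SBATCH --gres=gpu:1         # request a single GPU
#SBATCH --time=00:10:00

# activate the environment containing cupy/numba (the exact command depends on your setup)
source activate esces

python my_gpu_script.py      # hypothetical script containing the GPU code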

Checking the GPUs available to use from Python

The numba library provides an interface to CUDA, which is NVIDIA’s low-level library for programming its GPUs. To get a list of the available GPUs we can just call cuda.detect().

from numba import cuda

cuda.detect()

Check what GPUs you have access to

Ensure that you have CUDA installed; this can be installed by adding the cupy and cudatoolkit packages to your Conda/Mamba environment. Use Numba’s cuda module to check what version of CUDA you have installed and what GPUs you have available.

Solution

mamba install -p ~/.conda/envs/esces cupy cudatoolkit

from numba import cuda
print(cuda.runtime.get_version())  # the CUDA runtime version as a (major, minor) tuple
cuda.detect()                      # list the GPUs available to us

Using GPUs

NVIDIA have GPU-accelerated, drop-in replacements for Pandas, NumPy and scikit-learn. The replacement for NumPy and SciPy is known as CuPy, while the Pandas and scikit-learn equivalents are called cuDF and cuML.
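As a quick illustration of what “drop-in” means, here is a minimal sketch using cuDF in place of Pandas. This assumes the cudf package is installed (for example from the RAPIDS conda channels); it is not part of the environment used in this lesson.

import cudf

# build a small DataFrame directly on the GPU and compute a grouped mean
df = cudf.DataFrame({"value": [1.0, 2.0, 3.0, 4.0],
                     "group": ["a", "a", "b", "b"]})
print(df.groupby("group")["value"].mean())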

Let’s do a calculation using NumPy.

import numpy as np
a = np.random.random(100_000_000)
result_np = np.mean(a)

Now let’s try to do the same thing with CuPy. We’ll use the same array we just created and copy it to the GPU: CuPy’s asarray function takes in a NumPy array and converts it to a CuPy array on the GPU.

import cupy as cp
b = cp.asarray(a)
result_cp = cp.mean(b)

Let’s time how long this takes. For the NumPy code we can use %time or %timeit. Unfortunately %time and %timeit don’t work properly with GPUs, as calling a GPU function returns immediately while the code continues to run on the GPU. So we’ll have to take a different approach to measuring the time taken and use CuPy’s built-in profiler, which includes a function called benchmark. %timeit will automatically decide how many runs to do, defaulting to 7 repeats, whereas benchmark needs to be told how many times to repeat with the n_repeat parameter.

from cupyx.profiler import benchmark
gpu_times = benchmark(cp.mean, (b,), n_repeat=7)
print(gpu_times)

To ensure we make a comparable run using %timeit we can use its -n and -r options to control how many times it runs too.

%timeit -n 1 -r 7 result_np = np.mean(a)

On the JASMIN notebook service this gives the following output for CuPy:

mean                :    CPU:    54.157 us   +/-  3.656 (min:    50.439 / max:    61.728) us     GPU-0:  2158.267 us   +/-  3.639 (min:  2154.080 / max:  2165.024) us

and for NumPy:

66.2 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 7 loops each)

So we have used about 2ms of GPU time (plus 54us of CPU time) to take the mean of 100,000,000 numbers on the GPU, compared with 66ms on the CPU, so that’s roughly a 33-fold speedup! However, this does not include the time spent copying the array to the GPU.

Measuring the time taken to copy data to the GPU memory

As previously mentioned we can’t use %timeit or %time to measure how long GPU operations take. We could use benchmark to do this, but benchmark discards the function’s return value, so we would need a subsequent call to asarray to actually get the copied array. An alternative is to record the system clock time before and after the asarray call, but for this to work we must synchronize with the GPU after the asarray call to ensure it has really finished.

import time

t0 = time.time()
b = cp.asarray(a)  # copy the NumPy array to the GPU
cp.cuda.stream.get_current_stream().synchronize()  # wait for the copy to finish
print(str((time.time() - t0) * 1_000_000) + "us")

On JASMIN this takes around 100,000us, or 100ms. So we need to add this to the computation time, which was just 2ms. As the CPU version only took 66ms, it is actually faster to do this calculation on the CPU. But this is a very simple example where we’ve only done one very simple operation on quite a large amount of data.

Create random numbers with CuPy

So far we have created random numbers using NumPy on the CPU and copied these to the GPU. A more efficient way to do this might be to make the random numbers on the GPU. Adjust the code to use CuPy to create 100,000,000 random numbers directly on the GPU. Use the time library (or cupyx’s benchmark) to measure how long this takes. Is this quicker than making the random numbers on the CPU?

Solution

import cupy as cp
import time

t0 = time.time()
b = cp.random.random(100_000_000)  # generate the random numbers directly on the GPU
cp.cuda.stream.get_current_stream().synchronize()  # wait for the GPU to finish
print(str((time.time() - t0) * 1_000) + "ms")

# alternative using benchmark
from cupyx.profiler import benchmark
print(benchmark(cp.random.random, (100_000_000,), n_repeat=7))

cp.mean(b)

Using GPUs with Numba

Numba code can be converted to run on a GPU using the @cuda.jit decorator, which is similar to the @jit decorator we saw earlier on. However, there are a few alterations the code might need first due to the way GPUs operate. Firstly, the functions we run on the GPU (known as kernels) can’t return anything; instead we must pass an extra parameter containing an array into which we will save any results.

Here is an example similar to the function we JIT compiled for the CPU earlier on.

@cuda.jit
def sum_arr_example(a, r):
    # sum the diagonal elements, writing the result into the output array r
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i, i]
    r[0] = total

a = cp.arange(10_000).reshape(100, 100)
result_gpu = cuda.device_array((1,), np.float64)  # output array allocated on the GPU

sum_arr_example[1, 1](a, result_gpu)  # launch with 1 block of 1 thread

result_host = result_gpu.copy_to_host()  # copy the result back to the CPU
print(result_host)
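The example above launches a single thread ([1, 1]), so it doesn’t actually exploit the GPU’s parallelism. To do that we normally launch many threads and let each one handle a single element, using cuda.grid to find each thread’s global index. The following is a minimal sketch along those lines; it is not part of the original lesson, and the kernel and array names are just illustrative.

@cuda.jit
def add_one(a, out):
    i = cuda.grid(1)          # this thread's global index
    if i < a.shape[0]:        # guard against the extra threads in the last block
        out[i] = a[i] + 1.0

a_host = np.arange(1_000_000, dtype=np.float64)
a_dev = cuda.to_device(a_host)             # copy the input to the GPU
out_dev = cuda.device_array_like(a_dev)    # allocate the output on the GPU

threads_per_block = 256
blocks = (a_host.size + threads_per_block - 1) // threads_per_block
add_one[blocks, threads_per_block](a_dev, out_dev)

print(out_dev.copy_to_host()[:5])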

Further Reading

This has only been a very quick introduction to GPUs, but hopefully it has shown you some of the potential they offer and some simple ways to use them.

Key Points

  • GPUs are Graphics Processing Units; they have large numbers of very simple processing cores and are suited to some parallel tasks like machine learning and array operations

  • Many laptops and desktops won’t have a very powerful GPU, so instead we’ll want to use HPC or cloud systems to access one.

  • Google’s Colab provides free access to GPUs with a Jupyter notebook interface.

  • Numba can use GPUs with minor modifications to the code.

  • NVIDIA have drop-in replacements for Pandas, NumPy and scikit-learn that are GPU accelerated.

  • For a GPU to access data it must be copied into the GPU’s memory. This can sometimes be a major bottleneck to GPU operations.