Advanced Python for Environmental Scientists

Schedule

Setup
    Download files required for the lesson
00:00  1. Introduction
    What are some of the common computing terms that I might encounter?
    How can I use Python to work with large datasets?
    How do I connect to a high-performance computing system to run my code?
00:35  2. Coffee Break
    Break
00:50  3. Dataset Parallelism
    How do we apply the same command to every file or parameter in a dataset?
01:15  4. Lunch Break
    Break
02:15  5. Parallelisation with Numpy and Numba
    How can we measure the performance of our code?
    How can we improve performance by using Numpy array operations instead of loops?
    How can we improve performance by using Numba?
03:35  6. Coffee Break
    Break
03:50  7. Working with data in Xarray
    How do I load data with Xarray?
    How does Xarray index data?
    How do I apply operations to the whole or part of an array?
    How do I work with time series data in Xarray?
    How do I visualise data from Xarray?
05:20  8. Plotting Geospatial Data with Cartopy
    How do I plot data on a map using Cartopy?
06:10  9. Coffee Break
    Break
06:25  10. Parallelising with Dask
    How do we set up and monitor a Dask cluster?
    How do we parallelise Python code with Dask?
    How do we use Dask with Xarray?
07:45  11. Lunch Break
    Break
08:45  12. Storing and Accessing Data in Parallelism Friendly Formats
    How is the performance of data access impacted by bandwidth and latency?
    How can we use an object store to store data that is accessible over the internet?
    How do we access data in an object store using Xarray?
10:05  13. Coffee Break
    Break
10:20  14. GPUs
    What are GPUs and how do we access them?
    How can we use a GPU with Numba?
    How can we use a GPU in Pandas, Numpy or scikit-learn?
11:20  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.