Advanced Python for Environmental Scientists

Schedule

	Setup	Download files required for the lesson
00:00	1. Introduction	What are some of the common computing terms that I might encounter? How can I use Python to work with large datasets? How do I connect to a high-performance computing system to run my code?
00:35	2. Coffee Break	Break
00:50	3. Dataset Parallelism	How do we apply the same command to every file or parameter in a dataset?
01:15	4. Lunch Break	Break
02:15	5. Parallelisation with NumPy and Numba	How can we measure the performance of our code? How can we improve performance by using NumPy array operations instead of loops? How can we improve performance by using Numba?
03:35	6. Coffee Break	Break
03:50	7. Working with data in Xarray	How do I load data with Xarray? How does Xarray index data? How do I apply operations to the whole or part of an array? How do I work with time series data in Xarray? How do I visualise data from Xarray?
05:20	8. Plotting Geospatial Data with Cartopy	How do I plot data on a map using Cartopy?
06:10	9. Coffee Break	Break
06:25	10. Parallelising with Dask	How do we set up and monitor a Dask cluster? How do we parallelise Python code with Dask? How do we use Dask with Xarray?
07:45	11. Lunch Break	Break
08:45	12. Storing and Accessing Data in Parallelism-Friendly Formats	How is the performance of data access impacted by bandwidth and latency? How can we use an object store to store data that is accessible over the internet? How do we access data in an object store using Xarray?
10:05	13. Coffee Break	Break
10:20	14. GPUs	What are GPUs and how do we access them? How can we use a GPU with Numba? How can we use a GPU with pandas, NumPy or scikit-learn?
11:20	Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.