This lesson is being piloted (Beta version)

C3DIS 2019 - Introduction to Machine Learning

As CSIRO embraces the transition into the technological age, it has spawned a variety of digital initiatives designed to accelerate researchers’ application of modern digital advances to their technical domains. One such initiative is the CSIRO Data School program, which has been designed to equip scientists with the tools necessary to apply defensible, reproducible data analytics to unique scientific datasets. This workshop will be built around a small part of the Data School program, designed to introduce participants to the opportunities and challenges offered by the application of modern Machine Learning (ML) techniques.

Our C3DIS offering will first introduce ML and demystify the associated hype, provide a light overview of some useful ML approaches and, most importantly, equip attendees with the ability to verify and validate the results produced by their ML pipeline. We will highlight some common difficulties with real-world datasets and show how to identify and rectify them.

The focus of the workshop will be on applications to scientific datasets, with examples including image data, time series data and regression problems. The workshop uses Python as its delivery vehicle, so some familiarity with the language will be assumed. We will be using Google Colaboratory as a compute environment, so attendees are only required to bring a laptop with an internet connection. There will be limited support on offer should attendees wish to set up their own local environments.

Prerequisites

A Google account and a laptop.

Schedule

       Setup                                               Download files required for the lesson
09:30  1. Introduction                                     What is Machine Learning?
                                                           Why would you use it?
                                                           When should you use it and when should you avoid it?
11:00  2. Coffee Break                                     Break
11:30  3. Machine Learning Overview                        How should I structure an ML project?
13:00  4. Lunch Break                                      Break
14:00  5. Machine Learning Metrics for Performance         How do I know if my model is correct?
15:30  6. Machine Learning Metrics for Performance         How do I know if my model is correct?
17:00  7. Coffee Break                                     Break
17:30  8. Machine Learning Model Selection and Validation  How can we be sure that our model is performing as well as we think?
                                                           What are some techniques to overcome common ML issues?
18:30  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.