Build better training data

Overview

  • Identify the importance of preprocessing data sets
  • Introduce the recipes package for preprocessing data
  • Utilize usemodels to automatically construct code templates for common model types
  • Construct workflows for machine learning

Before class

This is not a math/stats class. In class we will briefly summarize how these methods work and spend the bulk of our time on estimating and interpreting these models. That said, you should have some understanding of the mathematical underpinnings of statistical learning methods prior to implementing them yourselves. See below for some recommended readings:

Class materials

Run the code below in your console to download the exercises for today.

usethis::use_course("cis-ds/machine-learning")

Additional readings

What you need to do after class

Benjamin Soltoff
Lecturer in Information Science