# Build better training data

## Overview

- Identify the importance of preprocessing data sets
- Introduce the `recipes` package for preprocessing data
- Utilize `usemodels` to automatically construct code templates for common model types
- Construct workflows for machine learning
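As a rough preview of how these pieces fit together, the sketch below builds a recipe, pairs it with a model specification, and fits both as a single workflow. It is a minimal illustration rather than the class exercise: the built-in `mtcars` data, the choice of linear regression, the specific preprocessing steps, and every object name (`car_split`, `car_rec`, `lm_spec`, `car_wf`) are assumptions made for the example.

```r
library(tidymodels)  # attaches recipes, parsnip, workflows, and rsample, among others

set.seed(123)

# Hold out a test set so preprocessing is estimated from training data only
# (mtcars is a stand-in data set chosen for illustration)
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test <- testing(car_split)

# Recipe: declare the preprocessing steps; nothing is estimated yet
car_rec <- recipe(mpg ~ ., data = car_train) %>%
  step_zv(all_predictors()) %>%             # drop zero-variance predictors
  step_normalize(all_numeric_predictors())  # center and scale numeric predictors

# Model specification: ordinary linear regression via the "lm" engine
lm_spec <- linear_reg() %>%
  set_engine("lm")

# Workflow: bundle the recipe and the model so they are fit together
car_wf <- workflow() %>%
  add_recipe(car_rec) %>%
  add_model(lm_spec)

car_fit <- fit(car_wf, data = car_train)

# Predicting on new data reapplies the preprocessing learned from the training set
car_preds <- predict(car_fit, new_data = car_test)
```

Bundling the recipe with the model means the centering and scaling values come from the training set alone and are reused at prediction time, which helps avoid leaking information from the test set. For a starting template, a call such as `usemodels::use_glmnet(mpg ~ ., data = mtcars)` prints comparable recipe, model, and workflow code to the console for you to adapt.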

## Before class

- Read Preprocess your data

This is not a math/stats class. In class we will **briefly** summarize how these methods work and spend the bulk of our time on estimating and interpreting these models. That said, you should have some understanding of the mathematical underpinnings of statistical learning methods prior to implementing them yourselves. See below for some recommended readings:

- Chapter 5 in *An Introduction to Statistical Learning*
- Chapters 2-3 in *Hands-On Machine Learning with R*
- *Feature Engineering and Selection: A Practical Approach for Predictive Models*

## Class materials

Run the code below in your console to download the exercises for today.

```
usethis::use_course("cis-ds/machine-learning")
```

Materials derived from *Tidymodels, Virtually: An Introduction to Machine Learning with Tidymodels* by Alison Hill.

### Additional readings

- `caret`
- `tidymodels`
- *Tidy Modeling with R*: a book-length introduction to tidy modeling in R
- ISLR `tidymodels` Labs: a complement to the 2nd edition of *An Introduction to Statistical Learning*, with the labs translated to use the `tidymodels` set of packages