Build better training data
- Identify the importance of preprocessing data sets
- Introduce the recipes package for preprocessing data
- Use usemodels to automatically construct code templates for common model types
- Construct workflows for machine learning
- Read Preprocess your data
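As a rough illustration of how the pieces named above fit together, here is a minimal sketch that preprocesses data with a recipe and bundles it with a model in a workflow. The data set (the built-in mtcars), the choice of preprocessing steps, and all object names (car_split, car_rec, lm_spec, car_wf) are assumptions made for this example only, not the exercises used in class.

```r
library(tidymodels)  # attaches recipes, parsnip, workflows, rsample, and others

set.seed(123)

# Split the data so preprocessing is learned from the training set only
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# Recipe: declare outcome and predictors, then list preprocessing steps
car_rec <- recipe(mpg ~ ., data = car_train) %>%
  step_zv(all_predictors()) %>%               # drop zero-variance predictors
  step_normalize(all_numeric_predictors())    # center and scale numeric predictors

# Model specification: ordinary linear regression via lm()
lm_spec <- linear_reg() %>%
  set_engine("lm")

# Workflow: bundle the recipe and the model into one object
car_wf <- workflow() %>%
  add_recipe(car_rec) %>%
  add_model(lm_spec)

# Fitting the workflow estimates the recipe steps and the model together
car_fit <- fit(car_wf, data = car_train)

# Predicting applies the same trained preprocessing to the new data
car_preds <- predict(car_fit, new_data = car_test)
```

Because the recipe travels inside the workflow, the test set is scaled with the means and standard deviations estimated from the training set, which avoids leaking information from the held-out data. If usemodels is installed, a call such as usemodels::use_glmnet(mpg ~ ., data = mtcars) prints a comparable recipe/model/workflow template that can be pasted in and adapted.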
This is not a math/stats class. In class we will briefly summarize how these methods work and spend the bulk of our time on estimating and interpreting these models. That said, you should have some understanding of the mathematical underpinnings of statistical learning methods prior to implementing them yourselves. See below for some recommended readings:
- Chapter 5 in An Introduction to Statistical Learning
- Chapters 2-3 in Hands-On Machine Learning with R
- Feature Engineering and Selection: A Practical Approach for Predictive Models
Run the code below in your console to download the exercises for today.
Materials derived from Tidymodels, Virtually: An Introduction to Machine Learning with Tidymodels by Alison Hill.
- Tidy Modeling with R - a book-length introduction to tidy modeling in R
- tidymodels Labs - a complement to the 2nd edition of An Introduction to Statistical Learning, with the labs translated to use the tidymodels set of packages