This chapter contains the following main sections:
- Problem Description and Objectives
- Data Description
- Loading the Data into R
- Data Visualization and Summarization
- Unknown Values
  - Removing the Observations with Unknown Values
  - Filling in the Unknowns with the Most Frequent Values
  - Filling in the Unknown Values by Exploring Correlations
  - Filling in the Unknown Values by Exploring Similarities between Cases
- Obtaining Prediction Models
  - Multiple Linear Regression
  - Regression Trees
- Model Evaluation and Selection
- Predictions for the Seven Algae
- Summary