R

Car Sales. Consider the data on used cars (mlba::ToyotaCorolla ) with 1436 recor

By admin

Car Sales. Consider the data on used cars (mlba::ToyotaCorolla) with 1436 records and details on 38 variables, including Price, Age, KM, HP, and other specifications. The goal is to predict the price of a used Toyota Corolla based on its specifications.
Use predictors Age_08_04, KM, Fuel_Type, HP, Automatic, Doors, Quarterly_Tax, Mfr_Guarantee, Guarantee_Period, Airco, Automatic_airco, CD_Player, Powered_Windows, Sport_Model, and Tow_Bar.
To ensure everyone gets the same results, use the following code to convert categorical predictors to dummies, create training and holdout data sets, and normalize the training set and holdout set. Note that the holdout set is normalized using the ranges learned from the training set, not its own.
# load the data and preprocess
library(dplyr)  # provides %>% and mutate()
library(caret)  # provides createDataPartition() and preProcess()
toyota.df <- mlba::ToyotaCorolla %>%
  mutate(
    Fuel_Type_CNG = ifelse(Fuel_Type == "CNG", 1, 0),
    Fuel_Type_Diesel = ifelse(Fuel_Type == "Diesel", 1, 0)
  )

# partition
set.seed(1)
idx <- createDataPartition(toyota.df$Price, p=0.6, list=FALSE)
train.df <- toyota.df[idx, ]
holdout.df <- toyota.df[-idx, ]

# Normalize the dataset. Use the training set to determine the normalization.
normalizer <- preProcess(train.df, method="range")
train.norm.df <- predict(normalizer, train.df)
holdout.norm.df <- predict(normalizer, holdout.df)

Fit a neural network model to the data. Use a single hidden layer with two nodes. Record the RMS error for the training data and the holdout data. Repeat the process, changing the number of hidden layers and nodes to a single layer with 5 nodes, and then to two layers with 5 nodes in each layer.
What happens to the RMS error for the training data as the number of layers and nodes increases?
What happens to the RMS error for the holdout data?
Comment on the appropriate number of layers and nodes for this application.
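One possible way to organize the three fits is sketched below. It is not a reference solution. It assumes the neuralnet package is installed, which the exercise does not name. The helpers rmse, unscale, and fit_and_score are hypothetical names introduced here. Using linear.output = TRUE and reporting RMS error in original price units (by undoing the range scaling with the training minimum and maximum) are also assumptions. Fuel_Type enters through the two dummies created above, which treats the remaining fuel type as the reference level.

```r
# Root-mean-squared error between two numeric vectors.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Undo method = "range" scaling: map values from [0, 1] back to the
# original units using the training-set minimum (lo) and maximum (hi).
unscale <- function(x, lo, hi) x * (hi - lo) + lo

# Predictors from the exercise, with Fuel_Type replaced by its two dummies.
predictors <- c("Age_08_04", "KM", "Fuel_Type_CNG", "Fuel_Type_Diesel", "HP",
                "Automatic", "Doors", "Quarterly_Tax", "Mfr_Guarantee",
                "Guarantee_Period", "Airco", "Automatic_airco", "CD_Player",
                "Powered_Windows", "Sport_Model", "Tow_Bar")

# Fit one network and return training and holdout RMSE in price units.
# `hidden` is the layer layout, e.g. 2, 5, or c(5, 5).
# Assumption: the neuralnet package is available; it is only needed
# when this function is called.
fit_and_score <- function(hidden, train.norm, holdout.norm,
                          train.raw, holdout.raw, predictors) {
  f <- reformulate(predictors, response = "Price")
  nn <- neuralnet::neuralnet(f, data = train.norm, hidden = hidden,
                             linear.output = TRUE)
  lo <- min(train.raw$Price)
  hi <- max(train.raw$Price)
  pred.train <- unscale(as.numeric(predict(nn, train.norm)), lo, hi)
  pred.holdout <- unscale(as.numeric(predict(nn, holdout.norm)), lo, hi)
  c(train = rmse(train.raw$Price, pred.train),
    holdout = rmse(holdout.raw$Price, pred.holdout))
}

# Usage, once the preprocessing code above has been run:
# layouts <- list(2, 5, c(5, 5))
# results <- sapply(layouts, fit_and_score,
#                   train.norm = train.norm.df, holdout.norm = holdout.norm.df,
#                   train.raw = train.df, holdout.raw = holdout.df,
#                   predictors = predictors)
# results  # one column per layout, rows "train" and "holdout"
```

Comparing the two rows across the three columns addresses the questions directly. If a fit stops without converging, raising neuralnet's stepmax argument may help. Network weights are initialized randomly, so calling set.seed before each fit keeps the numbers reproducible.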