A. Describe the ImageDataGenerator() command and its associated arguments. What objects and arguments do you need to specify in order to flow from the directory to the generated object? What is the significance of specifying target_size= as it relates to your source images of varying sizes? What considerations might you reference when programming the class_mode= argument? What differences exist when applying the ImageDataGenerator() and .flow_from_directory() commands to the training and test datasets?
A.1:
The ImageDataGenerator class allows you to create generators of augmented image batches (and their labels) via .flow(data, labels) or .flow_from_directory(directory). The significance of specifying target_size is that if your source images come in varying sizes, the generator resizes them all to the same dimensions as they are loaded. For class_mode, I would keep it 'binary' because there are only two classes (horse and human), and I would point a second generator at the validation directory. When you apply the ImageDataGenerator() and .flow_from_directory() commands to the training and test datasets, you should match the arguments to their respective directories.
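A minimal sketch of this setup, assuming hypothetical directory paths (./horse-or-human/train and ./horse-or-human/validation) and a 300x300 target size; the actual paths, image size, and batch sizes in the notebook may differ:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values from [0, 255] to [0, 1] for both generators.
train_datagen = ImageDataGenerator(rescale=1./255)
validation_datagen = ImageDataGenerator(rescale=1./255)

# Flow training images in batches from the (assumed) training directory.
train_generator = train_datagen.flow_from_directory(
    './horse-or-human/train/',        # assumed path
    target_size=(300, 300),           # every image is resized to 300x300
    batch_size=128,
    class_mode='binary')              # two classes: horse vs. human

# Flow validation images from the (assumed) validation directory,
# matching the arguments to its own directory.
validation_generator = validation_datagen.flow_from_directory(
    './horse-or-human/validation/',   # assumed path
    target_size=(300, 300),
    batch_size=32,
    class_mode='binary')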
A.2.Q: Describe the model architecture of the horses and humans CNN as you have specified it. Did you modify the number of filters in your Conv2D layers? How do image sizes decrease as they are passed from each of your Conv2D layers to your MaxPooling2D layer and on to the next iteration? Finally, which activation function have you selected for your output layer? What is the significance of this argument's function within the context of your CNN's prediction of whether an image is a horse or a human? What functions have you used in the arguments of your model compiler?
A.2.A: This model lets you choose one or more files from your system, upload them, and run them through the network, which then indicates whether each image shows a horse or a human. Regardless of the filter counts, the data that goes into a neural network should usually be normalized in some way to make it easier for the network to process; in my case, I preprocess the images by normalizing the pixel values to the [0, 1] range. I did not modify the layers or the number of filters because the image of the Governor's Palace in Williamsburg was properly identified as not a horse. As the number of filters increased from layer to layer, each MaxPooling2D layer shrank the image to about half its size, and the parameter count grew along with the number of filters. I ended my network with a sigmoid activation because the output will be a scalar between 0 and 1.
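A sketch of such an architecture, assuming the filter counts (16, 32, 64) and 300x300 RGB input of the standard horses-vs-humans example; my actual layer stack may differ slightly. Each 3x3 Conv2D trims two pixels from each dimension and each MaxPooling2D then halves it (e.g. 300 -> 298 -> 149 -> 147 -> 73), while the parameter count grows with the filter count:

from tensorflow.keras import models, layers, optimizers

model = models.Sequential([
    layers.Conv2D(16, (3, 3), activation='relu', input_shape=(300, 300, 3)),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(1, activation='sigmoid')   # scalar in [0, 1]: horse vs. human
])

# Binary classification, so binary cross-entropy loss with an RMSprop optimizer.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=0.001),
              metrics=['accuracy'])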
B.Q.1:
Using the auto-mpg dataset (auto-mpg.data), upload the image where you used the seaborn library to pairwise plot the four variables specified in your model. Describe how you could use this plot to investigate the correlations among your variables. Are you able to identify interactions amongst variables with this plot? What does the diagonal axis represent? Explain what this function is describing with regard to each of the variables.
B.A.1:
By generating the pair plot, I was able to see how the variables weight, MPG, displacement, and cylinders correlate with one another. Displacement and weight had a strong positive correlation, while MPG and weight had a strong negative correlation. The diagonal axis represents the distribution of each variable itself. The plot describes the pairwise relationships between the variables; for example, some variables were inversely related while others were strongly correlated.
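A sketch of how this pair plot can be produced, assuming the column names below for auto-mpg.data; diag_kind='kde' is what draws each variable's own distribution along the diagonal:

import pandas as pd
import seaborn as sns

# Assumed column names for the whitespace-separated auto-mpg.data file.
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower',
                'Weight', 'Acceleration', 'Model Year', 'Origin']
dataset = pd.read_csv('auto-mpg.data', names=column_names,
                      na_values='?', comment='\t',
                      sep=' ', skipinitialspace=True)

# Pairwise scatter plots of the four model variables; the diagonal shows the
# distribution of each variable by itself.
sns.pairplot(dataset[['MPG', 'Cylinders', 'Displacement', 'Weight']],
             diag_kind='kde')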
B.Q.2:
After running model.fit() on the auto-mpg.data object, you returned hist.tail() from the dataset, where the training loss, MAE, and MSE were recorded, along with those same variables for the validation dataset. What interpretation can you offer when considering these last 5 observations from the model output? Does the model continue to improve during each of these last 5 steps? Can you include a plot to illustrate your answer? Stretch goal: include and describe the final plot that illustrates the trend of true values to predicted values overlaid on the histogram of prediction error.
B.A.2:
One observation I can offer is that the model is overfitting. As the model runs it does keep getting more accurate on the training data, but it begins to overcorrect around the 3rd of these last 5 steps, with the validation error no longer improving.
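A sketch of a plot that illustrates this, assuming history is the object returned by model.fit() with a validation split and metrics=['mae', 'mse']; the exact metric names in my run may differ:

import pandas as pd
import matplotlib.pyplot as plt

# Last 5 epochs of training/validation loss, MAE, and MSE.
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
print(hist.tail())

# Training vs. validation error: the validation curve flattening (or rising)
# while the training curve keeps falling is the overfitting described above.
plt.plot(hist['epoch'], hist['mae'], label='train MAE')
plt.plot(hist['epoch'], hist['val_mae'], label='val MAE')
plt.xlabel('Epoch')
plt.ylabel('MAE [MPG]')
plt.legend()
plt.show()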
C.Q.1:
What was the significance of comparing the 4 different sized models (tiny, small, medium, large)? Can you include a plot to illustrate your answer?
C.A.1:
Comparing the four sizes is significant because it lets us see whether these models are overfitting or underfitting. There will always be a small difference between the training and validation metrics: if both are moving in the same direction, everything is fine; if the validation metric begins to stagnate while the training metric continues to improve, you are probably close to overfitting; and if the validation metric is moving in the wrong direction, the model is clearly overfitting. It was determined that the small models did not overfit and the large models did.
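A sketch of how the four differently sized models can be set up for this comparison; the layer widths, ELU activation, and 28-feature input shape here are assumptions patterned after the standard overfit/underfit example, not necessarily the exact configuration used:

from tensorflow.keras import models, layers

def build_model(units, n_hidden, n_features=28):
    # n_hidden dense layers of the same width, then a sigmoid output
    # so binary cross-entropy can be tracked for every model size.
    model = models.Sequential()
    model.add(layers.Dense(units, activation='elu', input_shape=(n_features,)))
    for _ in range(n_hidden - 1):
        model.add(layers.Dense(units, activation='elu'))
    model.add(layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['binary_crossentropy'])
    return model

tiny_model   = build_model(units=16,  n_hidden=1)
small_model  = build_model(units=16,  n_hidden=2)
medium_model = build_model(units=64,  n_hidden=3)
large_model  = build_model(units=512, n_hidden=4)

Training each of these on the same data and plotting the training and validation binary cross-entropy curves side by side is what makes the overfitting of the larger models visible in the plot referred to above.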