1.) Why is using one-hot encoding an inefficient towards vectorizing a corpus of words? How are word embeddings different? (see this video https://www.youtube.com/watch?v=EEk6OiOOT2c )
A one-hot encoded vector is sparse (meaning, most indices are zero). Imagine we have 10,000 words in the vocabulary. To one-hot encode each word, we would create a vector where 99.99% of the elements are zero. This fact, makes the code very inefficient.
2.) Compile and train the model from the TensorFlow exercise. Plot the training and validation loss as well as accuracy. Post your plots and describe them.
The First thing to note about this model is how the model is overfitting. One can see this in the list of the printed-out epochs. With this approach, the model reaches a validation accuracy of around 88%, and our training accuracy (accuracy: 0.9607) is significantly higher. The larger training accuracy indicates over fitting, and The validation decrease and the steep increase also indicates over fitting.
3.Stretch Goal: Follow the link to the Embedding Projector provided at the end of the exercise. Produce the visualization of your embeddings. Interpret your visualization. What is it describing? Is there relevance with regard to words that are proximate to each other?
D.
1.) Text Classification with an RNN1.Again compile and train the model from the TensorFlow exercise. Plot the training and validation loss as well as accuracy. Stack two or more LSTM layers in your model. Post your plots and describe them.
Adding the layers did not have as dramatic of an affect as i thought it would. as you can see in the graphs bellow, little has changed. The Test accuracy in the first model was .8577600121498108 and now it is.8512399792671204.The test accuracy shows how the more LSTM layers Actually made the model a little worse then before.