Competency 8.3: Compare the performance of different models.
I compared two models using my test data set: one trained on a unigram-only feature set and the other on a unigram, bigram, and trigram feature set. I initially used the Newsgroup data set suggested in the Prosolo assignment, but some options in the Explore Results tab of LightSIDE were not working for me. Since I was not sure I could make a proper analysis without feature weights, I chose to use my small test data set instead. Below is the comparison of the two models:
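Outside LightSIDE, the same kind of comparison can be sketched in plain Python. This is a minimal, hypothetical example: the tiny flower/fruit training sentences are invented, and the classifier is a simple multinomial Naive Bayes, not LightSIDE's actual algorithm. It only illustrates the idea of training one model on unigrams and another on unigrams, bigrams, and trigrams, then classifying a held-out sentence with each.

```python
import math
from collections import Counter, defaultdict

def ngrams(tokens, n_max):
    # All contiguous n-grams from length 1 up to n_max, as tuples
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

# Hypothetical toy training data standing in for my small test set
train = [
    ("rose is a flowering plant of the genus rosa", "flower"),
    ("tulip is a flowering plant with showy petals", "flower"),
    ("apple is a sweet edible fruit of the genus malus", "fruit"),
    ("mango is a juicy tropical fruit", "fruit"),
]

def train_nb(data, n_max):
    # Multinomial Naive Bayes counts: n-gram frequencies per class
    counts = defaultdict(Counter)
    class_docs = Counter()
    for text, label in data:
        class_docs[label] += 1
        counts[label].update(ngrams(text.split(), n_max))
    return counts, class_docs

def predict(counts, class_docs, text, n_max):
    feats = ngrams(text.split(), n_max)
    vocab = {f for c in counts.values() for f in c}
    total_docs = sum(class_docs.values())
    best, best_lp = None, float("-inf")
    for label, c in counts.items():
        lp = math.log(class_docs[label] / total_docs)
        denom = sum(c.values()) + len(vocab)
        for f in feats:
            lp += math.log((c[f] + 1) / denom)  # Laplace smoothing
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Unigram-only model vs unigram+bigram+trigram model on a held-out sentence
for n_max in (1, 3):
    model = train_nb(train, n_max)
    print(f"n_max={n_max}:", predict(*model, "the pear is an edible fruit", n_max))
```

On a data set this small both models will agree on easy cases; the differences only show up on sentences where a multi-word phrase carries signal that individual words do not.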
Competency 8.4: Inspect models and interpret the weights assigned to different features as well as to reason about what these weights signify and whether they make sense.
I went to the Explore Results tab to do some basic error analysis. The confusion matrix of the unigram + bigram + trigram model was better than that of the unigram-only model. I then looked in detail at specific features that led to wrong predictions.
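As a sketch of the kind of comparison a confusion matrix supports, here is how one can be built from gold labels and predictions in plain Python. The labels and predictions below are invented for illustration, not my actual LightSIDE results.

```python
from collections import Counter

# Hypothetical gold labels and model predictions for a small test set
gold = ["flower", "flower", "fruit", "fruit", "fruit", "flower"]
pred = ["flower", "fruit",  "fruit", "flower", "fruit", "flower"]

# Confusion matrix as counts of (gold, predicted) pairs
matrix = Counter(zip(gold, pred))
for (g, p), n in sorted(matrix.items()):
    print(f"gold={g:6} pred={p:6} n={n}")

# Accuracy is the diagonal of the matrix over the total
accuracy = sum(n for (g, p), n in matrix.items() if g == p) / len(gold)
print("accuracy:", round(accuracy, 3))
```

The off-diagonal cells (here, flower predicted as fruit and fruit as flower) are exactly the instances worth inspecting feature by feature.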
For example, the term “flowering”, which had a high Feature Influence for the flower class, caused a fruit document containing it to be wrongly classified as a flower. A few terms such as “genus” and “plants” did not lead to correct predictions even when combined with their bigrams and trigrams:
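To make the idea of a feature's influence concrete, here is a minimal sketch using a smoothed log-odds score between the two classes. The per-class term counts are hypothetical, and this is not LightSIDE's actual Feature Influence computation, just a simplified analogue: positive values pull a document toward the flower class, negative values toward fruit.

```python
import math
from collections import Counter

# Hypothetical per-class term counts standing in for LightSIDE's feature table
flower = Counter({"flowering": 4, "petals": 3, "genus": 2, "plants": 2})
fruit = Counter({"edible": 4, "sweet": 3, "genus": 2, "plants": 1, "flowering": 1})

def influence(term):
    # Smoothed log-odds of the term between the two classes
    f = (flower[term] + 1) / (sum(flower.values()) + 2)
    g = (fruit[term] + 1) / (sum(fruit.values()) + 2)
    return math.log(f / g)

for term in ("flowering", "genus", "plants", "edible"):
    print(term, round(influence(term), 3))
```

Here “flowering” leans strongly toward flower, which mirrors how a fruit document containing it gets pulled to the wrong class, while “genus” appears equally often in both classes and so carries no discriminating signal at all.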
The data set was very small, so there were not enough features to train the model on well. There were also many wrong predictions involving punctuation features. I expect the model would do better when trained on more data using unigrams, bigrams, and trigrams.