Antony's Blog

Emotion Analysis using Machine Learning Part-2

June 22, 2020

Emotion Analysis using Machine Learning

2nd week of blogging

The Machine Learning models used are:

SGDClassifier

  • loss=‘hinge’ as the hyperparameter (see the sketch after this list)
  • This model was chosen because it converges faster than other gradient-descent techniques
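
A minimal sketch of how this classifier can be set up with scikit-learn. The TF-IDF feature extraction and the variable names (train_texts, train_labels, test_texts) are assumptions for illustration, not the post's exact code:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import SGDClassifier

    # Turn the raw documents into sparse TF-IDF feature vectors
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(train_texts)
    X_test = vectorizer.transform(test_texts)

    # loss='hinge' makes this a linear SVM trained with stochastic gradient
    # descent, which is why it converges quickly on large sparse text data
    sgd = SGDClassifier(loss='hinge', random_state=42)
    sgd.fit(X_train, train_labels)
    sgd_preds = sgd.predict(X_test)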

Results:

Model Performance metrics:
------------------------------
Accuracy: 0.8486
Precision: 0.8429
Recall: 0.8486
F1 Score: 0.8411
Model Classification report:
------------------------------
              precision    recall  f1-score   support

       Anger       0.72      0.61      0.66      1459
         Bad       0.38      0.18      0.24      1810
     Disgust       0.58      0.37      0.45       206
        Fear       0.92      0.88      0.90     12716
       Happy       0.71      0.90      0.80      7672
         Sad       0.92      0.92      0.92     16902
    Surprise       0.62      0.39      0.48       844

    accuracy                           0.85     41609
   macro avg       0.69      0.61      0.64     41609
weighted avg       0.84      0.85      0.84     41609
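
These four headline numbers can be reproduced with scikit-learn's metric helpers; weighted averaging matches the reported Precision/Recall/F1 (y_test and sgd_preds are assumed names):

    from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                                 f1_score, classification_report)

    print('Accuracy:', accuracy_score(y_test, sgd_preds))
    print('Precision:', precision_score(y_test, sgd_preds, average='weighted'))
    print('Recall:', recall_score(y_test, sgd_preds, average='weighted'))
    print('F1 Score:', f1_score(y_test, sgd_preds, average='weighted'))
    print(classification_report(y_test, sgd_preds))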

LogisticRegression

  • penalty=‘l2’, C=1 as hyperparameters (see the sketch after this list)
  • This model was chosen because it is the simplest baseline
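
A minimal sketch, reusing the TF-IDF features from the SGDClassifier snippet. penalty=‘l2’ and C=1 are scikit-learn's defaults, written out to match the post; max_iter is raised here only as an assumption so the solver converges:

    from sklearn.linear_model import LogisticRegression

    logreg = LogisticRegression(penalty='l2', C=1, max_iter=1000)
    logreg.fit(X_train, train_labels)
    logreg_preds = logreg.predict(X_test)
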
Model Performance metrics:
------------------------------
Accuracy: 0.8488
Precision: 0.8477
Recall: 0.8488
F1 Score: 0.8469
Model Classification report:
------------------------------
              precision    recall  f1-score   support

       Anger       0.75      0.62      0.68      1459
         Bad       0.39      0.40      0.40      1810
     Disgust       0.57      0.35      0.44       206
        Fear       0.91      0.88      0.89     12716
       Happy       0.78      0.85      0.81      7672
         Sad       0.90      0.93      0.91     16902
    Surprise       0.60      0.38      0.46       844

    accuracy                           0.85     41609
   macro avg       0.70      0.63      0.66     41609
weighted avg       0.85      0.85      0.85     41609

SGDClassifier and LogisticRegression reach nearly the same accuracy: 0.8486 and 0.8488, respectively.

MultinomialNB:

  • The Naive Bayes family of algorithms is a popular choice for building text-classification models.

    Model Performance metrics:
    ------------------------------
    Accuracy: 0.7892
    Precision: 0.7725
    Recall: 0.7892
    F1 Score: 0.7636
    Model Classification report:
    ------------------------------
                  precision    recall  f1-score   support

           Anger       0.80      0.24      0.37      1459
             Bad       0.38      0.11      0.17      1810
         Disgust       0.00      0.00      0.00       206
            Fear       0.85      0.85      0.85     12716
           Happy       0.79      0.74      0.76      7672
             Sad       0.77      0.94      0.84     16902
        Surprise       0.67      0.07      0.12       844

        accuracy                           0.79     41609
       macro avg       0.61      0.42      0.44     41609
    weighted avg       0.77      0.79      0.76     41609

Pipeline with CountVectorizer, TfidfTransformer, MultinomialNB:

  • A pipeline often gives promising results even with an initial model, so one is tried here. (A sketch follows the results below.)

    Model Performance metrics:
    ------------------------------
    Accuracy: 0.7225
    Precision: 0.7134
    Recall: 0.7225
    F1 Score: 0.6791
    Model Classification report:
    ------------------------------
                  precision    recall  f1-score   support

           Anger       0.92      0.01      0.02      1459
             Bad       0.00      0.00      0.00      1810
         Disgust       0.00      0.00      0.00       206
            Fear       0.85      0.78      0.81     12716
           Happy       0.88      0.51      0.64      7672
             Sad       0.64      0.96      0.77     16902
        Surprise       0.00      0.00      0.00       844

        accuracy                           0.72     41609
       macro avg       0.47      0.32      0.32     41609
    weighted avg       0.71      0.72      0.68     41609

MultinomialNB and the Pipeline (CountVectorizer, TfidfTransformer, MultinomialNB) reach accuracies of 0.7892 and 0.7225, respectively.
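
A minimal sketch of the pipeline; the stage names and the train_texts/train_labels variables are illustrative assumptions. The standalone MultinomialNB run above is presumably the same MultinomialNB() fit directly on precomputed features:

    from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import Pipeline

    # Raw counts -> TF-IDF weighting -> Naive Bayes, chained as one estimator
    nb_pipeline = Pipeline([
        ('vect', CountVectorizer()),
        ('tfidf', TfidfTransformer()),
        ('clf', MultinomialNB()),
    ])

    # The pipeline takes raw texts; vectorization happens inside fit/predict
    nb_pipeline.fit(train_texts, train_labels)
    nb_preds = nb_pipeline.predict(test_texts)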

The Deep Learning models used are:

CNN model:

  • Deep learning with a CNN is expected to work well for text classification
  • The CNN was compiled using loss = ‘categorical_crossentropy’, optimizer = ‘adam’, metrics = [‘accuracy’] (a sketch follows the summary). Model:

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    embedding_1 (Embedding)      (None, 157, 500)          54695500  
    _________________________________________________________________
    conv1d_1 (Conv1D)            (None, 157, 128)          192128    
    _________________________________________________________________
    conv1d_2 (Conv1D)            (None, 157, 64)           24640     
    _________________________________________________________________
    conv1d_3 (Conv1D)            (None, 157, 32)           4128      
    _________________________________________________________________
    conv1d_4 (Conv1D)            (None, 157, 16)           1040      
    _________________________________________________________________
    flatten_1 (Flatten)          (None, 2512)              0         
    _________________________________________________________________
    dropout_1 (Dropout)          (None, 2512)              0         
    _________________________________________________________________
    dense_1 (Dense)              (None, 100)               251300    
    _________________________________________________________________
    dropout_2 (Dropout)          (None, 100)               0         
    _________________________________________________________________
    dense_2 (Dense)              (None, 7)                 707       
    =================================================================
    Total params: 55,169,443
    Trainable params: 55,169,443
    Non-trainable params: 0

    Training the model produced Accuracy: 84.063542, Loss: 45.061589, which is slightly below the accuracy of the ML models above.
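
A sketch reconstructing the architecture above in Keras. The vocabulary size and kernel sizes are inferred from the parameter counts in the summary; the dropout rates and activations are assumptions:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, Conv1D, Flatten, Dropout, Dense

    VOCAB_SIZE = 109391   # inferred: 54,695,500 embedding params / 500 dims
    MAX_LEN = 157

    model = Sequential([
        Embedding(VOCAB_SIZE, 500, input_length=MAX_LEN),
        Conv1D(128, 3, padding='same', activation='relu'),  # kernel sizes inferred
        Conv1D(64, 3, padding='same', activation='relu'),   # from the param counts
        Conv1D(32, 2, padding='same', activation='relu'),
        Conv1D(16, 2, padding='same', activation='relu'),
        Flatten(),
        Dropout(0.5),                    # dropout rate not stated in the post
        Dense(100, activation='relu'),
        Dropout(0.5),
        Dense(7, activation='softmax'),  # one output per emotion class
    ])
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])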

Bidirectional LSTM model:

  • Removed the spatial dropout layer, since training runs showed it decreased the model's accuracy
  • The model was compiled using loss = ‘categorical_crossentropy’, optimizer = ‘adam’, metrics = [‘accuracy’] (a sketch follows the summary)

    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_1 (InputLayer)            [(None, 157)]        0                                            
    __________________________________________________________________________________________________
    embedding (Embedding)           (None, 157, 500)     54695500    input_1[0][0]                    
    __________________________________________________________________________________________________
    bidirectional (Bidirectional)   (None, 157, 256)     644096      embedding[0][0]                  
    __________________________________________________________________________________________________
    conv1d (Conv1D)                 (None, 155, 64)      49216       bidirectional[0][0]              
    __________________________________________________________________________________________________
    global_average_pooling1d (Globa (None, 64)           0           conv1d[0][0]                     
    __________________________________________________________________________________________________
    global_max_pooling1d (GlobalMax (None, 64)           0           conv1d[0][0]                     
    __________________________________________________________________________________________________
    concatenate (Concatenate)       (None, 128)          0           global_average_pooling1d[0][0]   
                                                                     global_max_pooling1d[0][0]       
    __________________________________________________________________________________________________
    dense (Dense)                   (None, 7)            903         concatenate[0][0]                
    ==================================================================================================
    Total params: 55,389,715
    Trainable params: 55,389,715
    Non-trainable params: 0

The Bidirectional LSTM has the highest accuracy of all the models: Accuracy: 86.488497, Loss: 36.384022.
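
A sketch reconstructing this architecture with the Keras functional API. The vocabulary size is again inferred from the embedding parameter count, and the LSTM width (128 units, doubled to 256 by the bidirectional wrapper) from the summary:

    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import (Input, Embedding, Bidirectional, LSTM,
                                         Conv1D, GlobalAveragePooling1D,
                                         GlobalMaxPooling1D, Concatenate, Dense)

    inp = Input(shape=(157,))
    x = Embedding(109391, 500)(inp)                          # vocab size inferred
    x = Bidirectional(LSTM(128, return_sequences=True))(x)   # 2 x 128 = 256 features
    x = Conv1D(64, 3, activation='relu')(x)                  # 'valid' padding: 157 -> 155
    avg_pool = GlobalAveragePooling1D()(x)
    max_pool = GlobalMaxPooling1D()(x)
    x = Concatenate()([avg_pool, max_pool])                  # 64 + 64 = 128 features
    out = Dense(7, activation='softmax')(x)

    model = Model(inputs=inp, outputs=out)
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])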

I also observed that the LSTM model works better for Sad, Happy, and Fear, while the SGDClassifier works better on Anger, Bad, and Surprise. I therefore joined the two models into a stacked model (a sketch of the combining rule follows the list), which produced:

  • Accuracy: 0.883606912
  • Precision: 0.8874248006
  • Recall: 0.883606912
  • F1 Score: 0.8844271224
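
The post does not show how the two models were combined, so this is only one plausible combining rule under the observation above: trust the SGDClassifier on the classes it handles better and the LSTM elsewhere. All names here are hypothetical:

    import numpy as np

    SGD_CLASSES = {'Anger', 'Bad', 'Surprise'}  # classes where SGD performed better

    def stack_predictions(lstm_probs, sgd_labels, class_names):
        """Use the SGD label when it is one of SGD_CLASSES, else the LSTM argmax.

        lstm_probs:  (n_samples, 7) softmax output of the Bidirectional LSTM
        sgd_labels:  (n_samples,) string labels predicted by the SGDClassifier
        class_names: the 7 emotion labels in the LSTM's output-column order
        """
        lstm_labels = np.asarray(class_names)[lstm_probs.argmax(axis=1)]
        use_sgd = np.isin(sgd_labels, list(SGD_CLASSES))
        return np.where(use_sgd, sgd_labels, lstm_labels)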

Preprocessing:
https://colab.research.google.com/drive/1NZBf4iDojvel3dHeU9AYzWWt5XM8Eggr
Machine Learning:
https://colab.research.google.com/drive/1UI4LPO9BfrkJYXEVudsii7XRXeVJXvJJ#scrollTo=HP6QeD8vzjKc
Deep Learning:
https://colab.research.google.com/drive/1gauXKeBIN9Piln_2_OxmLsk5g8U9f1f-#scrollTo=TKyIf6PhJRfE
Stacked Model:
https://colab.research.google.com/drive/191BIy6tGDRfIi6eKNFhLM2IN7quwFdyr#scrollTo=ROunJKQlz_Lr