Sentiments about Mental Health on Twitter – Before and during the COVID-19 Pandemic

COVID-19 is not only a respiratory disease: together with measures such as social distancing, the pandemic potentially affected the mental health of millions of people. What do Twitter (X) users talk about when they talk about #MentalHealth and #Depression? Did this change during the pandemic?

We collected almost 3,000,000 tweets from more than 300,000 Twitter users tweeting about #MentalHealth and #Depression. To analyze them, we combined (1) sentiment analysis, (2) topic modeling, and (3) machine-learning-based tweet classification.
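To make the first step more concrete, here is a minimal sketch of how a lexicon-based sentiment score can be computed. It is an illustration only, not the method of LIWC, NRC, or VADER: the word list `TOY_LEXICON`, its weights, and the function `toy_sentiment` are hypothetical, and real libraries use much larger, validated lexicons plus extra rules (VADER, for example, handles negation and intensifiers).

```python
import re

# Hypothetical toy lexicon with made-up weights, for illustration only.
# Real sentiment libraries ship far larger, validated word lists.
TOY_LEXICON = {
    "grateful": 2.0,
    "hope": 1.5,
    "support": 1.0,
    "sad": -1.5,
    "alone": -1.0,
    "anxious": -1.5,
}


def toy_sentiment(text: str) -> float:
    """Return the mean lexicon score of all matched words, or 0.0 if none match."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[token] for token in tokens if token in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    for tweet in [
        "So grateful for the support #MentalHealth",
        "Feeling sad and alone #Depression",
    ]:
        print(f"{toy_sentiment(tweet):+.2f}  {tweet}")
```

A positive score means the matched words are positive on average; a tweet with no lexicon words scores 0.0 (neutral).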

Results:

  • The sentiment libraries LIWC, NRC, and VADER revealed an overall positive sentiment in tweets about #MentalHealth
  • Topic modeling with LDA (Latent Dirichlet Allocation) revealed that users primarily tweet to raise awareness
  • #MentalHealth tweets during the pandemic often showed an expression of gratitude
  • We built three machine learning models to test how well a tweet can be classified as written either before or during the pandemic. The BERT classifier performed best, achieving an accuracy of 81% for #MentalHealth and 79% for #Depression
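
The classification task in the last point can be sketched as follows: each tweet gets a "before" or "during" label from its date, and accuracy is the share of tweets for which a model predicts that label correctly. This is a sketch under assumptions, not the article's setup: the cutoff `CUTOFF` (11 March 2020, the day the WHO declared COVID-19 a pandemic) and the helper names `period_label` and `accuracy` are chosen here for illustration, and the article may define the two periods differently.

```python
from datetime import date

# Assumption: the WHO pandemic declaration date is used as the cutoff.
# The article may split the two periods at a different date.
CUTOFF = date(2020, 3, 11)


def period_label(tweet_date: date) -> str:
    """Label a tweet as written 'before' or 'during' the pandemic."""
    return "before" if tweet_date < CUTOFF else "during"


def accuracy(y_true: list, y_pred: list) -> float:
    """Return the fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    dates = [date(2019, 6, 1), date(2020, 3, 11), date(2020, 8, 15), date(2020, 1, 5)]
    y_true = [period_label(d) for d in dates]
    y_pred = ["before", "during", "before", "before"]  # made-up model output
    print(accuracy(y_true, y_pred))
```

With two classes, a model that always guesses the larger class already reaches that class's share of the data, so an accuracy figure is best read against such a baseline.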

Our full article is published in Healthcare (PDF).

Sentiment (from the VADER sentiment library) of all tweets before and during the pandemic. Positive sentiment increased especially during the summer, when there was some new hope.
“Home” value of the LIWC sentiment library. It increased sharply at the beginning of the pandemic, when people around the world were urged to stay home to slow the spread of the virus.
Performance of the machine learning classifiers in distinguishing tweets written before the pandemic from those written during it. A Light Gradient Boosting Machine (LGBM) classifier based on TF-IDF values serves as the baseline. A classifier based on the sentiment values of the libraries LIWC, NRC, and VADER performed slightly worse than the baseline. The BERT classifier (bert-base-cased) performed best overall.
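
For readers unfamiliar with the TF-IDF values that feed the baseline, the sketch below computes them by hand for a few short texts. It uses one common textbook variant (relative term frequency times the natural log of the inverse document frequency); the exact weighting, tokenization, and preprocessing used in the article are not stated here, so treat the formula and the function name `tfidf` as assumptions. The LGBM classifier itself is not shown.

```python
import math
from collections import Counter


def tfidf(docs: list) -> list:
    """Compute TF-IDF weights per document.

    Assumed variant: tf = count / document length, idf = ln(N / document frequency).
    Returns one {term: weight} dict per input document.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


if __name__ == "__main__":
    for vector in tfidf(["stay home", "stay safe"]):
        print(vector)
```

A term that occurs in every document (here "stay") gets weight 0, while terms specific to one document (such as "home") get a positive weight, which is what makes them useful as features for a classifier.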