The Covid-19 pandemic, which has shaken the world for more than a year, has been called humankind's greatest challenge since World War II. Now, scientists are looking at new ways to gather meaningful data to track and predict future pandemics and to offer valuable insights across sectors.
A recent study, published in Evolutionary Intelligence in March 2021, looks at how scientists can use social media to produce meaningful analysis and to track and predict future pandemics.
Aiming to turn large bodies of data into meaningful insight, the researchers used the large volume of data on Twitter to visualize and analyze the sentiments expressed there as a way to monitor and track the Covid-19 pandemic. For this, they used a deep convolutional neural network (CNN) architecture that takes multi-channel input.
This research demonstrates that this type of software architecture has possible real-world applications not only in predicting and managing epidemics but also in other areas, such as deconstructing deepfake videos on the internet.
This research is the first of its kind in combining sophisticated software with a large-scale approach: it analyzes 600,000 tweets and categorizes them by sentiment, such as fear, happiness, hope, depression, anger, and anxiety.
The goal of the researchers was to demonstrate how this can be used in various stages of pandemic analysis. The research provides four unique contributions:
- Demonstrate the use of Twitter for data analysis, specifically related to Covid-19.
- Present a gradient scale of results, comparing data for major affected countries.
- Demonstrate software capable of successfully analyzing sentiments expressed in tweets or social media.
- Create a model that allows predictive analysis of future Covid-19 case reports.
The researchers note that while other large contextual datasets surrounding Covid-19 have been gathered by both Google and the Johns Hopkins University Center for Systems Science and Engineering, their 600,000 tweets with indexed parameters provide unique insight that can be used to further research, and the dataset is being made openly available.
Twitter for Research
Twitter is often thought of as a place where millions express opinions or conduct market research, but for scientists, it is a data mine. With 330 million users worldwide, Twitter has become a first choice for data collection. It generates clusters of opinion based on region, topic, and events, making it especially useful for data analysis.
Further real-world applications of Twitter sentiment analysis include market surveys, trend analysis, rumor prediction, and spam filtering, with possible uses ranging from government to business to healthcare.
Sentiment analysis looks at the feelings expressed in data, in this case tweets, to predict key events and ongoing trends. The researchers used what is called common-sense knowledge infusion within an LSTM (long short-term memory) model to track the range of emotions expressed more accurately and to understand a person’s sentiment when a tweet expresses more than one emotion.
The researchers then had to fine-tune and train the model to accurately understand and categorize sentiments. The research methodology involved data collection, polarity rating, gradient scaling of cases, comparative and predictive analysis, experiments with datasets, feature extraction, sentiment labeling, and trends with gradient parameters.
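To make steps such as feature extraction, sentiment labeling, and polarity rating more concrete, here is a minimal Python sketch. It is not the researchers' model: it swaps the trained deep network for a simple word lookup, and the lexicon, the polarity values, and the function names are all hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# the study's actual vocabulary and trained model are not reproduced here.
EMOTION_LEXICON = {
    "afraid": "fear", "scared": "fear",
    "worried": "anxiety", "anxious": "anxiety",
    "happy": "happiness", "glad": "happiness",
    "hope": "hope", "hopeful": "hope",
    "angry": "anger", "furious": "anger",
    "sad": "depression", "hopeless": "depression",
}

# Assumed polarity of each emotion category: -1 negative, +1 positive.
POLARITY = {
    "fear": -1, "anxiety": -1, "anger": -1, "depression": -1,
    "happiness": 1, "hope": 1,
}


def extract_features(tweet):
    """Feature extraction: lowercase the tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", tweet.lower())


def label_emotions(tweet):
    """Sentiment labeling: count lexicon hits per emotion.

    A single tweet may carry more than one emotion.
    """
    counts = Counter()
    for token in extract_features(tweet):
        emotion = EMOTION_LEXICON.get(token)
        if emotion is not None:
            counts[emotion] += 1
    return counts


def polarity_rating(emotions):
    """Polarity rating: average polarity in [-1, 1], or 0.0 if nothing matched."""
    total = sum(emotions.values())
    if total == 0:
        return 0.0
    return sum(POLARITY[e] * n for e, n in emotions.items()) / total


if __name__ == "__main__":
    emotions = label_emotions("I am scared but still hopeful about the vaccine")
    print(dict(emotions), polarity_rating(emotions))
```

A tweet that mixes fear and hope gets both labels and a polarity that averages them. This is the kind of mixed-emotion case the study's model is designed to handle with far more nuance than a word lookup can.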
Results and Application
The researchers found their methodology yielded an accuracy of 90.67%, which surpassed previous state-of-the-art work. The success of this modeling demonstrates how tweets can be collected on a large scale to analyze and model data with a high level of accuracy.
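For readers unfamiliar with the metric, accuracy is simply the share of tweets whose predicted sentiment matches a reference label. The short Python sketch below shows the arithmetic. The labels in it are invented for illustration and are not drawn from the study's data.

```python
def accuracy(predicted, reference):
    """Return the fraction of predictions that match the reference labels."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one label is required")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


if __name__ == "__main__":
    # Invented example labels (an assumption, not the study's data).
    predicted = ["fear", "hope", "anger", "hope"]
    reference = ["fear", "hope", "anger", "anxiety"]
    print(f"{accuracy(predicted, reference):.2%}")
```

By the same calculation, an accuracy of 90.67% means that roughly nine out of every ten evaluated tweets were assigned the label the reference data gave them.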
As medical data continues to evolve in the pandemic, this research can be applied not only to the Covid-19 pandemic but to any infectious disease. Scientists can create more up-to-date and accurate prediction modeling from audio, video, and text sources. The researchers hope to continue to refine the model for increased accuracy of data representation.
We are on the frontier of large-scale multi-faceted data analysis through Twitter or social media. The long-term applications can extend beyond healthcare and infectious disease into business and market research, social sciences, government and political applications, geopolitical issues, and more.
The researchers themselves note the powerful application in the deconstruction and identification of deepfake videos, which are typically nearly impossible to detect with the human eye. These videos, which make a celebrity or political leader appear to say something they did not say, are becoming one of the great modern dangers in information wars. This is just one of many problems that can be mitigated with the deep analysis this modeling provides.
The contribution of this work is a reliable, systematic methodology for real-time data analysis and predictive software that uses the information constantly being generated on Twitter. Visualizing trends can provide a more realistic picture of infectious diseases and outbreaks, and a better understanding of what these trends imply across society.