Mining for opinions
Updated: Feb 22
Opinion mining can be defined, intuitively, as the task of gathering information to form an opinion on something. More specifically, this is usually achieved by performing sentiment analysis on textual corpora gathered from the internet. I’m sure you can see how something like this has huge potential in a variety of areas. It essentially combines the enormous amount of information constantly generated on the internet (only the textual part in this case, but that is still a lot) with powerful machine learning techniques to automate sentiment classification.
Leaving the technical issues aside for a minute, let’s take a look at where opinion mining is most useful. Probably the first use case that comes to mind is predicting the result of an election, or any form of polling in general. This has already been done: the authors of https://arxiv.org/abs/1802.01786 used data from Twitter to predict the 2012 US presidential election.
Another example is stock market prediction. By consuming large quantities of news articles and experts’ financial analyses, such models aim to pick up specific trends and patterns in the stock market automatically. The sheer volume of data these systems can process is their main advantage over human analysts.
One more frontier for opinion mining is the private corporation. There are many applications of this technology in business, both outward-facing and inward-facing. Understanding customers is one of them. Take, for instance, customer product reviews (sentiment analysis models are often trained on this data, since it usually comes annotated with sentiment through some metric such as ‘stars’). A business would scrape the comments on its products and come up with a summarizing score for each one, based on how much customers appreciate it. This would allow the business to focus on the products that work best, adapt to customers’ requests, and essentially listen more carefully to its customers. The same process can be applied to discovering new products that would do well in the market.
Understanding employees, on the other hand, is one of the inward-facing applications. Employee polls and surveys can be used to stimulate participation, encourage engagement and boost morale (and productivity). Opinion mining provides an extra dimension of information on top of the quantitative results of polls and surveys.
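To make the product review use case above a bit more concrete, here is a minimal sketch in Python of how a summarizing score per product could be computed. It assumes each review is a (product, stars) pair and uses the star rating itself as a stand-in for a sentiment score, mapped linearly onto [-1, 1]. The function names, the 5-star scale and the sample data are all hypothetical; in a real system the score would more likely come from a sentiment model applied to the review text.

```python
from collections import defaultdict
from statistics import mean


def stars_to_score(stars: int, max_stars: int = 5) -> float:
    """Linearly map a rating in 1..max_stars onto the range [-1, 1]."""
    if not 1 <= stars <= max_stars:
        raise ValueError(f"stars must be between 1 and {max_stars}")
    return 2 * (stars - 1) / (max_stars - 1) - 1


def product_scores(reviews):
    """Average the mapped scores of (product, stars) pairs for each product."""
    by_product = defaultdict(list)
    for product, stars in reviews:
        by_product[product].append(stars_to_score(stars))
    return {product: mean(scores) for product, scores in by_product.items()}


# Hypothetical sample data, for illustration only.
reviews = [("kettle", 5), ("kettle", 4), ("toaster", 1), ("toaster", 3)]

# Products sorted from most to least appreciated.
ranked = sorted(product_scores(reviews).items(), key=lambda kv: kv[1], reverse=True)
print(ranked)
```

A plain average is the simplest possible summary. A business with very uneven review counts per product might prefer something that also accounts for how many reviews each score is based on.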
Let’s briefly dive into the high-level technical challenges posed by such a system. The first problem, as with any data-processing model, is quite obviously gathering a lot of good data. While quantity is rarely a challenge nowadays, quality can be. Depending on the specific use case, data will be gathered from different places. Here it is important to distinguish roughly between document classification, where we are dealing with entire articles, blog posts and so on; sentence classification, where the input could be news headlines, tweets or comments; and word classification.
Once we have a reliable channel through which to access data, we can start processing it. Most commonly, the sentiment is extracted using a trained machine learning model that gives each piece of text a score. This score is typically within the range [-1, 1], with 0 being neutral sentiment. It is important to have a neutral value in order to filter out the objective information we might come across in the data. Another solution to this problem is to perform subjectivity classification on the data beforehand, filtering out the objective statements that would otherwise introduce noise into the output.
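As a rough illustration of the scoring and filtering steps described above, the following Python sketch assigns each text a score in [-1, 1] and drops the texts that come out neutral. Note that it swaps the trained machine learning model for a tiny hand-written word list, purely to keep the example self-contained. The lexicon, the function names and the threshold are assumptions made for this example, not part of any particular library.

```python
import re

# Toy lexicon: hypothetical words and weights, chosen for illustration only.
# A real system would use a trained model instead of a fixed word list.
LEXICON = {
    "good": 1.0, "great": 1.0, "love": 1.0, "excellent": 1.0,
    "bad": -1.0, "poor": -1.0, "hate": -1.0, "terrible": -1.0,
}


def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1]; 0.0 when no sentiment words are found."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    if not hits:
        return 0.0
    # The mean of weights that each lie in [-1, 1] also lies in [-1, 1].
    return sum(hits) / len(hits)


def drop_neutral(texts, threshold=0.1):
    """Keep only (text, score) pairs whose absolute score exceeds the threshold."""
    scored = [(t, sentiment_score(t)) for t in texts]
    return [(t, s) for t, s in scored if abs(s) > threshold]


# Hypothetical sample inputs: one positive, one objective, one negative.
samples = ["I love it", "It ships on Monday", "terrible service"]
print(drop_neutral(samples))
```

Here the objective statement about shipping scores 0 and is filtered out, which is the role the neutral value plays in the paragraph above. A separate subjectivity classifier would do the same job earlier in the pipeline, before any sentiment score is computed.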
In the specific case of QVote, sentiment analysis is performed on the qualitative data that comes with surveys. Surveys provide a controlled mechanism for gathering high-quality data from a group of people on the specific topics considered in the poll. As stated before, this adds another level of information on top of the poll’s quantitative results, giving a better understanding of opinion on the topic at hand.
Article by Edoardo Pona - https://www.linkedin.com/in/edoardo-pona-9b2731160/