A Comparative Study for Supervised Learning Algorithms to Analyze Sentiment Tweets
Keywords: Social Networks, Data Mining, Sentiment Analysis, Opinion Mining, Confusion Matrix
Twitter's popularity has grown steadily in the last few years, influencing the social, political, and business aspects of life. People post tweets about an event and, at the same time, look for other people's experiences and whether they held a positive or negative opinion of that event. Sentiment analysis can be used to obtain this categorization. Unstructured text comments from users, covering product reviews, events, and other topics, are gathered and categorized as positive, negative, or neutral. This task is called polarity classification. This study uses Twitter data about fine food reviews obtained from the Amazon website to compare the effectiveness of three commonly used supervised learning classifiers: Naive Bayes, Logistic Regression, and Support Vector Machine. The comparison uses two feature selection methods, Count Vectorizer and Term Frequency-Inverse Document Frequency (TF-IDF). The findings show that the Support Vector Machine classifier achieved the highest accuracy, 91%, with Count Vectorizer features, but it is time-consuming. When both accuracy and execution time are of concern, Logistic Regression is recommended.
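The two feature representations named above differ in how they weight a term. The following is a minimal sketch in plain Python, not the study's implementation. The function names `count_features` and `tfidf_features`, the toy reviews, and the smoothed IDF formula ln((1 + n) / (1 + df)) + 1 are assumptions made for illustration; the abstract does not state which TF-IDF variant was used.

```python
import math
from collections import Counter


def count_features(docs):
    """Raw term counts per document (the count-based representation)."""
    return [Counter(doc.lower().split()) for doc in docs]


def tfidf_features(docs):
    """TF-IDF weights, assuming a smoothed IDF: ln((1 + n) / (1 + df)) + 1."""
    counts = count_features(docs)
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


# Hypothetical toy reviews, not taken from the study's dataset.
docs = [
    "great taste great price",
    "bad taste",
    "great coffee",
]

counts = count_features(docs)
tfidf = tfidf_features(docs)
```

In the first toy review, "taste" and "price" have the same raw count, but under this TF-IDF variant "price" receives the larger weight because it occurs in fewer documents. Either representation can then be passed to a classifier such as Naive Bayes, Logistic Regression, or a Support Vector Machine.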