Introduction of Sentiment analysis on the Twitter Data stream

Category: Computer Sciences Paper Type: Report Writing Reference: IEEE Words: 1400

Sentiment analysis and opinion mining are emerging techniques that find application in many areas. The process depends on the collection of data, the analysis of data, and the identification of variation in the data. The paper reviewed here discusses Twitter microblogging and the processes associated with the positive, negative, and irrelevant classification criteria. Its main objective was to identify the applications and precision of the research, along with the limitations faced in data processing. Twitter provides a microblogging service that enables users to share messages as tweets. Previous researchers have found that opinions shared by users on Twitter can be applied to resolve real-life problems [1].

The prime objective of the work was to analyze the effectiveness of these processes, the suitability of Twitter for identifying solutions to problems, and the process of classifying sentiment in tweets. The approach used in the research depends on the domain and the geographical location selected. The paper is organized into segments covering previous work on the selection process, the techniques used in processing and preprocessing the data, the classification approach, the results of the research, and proposed directions for further research [1].

Problem statement of Sentiment analysis on the Twitter Data stream

As technology and internet use grow, microblogging is also increasing rapidly. Extensive previous research has addressed the identification of sentiment expressions and the impact of these expressions on tweets. Researchers have employed different approaches to identify lexical terms, lexical resources, and trends in tweets. Some have worked with unigram and bigram models to represent the data collected in the research. Tweets use several kinds of syntax, such as hashtags, exclamation marks, punctuation, symbols, emoticons, and retweets. The research measured the influence of opinions and emotions on tweets [1].
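The unigram and bigram models mentioned above can be illustrated with a minimal sketch. It assumes simple whitespace tokenization and a made-up example sentence; the function names `ngrams` and `ngram_features` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all n-grams (as tuples) over a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_features(text, max_n=2):
    """Count unigrams and bigrams in a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts

# Illustrative sentence only; not from the paper's dataset.
features = ngram_features("not happy with the new phone")
for gram, count in features.items():
    print(gram, count)
```

A unigram model sees "not" and "happy" as separate features, while the bigram ("not", "happy") keeps the negation attached to the word it modifies, which is one reason both feature types are often tried together.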

Data mining methodology of Sentiment analysis on the Twitter Data stream

The methodology is based on two disjoint datasets. All data were collected from Twitter and then labeled according to the sentiment expressed in relation to the query. Positive and negative classification was used for the collected data, based on the expressions in the tweets. The skewness in the data was reduced [1]. An application programming interface (API) was used to deal with different domains of data. The collected tweets were classified as neutral, irrelevant, or polar according to their nature. Special care was taken over the privacy of the public. The collection technique was able to deal with different types of data and very large datasets [1].
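To make the idea of skewness in a labeled dataset concrete, the sketch below measures the class distribution of a handful of labeled tweets. The tweets, the labels, and the helper names `class_distribution` and `imbalance_ratio` are all invented for illustration and do not come from the paper.

```python
from collections import Counter

# Hypothetical hand-labeled tweets; texts and labels are illustrative only.
labeled_tweets = [
    ("love the new update", "positive"),
    ("battery life is terrible", "negative"),
    ("the update is out today", "neutral"),
    ("follow me for free stuff", "irrelevant"),
    ("great camera, great screen", "positive"),
    ("release notes are on the website", "neutral"),
    ("store opens at nine", "neutral"),
]

def class_distribution(samples):
    """Return the fraction of samples that carry each label."""
    counts = Counter(label for _, label in samples)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

def imbalance_ratio(samples):
    """Ratio of the largest class size to the smallest; 1.0 means balanced."""
    counts = Counter(label for _, label in samples)
    return max(counts.values()) / min(counts.values())

print(class_distribution(labeled_tweets))
print(imbalance_ratio(labeled_tweets))
```

A ratio well above 1.0 signals the kind of skew that the sampling step is meant to reduce.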

Data pre-processing of Sentiment analysis on the Twitter Data stream

Preprocessing extracted the data on the basis of the classification and the information provided by sentiment analysis and microblogging. After collection, the next step was to extract the data in a series to provide a message string for conversion. The preprocessing technique was based on classifying the quality and features of the collected data, which increased the performance of the research. The whole process can be divided into steps: replacement of emoticons, identification of upper case, conversion to lower case, URL extraction, detection of hashtags and pointers, identification of punctuation, compression of words, and then removal of skewness from the dataset. Classifying and sampling the data improves the performance of the research and makes it possible to determine variation in the data and samples. Two kinds of sampling were identified: undersampling and oversampling. The study used the synthetic minority over-sampling technique (SMOTE), which generates synthetic data, to address the skewness in the data.
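The text-cleaning steps listed above can be sketched as a small pipeline. This is an illustration under several assumptions, not the authors' implementation: "pointers" are taken to mean @-mentions, "compression of words" is taken to mean shortening runs of repeated letters, and the emoticon map and the placeholder tags (`URL`, `USER`, `EMO_POS`, `EMO_NEG`) are invented for the example.

```python
import re

# Tiny illustrative emoticon map; a real lexicon would be much larger.
EMOTICONS = {":)": " EMO_POS ", ":-)": " EMO_POS ", ":(": " EMO_NEG ", ":-(": " EMO_NEG "}
PLACEHOLDERS = {"URL", "USER", "EMO_POS", "EMO_NEG"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
ELONGATED_RE = re.compile(r"([a-zA-Z])\1{2,}")  # a letter repeated 3+ times
PUNCT_RE = re.compile(r"[^\w\s]")

def preprocess(tweet):
    """Normalize a raw tweet into a cleaned, space-separated token string."""
    text = URL_RE.sub(" URL ", tweet)           # replace URLs before emoticons
    for emoticon, tag in EMOTICONS.items():     # replace emoticons with tags
        text = text.replace(emoticon, tag)
    text = MENTION_RE.sub(" USER ", text)       # replace @-mentions
    text = HASHTAG_RE.sub(r" \1 ", text)        # keep the hashtag word, drop '#'
    text = ELONGATED_RE.sub(r"\1\1", text)      # "looooove" -> "loove"
    text = PUNCT_RE.sub(" ", text)              # strip remaining punctuation
    # Lowercase ordinary tokens but keep the placeholder tags recognizable.
    tokens = [t if t in PLACEHOLDERS else t.lower() for t in text.split()]
    return " ".join(tokens)

print(preprocess("@bob I looooove the new #iPhone :) http://t.co/abc !!!"))
```

URLs are replaced first so that the ":/" inside "http://" cannot be mistaken for part of an emoticon if the emoticon map is extended.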

The dataset was evaluated through several processes, including the experimental methodology, the building of a trained data model, and the training of Naïve Bayes, random forest, support vector machine (SVM), sequential minimal optimization (SMO), and J48 classifiers on the training dataset [1].
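As an illustration of one of the classifiers named above, the following is a minimal from-scratch sketch of multinomial Naïve Bayes with add-one smoothing. It is not the toolkit or configuration used in the paper; the `NaiveBayes` class and the tiny training set are hypothetical and exist only to show how the model scores a tweet.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in text.split():
                if word not in self.vocab:
                    continue  # skip words never seen during training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training data, already preprocessed.
train_texts = [
    "love this phone", "great screen love it",
    "hate the battery", "terrible battery hate it",
    "phone released today", "store opens today",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("love the screen"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.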

Summary of results of Sentiment analysis on the Twitter Data stream

Graphs were used to express the relation between the performance of each algorithm and the classified data, namely neutral, polar, and irrelevant data. Surprisingly, the results suggested that widely used algorithms failed to deliver satisfactory performance. On the other hand, notable methods including Naïve Bayes and Lazy IBk produced results with average accuracy. Bayesian classifiers, random forest, and SMO reached accuracies well above half, at around 80%. The maximum accuracy was provided by the SMO classifier. The tree-based J48 failed to give consistent results. Although the SMOTE technique was employed to reduce skewness, the main issue encountered in the results was still the skewness of the data. Higher accuracy would require reducing the variation in the data; resolving the skewness problem would reduce the issues faced during analysis and improve the performance of the analysis [1].
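The link between skewed data and accuracy figures can be shown with a short worked example. The labels below are invented and do not reproduce the paper's results; the helpers `accuracy` and `per_class_recall` are hypothetical names. The sketch shows how overall accuracy can look high on a skewed dataset while one class is never detected.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical skewed test set: 9 neutral tweets and 1 polar tweet,
# scored against a classifier that always answers "neutral".
y_true = ["neutral"] * 9 + ["polar"]
y_pred = ["neutral"] * 10

print(accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
```

Reporting per-class figures alongside overall accuracy makes this kind of failure visible.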

Critical analysis of Sentiment analysis on the Twitter Data stream

The main theme of the research was to identify the impact of sentiment in tweets. The whole process was carried out using different methods and techniques of data analysis. The variation in the samples and the skewness in the data were also measured. The critical analysis suggests that the research did not identify the effect of negative tweets on the sentiments of users.

The research encountered different problems related to sentiment classification and the details of the data. The preprocessing method for the raw Twitter messages was explained elaborately. Structural defects were also identified, but the research still provides insufficient information about the processes used to reduce data skewness. The percentage accuracy of each method was measured, but the process for identifying deviation in the results was not mentioned clearly. The SMOTE technique was employed to reduce the skewness of the datasets and to improve the accuracy of the results. The imbalance of the dataset introduces uncertainty in the results; an appropriate technique was therefore required to measure the dimensions of the dataset and to overcome the issues caused by skewness.

The research results were insufficient regarding the privacy concerns of users. A large proportion of Twitter users rely on the privacy conditions for tweets, but the researchers did not define what sort of privacy techniques Twitter uses or how far people consider them sufficient. The data accessible to followers must be limited, particularly data about the emotional state of the user.

The authors acknowledged the use of different techniques and how they can be developed for the analysis of data, but some important information remains undescribed. The extensive research relates to the conditions and assumptions for the classified dataset, but there is the possibility of using a blended approach of two principles. For instance, SVM and the filtered classifier in J48 could be combined to merge the methods and obtain more accurate results. Finally, the research classified different international relations and expressions of users, but it neglected the impact of language on emotional expression.
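Since the critique turns on how SMOTE reduces skewness, the core idea of that technique can be sketched as follows: a synthetic minority sample is created by interpolating between an existing minority sample and one of its nearest minority neighbors. This is a simplified illustration that assumes tweets have already been converted to numeric feature vectors; the function name `smote`, its parameters, and the sample points are hypothetical, and the paper's actual settings are not reproduced here.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    randomly chosen minority sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbors of the base point among the other minority samples
        neighbors = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Hypothetical two-dimensional feature vectors for the minority class.
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority_samples, n_new=5)
print(new_points)
```

Because every synthetic point lies between two real minority samples, oversampling of this kind cannot add information that the minority class did not already contain, which is consistent with skewness remaining a problem after SMOTE was applied.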

Conclusion of Sentiment analysis on the Twitter Data stream

Based on the results, it can be concluded that the best method for analyzing expressions in tweets was the filtered classifier. The impact of the emotions and sentiments in tweets on readers was strong. The research proposed that accurate implementation of classification and algorithms for the dataset can improve the accuracy of results and reduce the skewness in the datasets. The research faces the challenges of natural language processing; future work can therefore be extended to measure the impact of different languages on the emotions expressed in tweets.

References of Sentiment analysis on the Twitter Data stream

[1]	B. Gokulakrishnan, P. Priyanthan, T. Ragavan, N. Prasath and A. Perera, "Opinion Mining and Sentiment Analysis on a Twitter Data Stream," The International Conference on Advances in ICT for Emerging Regions, vol. 01, no. 01, pp. 182-188, 2012.