Blog | Twitter Analysis In MATLAB | MATLAB Helper ®

 3 years ago
source link: https://matlabhelper.com/blog/matlab/twitter-analysis-in-matlab/
Twitter Analysis In MATLAB

Tweet analysis

Twitter is a widely used social networking and micro-blogging service. Many celebrities, politicians, and leaders use it to make official statements to the public, often ones that do not appear on any other platform. It had over 330 million users as of March 2021, which makes Twitter data useful for understanding trends and public opinion.

Tweet analysis can be performed in MATLAB to extract information such as popularity, the number of likes, the retweet count, word clouds, and sentiment. The prerequisites are that you must have:

1) A Twitter developer account (more info at https://developer.twitter.com/en/apply-for-access) and

2) Text Analytics Toolbox installed in your current MATLAB version. (The twitter function used below is part of Datafeed Toolbox, so that toolbox is needed as well.)

For our program, we first take the example of a well-known personality and avid Twitter user with over 48 million followers, Elon Musk, and perform some fundamental analysis of his tweets.

Retrieving Twitter data and extracting required information

We first need to create a Twitter connection object using the twitter function, passing our Twitter developer credentials as arguments:

connection = twitter(consumerKey,consumerKeySecret,accessToken,accessTokenSecret);

The returned twitter object has a "StatusCode" property, which should be OK (HTTP 200) to indicate that the connection was successful and authorized.

To search for tweets from Elon Musk, we need his Twitter handle, which is shown on his Twitter profile.

We build a query string in the format "from:handle", that is, the handle without the '@', and pass it as the search query. The search command in MATLAB looks like this:

response = search(connection,'from:elonmusk','count',100,'lang','en');

For more information on search queries, you can refer to the documentation: https://developer.twitter.com/en/docs/twitter-api/v1/rules-and-filtering/search-operators.
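The query format described above can be sketched as follows. This is a minimal illustration, not part of the original program: the variable names handle and query are hypothetical, and the final search call is left as a comment because it needs a live, authorized connection.

```matlab
% Build a "from:handle" search query from a handle that may include '@'.
% The variable names here are illustrative assumptions.
handle = "@elonmusk";            % handle as copied from a profile page
handle = erase(handle, "@");     % the search operator expects no '@'
query  = "from:" + handle;       % -> "from:elonmusk"

disp(query)

% With an authorized connection, the query is passed to search as before:
% response = search(connection, query, 'count', 100, 'lang', 'en');
```

Building the query from a variable makes it easy to repeat the same analysis for several accounts in a loop.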

All of our required data is stored in response.Body.Data.statuses.

The "statuses" structure at this path holds data such as the date and time of each tweet, the number of likes, the number of retweets, the mentions, and the hashtags. We can apply cellfun to the fields 'text', 'retweet_count', 'favorite_count', and 'created_at' to retrieve the tweet text, the number of retweets, the number of likes, and the date of each tweet. Hashtags, mentions, and URLs are obtained using for loops, as shown:

% retrieve the tweet text, retweet count, likes and date
tweets = cellfun(@(x) string(x.text), response.Body.Data.statuses);
retweet_count = cellfun(@(x) x.retweet_count, response.Body.Data.statuses);
likes_count = cellfun(@(x) x.favorite_count, response.Body.Data.statuses);
% characters 9:10 of created_at hold the day of the month
tweet_date = cellfun(@(x) str2double(x.created_at(9:10)), response.Body.Data.statuses);

% retrieve hashtags
hashs= cellfun(@(x) x.entities.hashtags, response.Body.Data.statuses,'UniformOutput',false);
for i=1:numel(hashs)   
    if(isstruct(hashs{i,1}))
        hashs{i,1}=hashs{i,1}.text;
    else
        hashs{i,1}='';
    end
end

The mentions, URLs, and hashtags are stored in a separate field called entities, which contains a structure for each of these elements that is present in the tweet. We iterate through entities with for loops, check whether each entry is a structure, and retrieve the data from it. After that, we can plot the retrieved information as graphs to get a better representation of the data.
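The plotting step is not shown in code in this post, so here is one possible sketch. The numbers are mock values standing in for likes_count, retweet_count, and tweet_date, and the choice of bar charts is an assumption rather than the original figure.

```matlab
% Mock values standing in for the vectors extracted above (illustrative only).
likes_count   = [1200; 450; 980; 3100; 760];
retweet_count = [ 300;  90; 210;  820; 150];
tweet_date    = [  12;  12;  13;   14;  14];   % day of month per tweet

% Total likes per day: accumarray sums likes_count over equal day values.
days = unique(tweet_date);
[~, dayIdx] = ismember(tweet_date, days);
likesPerDay = accumarray(dayIdx, likes_count);

figure
subplot(2,1,1)
bar([likes_count retweet_count])          % grouped bars, one group per tweet
legend("Likes", "Retweets")
xlabel("Tweet index"), ylabel("Count")
title("Likes and retweets per tweet")

subplot(2,1,2)
bar(days, likesPerDay)
xlabel("Day of month"), ylabel("Total likes")
title("Likes per day")
```

With real data, the three mock vectors are simply replaced by the ones computed from response.Body.Data.statuses.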

Using functions from the Text Analytics Toolbox

Text Analytics Toolbox™ provides algorithms and visualizations for preprocessing, analyzing, and modeling text data. We can use it to build intuitive, easy-to-understand representations of text data such as word clouds, LDA topics, and sentiment scores.

Word clouds of tweets

A word cloud is an image composed of the words used in a particular text document. The importance or frequency of a word is represented by its size in the image. We can create word clouds in MATLAB using the "wordcloud" function. Here we apply it to a tokenized document, that is, a document represented as a collection of words (also known as tokens) used for text analysis. The text also needs to be preprocessed and cleaned first. We can use functions such as "removeStopWords", "erasePunctuation", and "removeShortWords" to clean the document: they remove stop words like 'to' and 'and', remove punctuation, and remove words of two letters or fewer, respectively.

The following code creates the word cloud of Elon Musk's tweets.

tweetlist = tokenizedDocument(tweets,'DetectPatterns','web-address');
tweetlist = removeStopWords(tweetlist);
tweetlist = erasePunctuation(tweetlist);
tweetlist = removeShortWords(tweetlist,2);
figure
subplot(3,1,1)
wordcloud(tweetlist);
title("Wordcloud of all tweets")

The same can be done for the hashtags and mentions by applying the same functions to the lists of hashtags and mentions. Word clouds are therefore a quick and easy way to get an idea of what someone is tweeting about. In our case, we see that Elon Musk is tweeting about Tesla, WholeMarsBlog, CyberpunkGame, etc.
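For hashtags, a slightly different route than the tokenizing pipeline above is also possible, since each hashtag is already a single word. The sketch below counts the hashtags directly and uses the wordcloud(words, sizes) syntax. The hashs values are mock data in the same shape as the list built earlier, and the variable names tags, uniqueTags, and tagCounts are illustrative assumptions.

```matlab
% Mock hashtag list in the same shape as hashs above: one char vector per
% tweet, empty when the tweet has no hashtag (illustrative values only).
hashs = {'Tesla'; ''; 'SpaceX'; 'Tesla'; ''; 'Starship'; 'Tesla'};

% Drop empty entries, then count how often each hashtag occurs.
tags = string(hashs);
tags = tags(strlength(tags) > 0);
[uniqueTags, ~, idx] = unique(tags);
tagCounts = accumarray(idx, 1);

% Word size in the cloud is proportional to the hashtag count.
figure
wordcloud(uniqueTags, tagCounts);
title("Wordcloud of hashtags")
```

The same pattern should work for a list of mentions built in the same way.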

N-grams

An n-gram is a contiguous sequence of n items (here, words) from a sample text document. When n=2, it represents two consecutive words in a document and is called a bigram. We can use the "bagOfNgrams" function to make a list of bigrams. When n=3, it is known as a trigram. Trigrams can be made in MATLAB using the same "bagOfNgrams" function with the additional parameter 'NgramLengths' set to 3. We can search for tweets from NASA using the command:

response = search(connection,'from:NASA','count',100,'lang','en')

The bigrams present in NASA's tweets can then be displayed in a word cloud.
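A minimal sketch of the bigram and trigram step is given here. The sentences are made up for illustration and stand in for the tweets retrieved from NASA. The preprocessing (lowercasing and stop-word removal) is an assumption about how the text was cleaned.

```matlab
% Illustrative sentences standing in for the retrieved tweets.
tweets = [ ...
    "The space station crew completed a spacewalk today"
    "Watch the space station pass over your city tonight"
    "The crew aboard the space station is studying plant growth"];

docs = tokenizedDocument(lower(tweets));
docs = removeStopWords(docs);

bigrams  = bagOfNgrams(docs);                      % default n-gram length is 2
trigrams = bagOfNgrams(docs, 'NgramLengths', 3);   % n = 3

topBigrams = topkngrams(bigrams, 5);   % table of the most frequent bigrams
disp(topBigrams)

figure
wordcloud(bigrams);
title("Bigrams")
```

Passing trigrams instead of bigrams to wordcloud gives the corresponding trigram cloud.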

LDA Topics

Topic modeling is a type of statistical modeling for discovering the abstract "topics" that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is an example of a topic model. It is used to assign the text in a document to particular topics. We can fit topics and draw their word clouds in MATLAB using the "fitlda" function, here with two topics.

We can infer two topics from the word clouds: 'Kate Rubins' and 'Women Nasa'. These are some of the topics present in the tokenized document, as identified using the fitlda() function.
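The LDA step could look roughly like the sketch below. The tiny corpus is invented for illustration and is far too small for meaningful topics; the cleaning steps and the fixed random seed are assumptions, not details taken from the original program.

```matlab
rng(0)   % LDA fitting is randomised; fix the seed for repeatability

% Tiny illustrative corpus; real use needs far more text than this.
tweets = [ ...
    "astronaut spacewalk outside the space station"
    "astronaut returns from the space station after six months"
    "rocket launch scheduled for the moon mission"
    "moon mission rocket rolls out to the launch pad"];

docs = tokenizedDocument(tweets);
docs = removeStopWords(docs);
bag  = bagOfWords(docs);          % fitlda takes a bag-of-words model

numTopics = 2;
mdl = fitlda(bag, numTopics, 'Verbose', 0);

% One word cloud per topic; word size reflects the word's topic probability.
figure
for k = 1:numTopics
    subplot(1, numTopics, k)
    wordcloud(mdl, k);
    title("Topic " + k)
end
```

The number of topics is chosen by the user, so it is worth trying a few values and checking whether the resulting word clouds are interpretable.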

Twitter sentiment analysis

Twitter sentiment analysis studies how positive or negative an individual tweet is, or a collection of tweets on a topic or from a user. Scores close to 1 indicate positive sentiment, scores close to -1 indicate negative sentiment, and scores close to 0 indicate neutral sentiment. The average sentiment score of several tweets is a useful indicator of the sentiment towards a particular topic, so we can compare the average sentiment of different users and topics.

We see that Bill Gates mostly makes positive tweets. In contrast, the tweets with the hashtag #Corruption have an overall negative sentiment. The tweets with the text 'Olympics' seem to have a positive sentiment as well. The tweets with the hashtag '#Bitcoin' have a high positive sentiment overall, with an average score of more than 0.7. This is consistent with the fact that the bitcoin price has gone up in recent years, particularly in recent months, and has drawn a lot of hype and popularity on social media platforms like Twitter.
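The post does not name the function used for scoring. One option in Text Analytics Toolbox is vaderSentimentScores, which returns a compound score between -1 and 1 per document, matching the scale described above. The sketch below assumes that function and uses made-up sentences in place of real tweets.

```matlab
% Made-up sentences standing in for retrieved tweets.
tweets = [ ...
    "What a great launch, amazing work by the whole team!"
    "This is a terrible decision and a complete disaster."
    "The meeting starts at 10 am."];

docs   = tokenizedDocument(tweets);
scores = vaderSentimentScores(docs);   % one compound score per document, in [-1, 1]

% The average over a set of tweets summarises sentiment for a user or topic.
meanScore = mean(scores);
fprintf("Average sentiment: %.2f\n", meanScore)
```

Repeating this for the tweets returned by each search query gives one average per user or topic, which can then be compared in a bar chart.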

Conclusion

We see that using MATLAB functions to form word clouds, fit LDA topics, and perform sentiment analysis on tweets allows us to quickly analyze and summarize a vast amount of Twitter data. A similar analysis can be performed on other text data as well. In a world where text-based communication is the standard, the volume of textual data produced keeps growing rapidly. The analytics methods implemented here help make sense of this massive archive of textual information. Hence the scope of Twitter and other text-based analyses is likely to keep growing for years to come.
