Just sharing stuff…
[link to updated post here]
Around lunch earlier today, I finally succumbed to the suggestion of my good friend Chris to look at the tweets related to The Voice Kids Philippines. Tonight is the finale, if I am not mistaken [Edit: Okay, I was mistaken. They are still performing tomorrow. I guess I am not shutting down my MBA tonight.].
Anyway, I started collecting tweets at around 12:30PM today. I am still collecting, so the results discussed here only cover tweets collected up to around 7PM, as shown in the plot below. The plot shows the number of tweets posted in each 30-minute interval over that period.
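For anyone who wants to reproduce the 30-minute counts, here is a minimal sketch of the binning step. It is not the exact code behind the plot: the function name `bin_tweets` is made up for illustration, it assumes each tweet's timestamp has already been parsed into a `datetime`, and it assumes the bin width divides 60 evenly.

```python
from collections import Counter
from datetime import datetime


def bin_tweets(timestamps, minutes=30):
    """Count timestamps per fixed-width interval, keyed by interval start.

    Assumes `minutes` divides 60 evenly (e.g. 10, 15, 30).
    """
    counts = Counter()
    for ts in timestamps:
        # Floor each timestamp to the start of the interval it falls in.
        floored_minute = (ts.minute // minutes) * minutes
        start = ts.replace(minute=floored_minute, second=0, microsecond=0)
        counts[start] += 1
    # Sort by interval start so the result is ready for plotting.
    return dict(sorted(counts.items()))


# Hypothetical timestamps on an arbitrary date, for illustration only.
sample = [
    datetime(2014, 1, 1, 12, 31),
    datetime(2014, 1, 1, 12, 59),
    datetime(2014, 1, 1, 13, 0),
]
for start, n in bin_tweets(sample).items():
    print(start.strftime("%H:%M"), n)
```

Keying each bin by its start time keeps the output easy to pass straight to a bar or line plot.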
The top ten most-used hashtags across the collected tweets are shown below, together with their counts. Note that I had to do some pre-processing first: hashtags that refer to the same entity were grouped into one. For example, #godarren and #teamdarren were both “re-hashed” as #darren.
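The “re-hashing” and counting can be sketched roughly like this. Only the #godarren/#teamdarren to #darren mapping comes from the actual pre-processing; the `ALIASES` table, the regex, and the function name `top_hashtags` are assumptions for illustration. Every hashtag occurrence is counted, even when one tweet carries several.

```python
import re
from collections import Counter

# Illustrative alias table. Only these two entries are taken from the post;
# a real table would need one entry per variant hashtag.
ALIASES = {"#godarren": "#darren", "#teamdarren": "#darren"}

# Simple hashtag pattern (an assumption): '#' followed by word characters.
HASHTAG_RE = re.compile(r"#\w+")


def top_hashtags(tweets, aliases=ALIASES, n=10):
    """Return the n most common hashtags after merging aliases."""
    counts = Counter()
    for text in tweets:
        # Lowercase first so #GoDarren and #godarren are treated alike.
        for tag in HASHTAG_RE.findall(text.lower()):
            counts[aliases.get(tag, tag)] += 1
    return counts.most_common(n)


# Made-up tweet texts, for illustration only.
sample = ["Go #GoDarren! #TeamDarren", "#darren #lyca", "no tags here"]
print(top_hashtags(sample))
```

Because each occurrence is counted, a single tweet with two Darren hashtags adds two to #darren. Counting each tweet at most once per entity would be a small change (collect the tags into a set per tweet before updating the counter).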
My friend and I wanted to know if data collected from Twitter (without firehose access) on TVKids Philippines can somehow be used to forecast the outcome of the finale. Will the tweets translate into votes? I guess we’ll find out tomorrow. In the bar plot, it can be seen that #darren has the highest occurrence. This is a bit tricky though, in our opinion. For Darren, the count may actually have been driven by the news/gossip/buzz that if he wins, he will “donate the money to the church and the house to Lyca.” If so, the mentions (likely) cannot be attributed solely to his performance. In addition, a tweet can contain several hashtags with the finalists’ names in them, and I am counting all of them at the moment. Finally, it should be noted that this inspection did not consider the “sentiments” of the tweets, that is, whether they are positive or negative.
On a different note, only about 3% of the collected tweets are geotagged, and here is a visualization of that 3%. I will try to cluster the tweets by region to see which names are more prominent in each region.
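Here is a rough sketch of how the geotagged share can be computed. It assumes each tweet is a dict parsed from the Twitter API JSON, where the `coordinates` field is either a GeoJSON point or null; the function name and the sample data are made up.

```python
def geotagged_share(tweets):
    """Return (geotagged tweets, fraction of all tweets that are geotagged)."""
    # A tweet counts as geotagged when its "coordinates" field is present
    # and not null (assumed tweet layout; adjust to your own data).
    tagged = [t for t in tweets if t.get("coordinates")]
    share = len(tagged) / len(tweets) if tweets else 0.0
    return tagged, share


# Made-up tweets, for illustration only.
sample = [
    {"text": "a", "coordinates": {"type": "Point", "coordinates": [121.0, 14.6]}},
    {"text": "b", "coordinates": None},
    {"text": "c"},
    {"text": "d", "coordinates": None},
]
tagged, share = geotagged_share(sample)
print(len(tagged), f"{share:.0%}")
```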
IMO, it is also interesting to see the retweet network (who’s retweeting who?). Where do people typically get their information on TVKids Philippines? How big is their network? The network below is a weighted directed graph (digraph). Bigger nodes represent people whose tweets were retweeted more. As expected, those who follow The Voice Kids Philippines are also following @TheVoiceABSCBN (red node) on Twitter. The two other prominent nodes (orange) in the retweet network are @VoiceKidsUpdate (lower-left) and @MadamCharo (mid-right). I wanted to check who/what @MadamCharo is and why she appears to be a hub in the network. My guess was that this comes from just one of her more recent tweets, and on checking, it does (as of this writing). The account is a parody account of ABS-CBN’s Ms. Charo Santos.
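A weighted retweet digraph like this can be built with plain counters, as in the minimal sketch below. This is not the code behind the figure. It assumes the retweets have already been reduced to (retweeter, original author) pairs, and that edges point from the retweeter to the author, so a node's size corresponds to its total incoming weight. The `user_a` and `user_b` accounts are made up.

```python
from collections import Counter


def build_retweet_network(retweets):
    """Build a weighted digraph from (retweeter, original_author) pairs.

    Returns (edges, in_strength):
      edges       -- Counter mapping (retweeter, author) to the edge weight
      in_strength -- Counter mapping author to total times retweeted
    """
    # Edge weight = how many times this retweeter retweeted this author.
    edges = Counter(retweets)
    in_strength = Counter()
    for (_retweeter, author), weight in edges.items():
        in_strength[author] += weight
    return edges, in_strength


# Made-up retweet pairs, for illustration only.
sample = [
    ("user_a", "TheVoiceABSCBN"),
    ("user_a", "TheVoiceABSCBN"),
    ("user_b", "TheVoiceABSCBN"),
    ("user_b", "MadamCharo"),
]
edges, in_strength = build_retweet_network(sample)
# Accounts sorted by how often they were retweeted (biggest nodes first).
print(in_strength.most_common())
```

The `edges` counter can be loaded into a graph library for layout and drawing. Ranking `in_strength` is enough to spot the hubs.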
I also noticed that none of the tweets from the official Twitter accounts of the finalists were retweeted as much as those from these top three accounts.
Super, super, important things to consider when reading this post: