
World Cup 2010: Analysis of tweets celebrating goals

The 2010 World Cup proved to be one of the most tweeted-about events of 2010.  Using the millions of tweets saved to my local Cloudera CDH3 Hadoop cluster, I wrote a quick Pig script to discover the ways that people are celebrating (ok, spelling) goals.  Here are the top few variations of Goal/Gol.  The full results can be found here.  I’m happy to share the Pig code if anyone is interested.

158636 gol
126669 goal
31722 Goal
24735 Gol
19610 GOL
14317 GOAL
4178 gool
2981 ggol
2219 goll
1771 goool
1641 GOOOL
1564 Gooool
1498 Goool
1279 GOOOOL
1188 Goooool
1158 GOOOOOOOL
1124 GOOOOOL
1116 gooool
1075 GOOL
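
The original Pig script isn't shown here, so as a rough illustration only, this is a minimal Python sketch of the same kind of count. The regular expression and the `count_goal_variants` name are assumptions made for this example, not the pattern the Pig job actually used; it treats each capitalization as a separate variant, as the table above does.

```python
import re
from collections import Counter

# Assumed pattern (not from the original Pig script): one or more g's,
# one or more o's, optional a's, one or more l's, as a whole word.
GOAL_RE = re.compile(r"\bg+o+a*l+\b", re.IGNORECASE)

def count_goal_variants(tweets):
    """Count each exact spelling (case preserved) of goal/gol in the tweets."""
    counts = Counter()
    for text in tweets:
        counts.update(GOAL_RE.findall(text))
    return counts

# Made-up sample tweets, just to exercise the function.
sample = [
    "GOOOOL!!! Spain",
    "gol gol gol",
    "What a Goal by Forlan",
    "golf is boring",  # should not match: "gol" is followed by a letter
]
counts = count_goal_variants(sample)
for word, n in counts.most_common():
    print(n, word)
```

On a real cluster the same idea would run as a map step (extract matches per tweet) followed by a group-and-count, which is roughly what a Pig script would express.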

Here are a few rough visualizations; they are in need of an update. The first is a scatter plot of the length of each GOAL against the number of Os it contains. You’ll see that a lot of excited soccer fans like to use the full 140 characters in their celebrations, often using 138 Os. Huge thanks to @johnmyleswhite for the inspiration and the R/ggplot2 help. You don’t even want to see my original version!

Number of Os in GOAL vs GOAL length
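
The plot itself was made with R/ggplot2, and that code isn't included here. As a hedged sketch of how the two plotted quantities could be derived for each celebration, here is a small Python helper; `goal_shape` is a name made up for this example.

```python
def goal_shape(word):
    """Return (total length, number of o/O characters) for one celebration."""
    return len(word), word.lower().count("o")

# A celebration that fills a whole 140-character tweet: G + 138 Os + L.
full_tweet = "G" + "O" * 138 + "L"

# Each tuple is one (x, y) point for a length-vs-Os scatter plot.
points = [goal_shape(w) for w in ["gol", "goal", full_tweet]]
print(points)
```

This makes the 140/138 relationship mentioned above concrete: with a single G and a single L, a 140-character celebration leaves room for exactly 138 Os.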

Inspired by @peteskomoroch, who was in turn inspired by the frequencies of the lengths of mentions of KAAAAAAHN!, here are the frequencies of the lengths of GOAL! celebrations found on Twitter.

A visualization of the hashtag mentions by country across all of the World Cup tweets.  Click for full-sized version.

Counts of country hashtags

And my personal favorite visualization: the number of tweets mentioning vuvuzela, hour by hour, over the course of the World Cup.

Vuvuzela tweets per hour
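
For the curious, an hourly count like this can be sketched in a few lines of Python. This is an illustration only, not the code behind the chart: it assumes each tweet is available as a `(created_at, text)` pair with `created_at` already parsed into a `datetime`, and `vuvuzela_per_hour` is a hypothetical name.

```python
from collections import Counter
from datetime import datetime

def vuvuzela_per_hour(tweets):
    """Count tweets mentioning 'vuvuzela', bucketed by the hour they were sent."""
    counts = Counter()
    for created_at, text in tweets:
        if "vuvuzela" in text.lower():
            # Truncate the timestamp to the start of its hour.
            hour = created_at.replace(minute=0, second=0, microsecond=0)
            counts[hour] += 1
    return counts

# Made-up sample data, assumed to be (datetime, text) pairs.
sample_tweets = [
    (datetime(2010, 6, 11, 14, 5), "That VUVUZELA noise never stops"),
    (datetime(2010, 6, 11, 14, 47), "vuvuzela vuvuzela vuvuzela"),
    (datetime(2010, 6, 11, 15, 2), "Great save!"),
    (datetime(2010, 6, 11, 15, 30), "I want a Vuvuzela"),
]
hourly = vuvuzela_per_hour(sample_tweets)
for hour in sorted(hourly):
    print(hour.isoformat(), hourly[hour])
```

Note that this counts tweets, not mentions, so a tweet that says vuvuzela three times still counts once.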

