
Twifficiency scores, analyzed and visualized

While I’ve had some success getting a few celebrities to respond to or show off @TheBotLebowski (Fred Durst, Talib Kweli), yesterday Twifficiency one-upped me and took Twitter and then the national media by storm. Fortunately for you, @jamescun, not too many people I know read your little Time Magazine. (I really hope you’re old enough to get that!)

As some of you may know, I aggregate Twitter data and then use tools such as Python, Hadoop, Pig, and R to play with the results. Today’s task was easy: look through yesterday’s tweets, grab the Twifficiency auto-tweets (eeew), extract the scores, and then see if there are any interesting results.

After all of yesterday’s tweets ran through my parser, I filtered the input data down to tweets that looked like Twifficiency scores (20100817.txt is the file of yesterday’s parsed tweets that gets loaded into HDFS):

grep "My Twifficiency score is [0-9][0-9]*%\. Whats yours? http://twifficiency\.com/$" 20100817.txt > twif.out
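For anyone who would rather stay in Python, here is a rough sketch of the same filter. It mirrors the grep pattern, with the literal dots and question mark escaped for Python's regex syntax. The function names and the sample line are made up for illustration and are not part of my actual pipeline.

```python
import re

# Matches the auto-tweet text at the end of a line and captures the digits before the % sign.
TWIF_RE = re.compile(
    r"My Twifficiency score is ([0-9]+)%\. Whats yours\? http://twifficiency\.com/$"
)

def is_twifficiency_tweet(line):
    """Return True if the line ends with the Twifficiency auto-tweet text."""
    return TWIF_RE.search(line.rstrip("\n")) is not None

def filter_tweets(lines):
    """Yield only the lines that look like Twifficiency auto-tweets."""
    for line in lines:
        if is_twifficiency_tweet(line):
            yield line

# Hypothetical tab-separated lines in the (id, timestamp, user, tweet) layout.
sample = [
    "1\t2010-08-17 12:00:00\tsomeuser\tMy Twifficiency score is 42%. Whats yours? http://twifficiency.com/\n",
    "2\t2010-08-17 12:00:01\totheruser\tJust had lunch\n",
]
kept = list(filter_tweets(sample))
```

With the sample above, only the first line survives the filter.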

I ran twif.out through some horrible-before-the-coffee-is-even-made Python code to produce a few summary statistics and a file containing just the raw Twifficiency scores:

#!/usr/bin/python
import re
import numpy

scores = {}
scorelist = []
# populate a dict with keys 0-100 to cover every possible twifficiency score
for i in range(101):
  scores[i] = 0

# load an extracted sample of my twitter data and
# open a file for the raw scores
f = open('scores.txt', 'w')
for line in open('twif.out'):

  (tweet_id, ts, user, tweet) = line.strip().split('\t')

  # extract the numeric score from the tweet; matching every digit before
  # the % sign keeps a score of 100 from being truncated to 10
  thescore = re.search('score is ([0-9]+)%', tweet)
  scoreval = int(thescore.group(1))
  scorelist.append(scoreval)

  #  eventually do something with the timestamp, etc.
  #  print("%s\t%s\t%s" % (user, ts, scoreval))
  scores[scoreval] += 1

  # write the score to the raw data file
  s = "%s\n" % scoreval
  f.write(s)

a = numpy.array(scorelist)
print(numpy.size(a))     # number of scores
print(numpy.std(a))      # standard deviation
print(numpy.average(a))  # mean
f.close()

Now, the good stuff. Out of 7,089 Twifficiency tweets from yesterday (the gardenhose has been severely throttled lately), the scores range from 0 to 99. I think I remember @jamescun mentioning that the max score is 100%, but I haven’t seen it yet. The mean Twifficiency score is 38.5285 and the standard deviation is 11.1036. The score at the 25th percentile is 32, the median score is 39, and the score at the 75th percentile is 47.
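As a sketch of how the count, mean, standard deviation, and quartiles can be computed in one place, here is a version that uses only Python's standard library. The `summarize` helper and the sample list are made up for illustration, and the "inclusive" quantile method is an assumption; other tools interpolate percentiles slightly differently, so results on real data may differ by a point.

```python
import statistics

def summarize(scores):
    """Return count, mean, population std dev, and quartiles for a list of scores."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),
        # population standard deviation, which is also numpy.std's default
        "std": statistics.pstdev(scores),
        "q1": q1,
        "median": median,
        "q3": q3,
    }

# Made-up scores, not the real data.
summary = summarize([1, 2, 3, 4, 5])
```

To run it on the real data, read scores.txt into a list of ints and pass that list to `summarize`.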

After loading the scores into R and plotting a histogram, they seem to follow a pretty normal distribution (update: they don’t; check out @johnmyleswhite’s comment below):

Distribution of Twifficiency scores
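The plot itself came out of R. As a rough stand-in, here is a minimal Python sketch that bins the scores and prints a text histogram. The bin width of 10, the helper names, and the sample scores are all assumptions chosen for illustration.

```python
from collections import Counter

def histogram(scores, bin_width=10):
    """Count scores per bin; each key is the lower edge of its bin."""
    counts = Counter((s // bin_width) * bin_width for s in scores)
    return dict(sorted(counts.items()))

def render(hist, bin_width=10):
    """Draw one row of '#' marks per non-empty bin."""
    rows = []
    for low, n in hist.items():
        rows.append("%3d-%3d | %s" % (low, low + bin_width - 1, "#" * n))
    return "\n".join(rows)

# Made-up scores, not the real data.
hist = histogram([5, 12, 17, 38, 39, 41])
print(render(hist))
```

With thousands of scores you would want to scale the bar lengths down, for example one mark per 50 tweets.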

Although I consider myself a prominent auto-tweeter (@TheBotLebowski @HelloooooNewman @ACenterForAnts), I’m not crazy about the idea of having an app send a tweet without permission. Let’s hope @jamescun fixes this or at least explains it more clearly. Having said that, I love seeing this type of story and wish James the best, even if he is emo (just kidding, James!!).

edit: the raw twifficiency score data may be found here.

2 Comments

  1. This is very cool. I think your distribution looks asymmetric enough that it’s not normal. I’d try a K-S test as follows in R:

    scores <- load.data()

    m <- mean(scores)
    s <- sd(scores)

    ks.test(scores, 'pnorm', m, s)

    See this page for more info: http://sekhon.berkeley.edu/stats/html/ks.test.html

    Posted on 18-Aug-10 at 4:58 am | Permalink
  2. admin

    Thanks for the comment. As soon as I hit post, I knew you would chime in! I appreciate it.
    The ks.test() returns

    One-sample Kolmogorov-Smirnov test

    data: scores$score
    D = 0.0616, p-value < 2.2e-16
    alternative hypothesis: two-sided

    Posted on 18-Aug-10 at 5:02 am | Permalink
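For readers without R handy, the check from the comments can be sketched in plain Python. This computes only the K-S statistic D against a normal distribution fitted with the sample mean and standard deviation; it does not reproduce the p-value that R's ks.test() reports. The function name and the sample data are made up for illustration.

```python
import statistics

def ks_statistic_normal(scores):
    """One-sample K-S statistic D against a normal fitted to the sample mean and sd."""
    data = sorted(scores)
    n = len(data)
    dist = statistics.NormalDist(statistics.mean(data), statistics.stdev(data))
    d = 0.0
    for i, x in enumerate(data):
        cdf = dist.cdf(x)
        # the empirical CDF steps from i/n up to (i + 1)/n at x, so check both sides
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

# Made-up scores, not the real data.
d = ks_statistic_normal([1, 2, 3, 4, 5])
```

As a rule of thumb, the large-sample 5% critical value for D is about 1.36 / sqrt(n), which is roughly 0.016 for n = 7,089. The D = 0.0616 reported above is far beyond that, in line with the tiny p-value. Because the mean and standard deviation are estimated from the same data, that threshold is only approximate.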
