Analysis of #FNCE tweets

The Academy of Nutrition and Dietetics’ (AND) annual Food & Nutrition Conference & Expo (FNCE) is a huge conference with an estimated 10,000+ in attendance, and many people were tweeting about it under the hashtag #FNCE. So I thought collecting the tweets and running some basic analyses might reveal some interesting things. If others want to poke around, I uploaded the dataset I collected here.

Methods

Unfortunately, I didn’t have the foresight to write a script to grab tweets directly from the #FNCE hashtag at the start of the conference, and because there is a limit to how far back you can search for tweets, I had to go about it in a very inefficient way. First, I wrote a Python script that gathered the Twitter handles of those posting to the #FNCE hashtag every 10 minutes, starting at 0300 CST on Saturday, October 7th and running through October 10th, after the end of the conference (the conference took place October 6-9). This should capture the vast majority of tweeters, missing only those who tweeted or retweeted before 0300 on Saturday and then did not tweet for the rest of the conference. However, 40% of people posting to the #FNCE hashtag between the official conference start and end times did so only once, so some one-time tweeters from before the collection started were probably missed. Using another Python script, I removed accounts that had 0 followers and accounts that I could not access (suspected spammers; there were 132(!)). Finally, another script searched the last 400 tweets in each user’s timeline for instances of the #FNCE hashtag. This took about 37 hours in total (it had to scan over half a million tweets, using 3 accounts to get additional API access; sorry, Twitter!). For each tweet containing the hashtag, this script wrote the text, tweet ID, datetime, whether it was a retweet, and links, along with the user’s bio, website, following and follower counts, listed count, total tweet count, and location, to a CSV file, which I then analyzed and graphed in R.
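
As a rough illustration of that last step (not the script actually used here), the hashtag search over one user's timeline and the CSV export could be sketched as below. The tweet dictionaries, their field names, and the handle example_rd are made up for the example, and fetching timelines from the Twitter API is left out entirely.

```python
import csv
import io
import re

# Match the hashtag as a whole token, in any letter case (e.g. #fnce, #FNCE).
HASHTAG_RE = re.compile(r"#fnce\b", re.IGNORECASE)

# Hypothetical column names; the real CSV held more fields (bio, location, ...).
FIELDS = ["id", "handle", "is_retweet", "text"]


def tweets_with_hashtag(timeline):
    """Return the tweets in one user's timeline that mention the hashtag."""
    return [tweet for tweet in timeline if HASHTAG_RE.search(tweet["text"])]


def write_csv(tweets, out):
    """Write the selected tweets to an open text file as CSV rows."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(tweets)


# Made-up timeline standing in for one user's most recent tweets.
timeline = [
    {"id": 1, "handle": "example_rd", "is_retweet": False,
     "text": "Great session on protein this morning! #FNCE"},
    {"id": 2, "handle": "example_rd", "is_retweet": False,
     "text": "Lunch break."},
    {"id": 3, "handle": "example_rd", "is_retweet": True,
     "text": "RT @someone: see you at the expo #fnce"},
]

matches = tweets_with_hashtag(timeline)
buffer = io.StringIO()  # stands in for a real file opened with open(..., "w")
write_csv(matches, buffer)
print(buffer.getvalue())
```

The word boundary after the hashtag keeps longer tags that merely start with the same letters from being counted.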

Results

After restricting tweets to those posted after the official start of the conference up until the official end of the conference, there were 14512 total tweets (including old and new style retweets) from 1570 different people.

Here are the top 20 people who posted to the #FNCE hashtag (includes all retweets by the person):

 Handle Tweet Count
TheScarletRD 278
KristinaLaRueRD 261
DelMontebrand 249
LeanGrnBeanBlog 249
andybellatti 213
IamEatonWright 195
ScritchfieldRD 163
MicheleRSimon 149
JBraddockRD 147
GeneralHealthy 137
ashleyrdtx 130
LeahMcGrathRD 130
RDamber 130
SlimmingWorldSt 127
EatRightPIA 122
jessieclairecox 116
DietitianSherry 106
DianaKRice 101
ilivewell 100
Christinekw 99

Sore thumbs?

The mean, median, and standard deviation show that most actually did not tweet a lot:

Mean Median SD
9.2 2 22.3
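
For anyone wanting to reproduce these three numbers from the dataset, a minimal sketch follows. It assumes one handle per tweet row; the handles are placeholders. The mean is simply total tweets divided by tweeters (14512 / 1570 ≈ 9.2), and a mean far above the median is what you would expect when a few heavy tweeters pull the average up.

```python
import statistics
from collections import Counter

# Placeholder data: one handle per collected tweet.
handles = ["a", "a", "a", "a", "b", "c", "c", "d"]

# Number of tweets per person.
per_user = list(Counter(handles).values())

mean = statistics.mean(per_user)
median = statistics.median(per_user)
# Sample standard deviation (n - 1 denominator), the same convention as R's sd().
sd = statistics.stdev(per_user)

print(mean, median, round(sd, 2))
```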

Similarly, here are the top retweets, each retweeted at least 10 times. This is an imperfect measure, as retweets that add a message before the “RT” aren’t counted; this could probably be quantified more precisely by checking whether a certain percentage of a tweet matches another. The top retweets are dominated by a small group of people:

 Tweet Count
RT @Fooducate: Sugary drinks are the no 1 individual source of calories in the American diet #FNCE #obesity 40
RT @eatrightFNCE: Good news, #FNCE… we’ve got the hashtag trending again! Number two nationally on Twitter. WAY TO GO! t.co/FKn 37
RT @andybellatti: “I’m so confused about nutrition! Thank goodness for Coca-Cola” #FNCE t.co/GJK3k7hV 27
RT @Fooducate: CDC’s annual national budget for nutrition education is $45M. That’s less than the budget for Pop Tarts. #FNCE #Education … 22
RT @Fooducate: “Healthy eating: education cannot by itself counteract convenience, pricing, culture, & marketing” #FNCE 19
RT @anniehauser: 20.7% of adults meet federal exercise guidelines, but less than 1% have an “ideal diet” for #heartdisease prevention #fnce 17
RT @GeneralHealthy: If 80% of food consumed in America is unhealthy…”everything in moderation” doesn’t work! #FNCE 17
RT @RobertaAnding: For maximum muscle protein synthesis consume 30 grams of protein per meal. More isn’t better! #GoodToKnowSCAN #fnce 17
RT @andybellatti: Remember — calorie counts are Big Food’s favorite distractor. Shift attention from heinous ingredient lists. #FNCE 15
RT @DietitianSherry: Tip: I freeze pured pumpkin in ice cube trays, then can add to oatmeal. #FNCE 15
RT @andybellatti: Jess Kolko: “We don’t eat enough real food”. Someone hold me back… I feel like shouting “AMEN!” #FNCE 13
RT @Fooducate: “I don’t want McDonalds to do nutrition education to kids. They should stick to selling burgers” #FNCE 13
RT @andybellatti: Sadly missing from MyPlate messaging — “cook real food at home more often”. #FNCE Guess no sponsor wants to say that! 12
RT @eatontherun: Just because its “natural” doesn’t mean it’s good for you….snake venom is “natural” #fnce 12
RT @joyofnutrition: We should say “move more, eat smarter” not “move more, eat less” – Dr. James Hill at #FNCE 12
RT @ashleyrdtx: 50% of infants in the US paricipate in the WIC program, which mean they are born into poverty. #fnce 11
RT @davidgrotto: Much discussion of “food intolerance” at #fnce. Wish there was a session on media’s intolerance of sound science. Pray … 11
RT @KarenAnselRD: @joyofnutrition: We should say “move more, eat smarter” not “move more, eat less” – Dr. James Hill at #FNCE 11
RT @mollymorganrd: Stat: 20% of pounds gained in 1997-2007 is attributed to sugar sweetened beverages! #fnce 11
RT @eatrightFNCE: #FNCE is trending on Twitter! Way to go all. cc: @eatrightmembers 10
RT @elisazied: So stoked to be at a #fnce tweetup…RT, so we can trend worldwide!!!! #rdtweetup 10
RT @TeamNutrition: @TeamNutrition debuts new #MyPlate lessons 4 kids at #FNCE t.co/MIyCNZzh #schoolfoodsrule 10
RT @tobyamidor: Amazing! RT@bigredmannyp: Speaking of hash tags @eatsmartbd @tobyamidor @eatrightFNCE #fnce is the no. 2 TT now! http:/ … 10
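
The "certain percentage of the tweet is the same" idea mentioned above could be sketched like this. It was not part of the analysis here; it uses Python's difflib on word lists, and the 0.8 threshold and the function names are arbitrary choices for illustration.

```python
from difflib import SequenceMatcher


def coverage(original, candidate):
    """Fraction of the original tweet's words that also appear, in order, in the candidate."""
    a = original.lower().split()
    b = candidate.lower().split()
    if not a:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(a)


def is_probable_retweet(original, candidate, threshold=0.8):
    """Treat the candidate as a retweet if it reproduces most of the original."""
    return coverage(original, candidate) >= threshold


original = ("Sugary drinks are the no 1 individual source of calories "
            "in the American diet #FNCE #obesity")
# A retweet that adds a comment in front of the "RT", which exact matching misses.
commented = "So true! RT @Fooducate: " + original
unrelated = "Heading to the expo hall for breakfast #FNCE"

print(is_probable_retweet(original, commented))
print(is_probable_retweet(original, unrelated))
```

Comparing words rather than characters keeps unrelated tweets from scoring high just because they share common letters.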

There were more occurrences of “New-Style Retweets” (retweeting without adding the RT in front of the message) than “Old-Style”:

Old-Style Retweets New-Style Retweets
1287 3400
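
One way to separate the two kinds of retweets is sketched below. It assumes each row has a retweet flag from the API (the "if it was a retweet" column) and treats any unflagged tweet containing "RT @user" as old-style; the is_retweet field name and the sample tweets are made up.

```python
import re
from collections import Counter

# "RT" followed by a mention, with or without a space (e.g. "RT @user", "RT@user").
OLD_STYLE_RE = re.compile(r"\bRT\s*@\w+", re.IGNORECASE)


def retweet_style(tweet):
    """Label a tweet as a new-style retweet, an old-style retweet, or an original."""
    if tweet["is_retweet"]:
        return "new-style"
    if OLD_STYLE_RE.search(tweet["text"]):
        return "old-style"
    return "original"


# Made-up rows for illustration.
tweets = [
    {"is_retweet": True, "text": "#FNCE is trending on Twitter!"},
    {"is_retweet": False, "text": "Amazing! RT@example_one: #fnce is the no. 2 TT now!"},
    {"is_retweet": False, "text": "RT @example_two: More isn't better! #fnce"},
    {"is_retweet": False, "text": "Heading to the expo #FNCE"},
]

styles = Counter(retweet_style(t) for t in tweets)
print(styles)
```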

Because we have all the tweets, whether each was an “old-style” or “new-style” retweet, the date of each, and information like follower and friend counts for each user, we could do some fairly complex analysis to visualize interactions. But that will have to wait for another time.

But since I collected the home location of each person who tweeted with the hashtag (if they had one listed in their profile), I wrote a Python script to geocode these into coordinates using the Google Maps API and plotted them on a map with R. Now we can visualize where people are from:

Here is only the US map:

 

Next I checked for frequent terms, those appearing at least 150 times, in tweets stripped of usernames, hashtags, and links (all retweets included). In alphabetical order:

[1] “academy” “amazing” “awesome” “blog” “blogging” “booth” “breakfast” “calories” “check”
[10] “conference” “dairy” “day” “diet” “dietitians” “eat” “eating” “education” “energy”
[19] “excited” “exercise” “expo” “fat” “fnce” “food” “foods” “forward” “free”
[28] “fun” “getting” “grains” “health” “healthy” “help” “hope” “industry” “info”
[37] “kids” “krieger” “learn” “look” “looking” “loss” “love” “loved” “media”
[46] “meet” “meeting” “milk” “morning” “nutrition” “obesity” “people” “philly” “protein”
[55] “rdchat” “rds” “recipe” “reduce” “research” “session” “social” “stop” “sugar”
[64] “talk” “thank” “thanks” “time” “tips” “tweets” “twitter” “usa” “veggies”
[73] “weight” “world”

Here are the top 10 words with the number of times they are mentioned:

food 1524
session 906
nutrition 830
booth 770
rds 610
love 526
health 514
eat 502
day 442
people 430
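
The counts above came from R, but the same kind of frequent-term check can be sketched in a few lines of Python. The stopword list here is a tiny stand-in and the tweets are invented; the cleaning rules (drop links, usernames, and hashtags, then lowercase) follow the description above.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "at", "to", "of", "and", "is", "for", "in", "on", "rt"}


def tokens(text):
    """Strip links, usernames, and hashtags, then return lowercase words."""
    text = re.sub(r"https?://\S+|\bt\.co/\S+", " ", text)  # links
    text = re.sub(r"[@#]\w+", " ", text)                   # usernames and hashtags
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def frequent_terms(tweets, min_count):
    """Return terms that occur at least min_count times across all tweets."""
    counts = Counter(word for tweet in tweets for word in tokens(tweet))
    return {term: n for term, n in counts.items() if n >= min_count}


# Made-up tweets for illustration.
tweets = [
    "Loved the food at the expo booth #FNCE http://t.co/abc",
    "RT @example_rd: Great food session this morning #fnce",
    "Food, food, food! @example_rd",
]

print(frequent_terms(tweets, min_count=2))
```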

We can visualize associations between some of these terms. Lots of positivity sprinkled in there.

A word cloud isn’t that helpful in this case, as most of the terms except for fnce are mentioned at similar frequencies:

We can dig deeper and examine correlations between specified terms. While following the tweet stream myself, I noticed it was often very conflicted over the conference’s sponsors, so I queried which terms were most associated with a few relevant keywords. The number under each term is its correlation with the queried term.
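
To make "correlation with the queried term" concrete, here is a minimal sketch that assumes the association is a Pearson correlation between per-tweet term counts, which is how such term-association functions commonly work. It is not the R code used for the results here, and the tokenized tweets are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def associations(docs, term, min_corr):
    """Terms whose per-tweet counts correlate with `term` at or above min_corr."""
    target = [doc.count(term) for doc in docs]
    vocabulary = sorted({word for doc in docs for word in doc} - {term})
    result = {}
    for word in vocabulary:
        r = pearson(target, [doc.count(word) for doc in docs])
        if r >= min_corr:
            result[word] = round(r, 2)
    return result


# Made-up, already tokenized tweets.
docs = [
    ["sponsors", "gimmicks", "booth"],
    ["sponsors", "gimmicks"],
    ["food", "session"],
    ["food", "booth"],
]

print(associations(docs, "sponsors", min_corr=0.2))
```

A term that always shows up alongside the queried term, and never without it, scores 1.0; a term that is spread evenly across tweets with and without it scores 0.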

“sponsors”

sponsors gimmicks controls ties reasons
1.00 0.44 0.35 0.26 0.21

“industry”

industry public margo publichealth
1.00 0.28 0.21 0.20

(Margo was a speaker at a relevant session)

“corporate”

corporate elated ago tiny minority
1.00 0.55 0.40 0.37 0.35
yrs 414 415 critical influence
0.35 0.30 0.30 0.29 0.28
715 healththroughfood minori moral ties
0.26 0.25 0.25 0.25 0.23
815
0.21

The high association with “elated” is deceiving; in context it came from a popular quote about being “elated” at critical tweets about corporate sponsors. The 3-digit numbers come from tweets/retweets about a session that included times and room numbers.

“healthy”

This had only 1 term with a correlation above 0.20:

healthy counteract
1.00 0.21

And it appears to be from a popular tweet/retweet:

@Fooducate: “Healthy eating: education cannot by itself counteract convenience, pricing, culture, & marketing” #FNCE

“GMO”

I was disappointed to read that there was an anti-GMO session (edit: “non GMO event” per tweets) (edit 2: this event was off-site and not part of the conference), so I was curious if this was reflected in the tweets. Indeed:

gmo prop37 nongmo nogmo ensure
1.00 0.37 0.33 0.28 0.27
error voters retailer prop informed
0.25 0.25 0.23 0.22 0.21
yeson37 affair choose isolates kraft
0.21 0.20 0.20 0.20 0.20
protecting voter
0.20 0.20

 

So there appears to be a trend of dislike toward sponsorship, but this could reflect my own bias in selecting the terms. To add some objectivity, we can try a sentiment analysis of all tweets and see how positive or negative they are. This scores each tweet by the number of positive or negative words it matches (+1 or -1 per word); words are matched against the opinion lexicon here, and the method is based on this great example.
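
The scoring rule itself is simple enough to sketch. The two word sets below are tiny made-up stand-ins for the opinion lexicon, and the tweets are invented; only the counting rule (one point per positive word, minus one per negative word) comes from the description above.

```python
import re

# Tiny stand-in lexicons; the real opinion lexicon has thousands of words.
POSITIVE = {"love", "great", "amazing", "awesome", "good"}
NEGATIVE = {"sadly", "confused", "unhealthy", "bad"}


def sentiment_score(text):
    """Positive-word matches minus negative-word matches for one tweet."""
    words = re.findall(r"[a-z]+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


# Made-up tweets for illustration.
tweets = [
    "Love this amazing session! #FNCE",
    "Sadly missing from the messaging #FNCE",
    "Heading to the expo #FNCE",
]

scores = [sentiment_score(t) for t in tweets]
print(scores, sum(scores) / len(scores))
```

A score of 0 means either no lexicon words matched or the positive and negative matches cancelled out, which is one reason averages from this method sit close to zero.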

The overall sentiment average for all tweets is positive at 0.52. Then, I took a subset of only tweets that included any of the following terms: sponsor, sponsors, industry, corporate, fund, funder, funding. This resulted in 329 tweets which I ran through the sentiment analysis separately. These were lower on average (0.38), though when expressed as ratios of very positive to very negative, the differences are small. I also included sentiment on tweets containing “GMO” and “Philly” (it seems people enjoyed the location).

All tweets: average sentiment 0.52 (SD = 1.1) (14512 tweets); ratio of very + to very –: 84
Subset of terms on companies: average sentiment 0.38 (SD = 1.1) (329 tweets); ratio of very + to very –: 81
Subset of “GMO” tweets: average sentiment 0.38 (SD = 1) (21 tweets); ratio of very + to very –: (too few very + or very –)
Subset of “Philly” tweets: average sentiment 0.75 (SD = 0.75) (33 tweets)

So together these results may suggest a general disapproval of corporate sponsorship, confirming what I observed while watching the stream. This may have been driven by a small group of individuals; here are the top 20 people and tweet counts for the subset of tweets/retweets containing the corporate/funding terms I chose (the top 20 people tweeted 24% of this subset):

Handle Count
MicheleRSimon 29
andybellatti 15
theaspiringrd 13
GeneralHealthy 12
ashleyrdtx 9
CureT1Diabetes 7
Fooducate 7
lekkerwijn 7
AnAppleADayRD 5
HEALingFoodie 5
NutritionistaRD 5
nyshepa 5
ABenderRD 4
FCPDPG 4
HAEScoach 4
jaciwestbrook 4
JessShapiroRD 4
kellymoltzen 4
ChristysChomp 3
DianaKRice 3

I was hoping there would be enough mentions of each of the major sponsors of the AND to compare to the results of this study, but many were not talked about.

 

Here is a cool visual: frequency of tweets per 5 minutes over the conference period. Note that this is in UTC, which is 4 hours ahead of local conference time (EDT, since the conference was in October). Perhaps deeper analysis could show which specific sessions generated the spikes. It would be interesting to gather feedback from attendees and compare it to the Twitter feedback, to see if Twitter can accurately predict attendance or approval/disapproval of sessions.
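
The binning behind a plot like this can be sketched as follows. The plot here was made in R; this Python version only shows the idea of flooring each timestamp to the start of its 5-minute window and counting. The timestamps are placeholders, not real data.

```python
from collections import Counter
from datetime import datetime, timedelta


def floor_to_bin(timestamp, minutes=5):
    """Round a timestamp down to the start of its N-minute window."""
    excess = timedelta(
        minutes=timestamp.minute % minutes,
        seconds=timestamp.second,
        microseconds=timestamp.microsecond,
    )
    return timestamp - excess


# Placeholder timestamps (arbitrary date), one per tweet.
timestamps = [
    datetime(2000, 1, 1, 14, 2, 30),
    datetime(2000, 1, 1, 14, 4, 59),
    datetime(2000, 1, 1, 14, 5, 0),
    datetime(2000, 1, 1, 14, 11, 10),
]

tweets_per_bin = Counter(floor_to_bin(ts) for ts in timestamps)
for start in sorted(tweets_per_bin):
    print(start.strftime("%H:%M"), tweets_per_bin[start])
```

Windows with no tweets simply do not appear in the counter, so a plot built from it needs the empty windows filled in with zeros.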

 

Finally we can plot the tweets by everyone over time to see the consistency of people joining in with their 1st tweet over time. As you can see, the people who tweeted earliest tended to tweet most frequently.

 

 

R is powerful but if you are new to it like me, it can be difficult to get it to work. Without the numerous examples shared by bloggers I would not have been able to complete these so quickly (or most likely at all). Special thanks to the following examples: wordclouds, associations, visualizing frequency, geocoding, sentiment.

Please let me know in the comments if you have any ideas or requests for additional things to look for (or tips!).

  • http://www.facebook.com/profile.php?id=728250629 Leah McGrath

    Very cool to see. Yes, certain negative tweeters were very busy, and many great tweets w/ good, relevant scientific content were not RT’d.

    • http://www.nutsci.org Colby

      Thanks for the comment Leah,

      I too was disappointed with the relatively low level of tweets about scientific talks; personally, that is why I tune in to conference hashtags. Though I think it is important to point out some of the more absurd industry influence (for example, that chocolate milk is good for kids, at least as I understood the message from the tweets), industry-sponsored talks like the one on the science of fructose were denigrated even though at least one of the speakers (Sievenpiper; I’m not familiar with Seligson) has done very important research in the area. Part of the reason I decided to do this, though, is that it was fascinating how conflicted the tweets were; there were so many polarizing attitudes tweeting at once. It will be interesting to compare from year to year to see if this changes.

      I wish I knew of a way to separate the scientific content from the noise in the tweets, but I guess I will have to hope for bloggers in attendance to post summaries. I tried “science” as a keyword like above to observe terms that correlate but the results aren’t helpful.

  • http://www.facebook.com/kris.sollid Kris Sollid

    Fascinating analysis Colby. As a conference attendee and #FNCE tweeter, I concur completely with Leah’s comment. Also, was curious if or how you took number of followers per tweeter into account?

    • http://www.nutsci.org Colby

      Hi Kris,

      I collected the follower count, friend count, total tweet count, and number of lists each person is on but I haven’t done anything with them yet. When I get some time I will do some exploring by these.

      • http://twitter.com/SCRDinDC Kris Sollid, RD

        Great to know. So much data in SM goes untapped, so I’ll be particularly interested in your results.