The election this November had an important measure on California ballots: Proposition 37, which would have required the labeling of foods containing genetically modified ingredients. It ended up failing by a margin of just under 3%. Although I didn’t support the initiative (and I don’t live in California), it was fascinating to watch both campaigns and their activists on Twitter before and after the election. This was an opportune time to get a glimpse of GMO activism, so I collected tweets for some basic analysis.
I collected the tweets with some Python code that I wrote against Twitter’s streaming API, which added them to a CSV file. The hashtags and keywords I followed were: #prop37, #noon37, #yeson37, GMO, prop 37, #labelgmo, and #righttoknow. For each tweet I recorded the text, ID, datestamp, username, whether it was a retweet, the user’s bio from their profile, profile link, follower/following/listed counts, total tweet count, and location if listed in the profile. The total dataset consisted of 253,861 tweets from October 25 to December 6. It is uploaded here if others want to use it. I decided to limit the analyses to tweets containing prop37, #prop37, #yeson37, or #noon37. This yielded 55,537 tweets that I played around with in R.
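The narrowing step described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the post: the column name `tweet`, the helper names, and the sample rows are all assumptions made up for the example.

```python
import csv
import io

# Terms used to narrow the dataset, as listed in the post.
KEYWORDS = ("prop37", "#prop37", "#yeson37", "#noon37")


def mentions_prop37(text):
    """Return True if the tweet text contains any keyword (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def filter_tweets(csv_file, text_column="tweet"):
    """Yield only the CSV rows whose text column mentions Prop 37.

    The column name "tweet" is an assumption, not the real file layout.
    """
    for row in csv.DictReader(csv_file):
        if mentions_prop37(row[text_column]):
            yield row


# Tiny in-memory stand-in for the real CSV (made-up rows).
sample = io.StringIO(
    "username,tweet\n"
    "alice,Vote today! #YesOn37\n"
    "bob,GMO corn prices are up\n"
    "carol,Reading about prop37 tonight\n"
)
kept = list(filter_tweets(sample))
print([row["username"] for row in kept])
```

Here the second row is dropped because it matches only the broader collection keyword GMO, which is the kind of tweet the narrower filter was meant to exclude.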
Describing the Tweeters
Here is a graph of the frequency of tweets with these hashtags by date. As you can see, my internet went out several times for significant periods, but I don’t expect the gaps to change the overall picture much.
I also collected the location of each person tweeting, if they set one in their profile. I geocoded these to lat/long coordinates using the Google Maps API and made a couple of plots (retweets included). These are not adjusted for population density, but they still show much of the activity in California, as expected:
Here is a density map of the world, using about 60% of the locations (for whatever reason R couldn’t handle more on my computer). Taking the log of the tweet count makes it a little easier to distinguish:
The main reason I collected the data was to see who is driving Prop 37 activism on Twitter, so I ran a number of frequencies to describe the population. On average, people tweeting with the prop37 hashtag tweeted almost 3 times, and the most active account tweeted 339 times.
If we run a frequency of the tweets, we see that most tweets were original or were not retweeted much. One tweet was retweeted 442 times.
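For anyone who wants to reproduce these frequencies outside R, here is a minimal Python sketch. The `(username, text)` pairs are invented, and treating any text that starts with "RT @" as a retweet is an assumption made for the example.

```python
from collections import Counter
from statistics import mean

# Made-up (username, tweet text) pairs standing in for the dataset.
tweets = [
    ("alice", "RT @bob: Label it #prop37"),
    ("alice", "Vote! #prop37"),
    ("bob", "Label it #prop37"),
    ("carol", "RT @bob: Label it #prop37"),
    ("alice", "Another one #prop37"),
]

# Number of tweets sent by each user.
per_user = Counter(user for user, _ in tweets)
average_per_user = mean(per_user.values())
most_active = per_user.most_common(1)

# How often each identical retweet text occurs (assumes retweets start with "RT @").
retweets = Counter(text for _, text in tweets if text.startswith("RT @"))
top_retweet = retweets.most_common(1)

print(average_per_user, most_active, top_retweet)
```

Counting identical retweet text undercounts retweets that were edited or truncated differently, so the totals are best read as lower bounds.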
Since the two campaigns promoted the hashtags #yeson37 and #noon37 respectively, I counted the tweets containing each. The result was surprisingly one-sided (retweets included). However, as I show later, the #yeson37 count is artificially inflated by fake accounts.
And if you look through the #noon37 tweets, many of them are clearly in favor of Prop 37 and simply add both hashtags. It seems that this dataset consists almost entirely of people who support Prop 37. To get a more accurate picture, I wrote a script to randomly poll a sample of the people who added #prop37 to their tweets to see which side they supported, but my account was quickly banned by Twitter. I also wanted to poll people in the overall Twitter feed who list California as their location, to see if this could predict the election results. If anyone knows whether Twitter makes exceptions for things like this, let me know, but I assume not.
So now let’s look at the most frequent Prop 37 tweeters. Here are the top 20 after removing fake accounts (read further to see how I determined this). For the top 5, I went through some of the tweets and picked an example of poor information. This is cherry picking, but they are so egregious that it suggests a pattern.
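The hashtag tally above, including the tweets that carry both tags, could be counted along these lines. This is a sketch with invented tweets; the function name and the whole-hashtag matching rule are assumptions, not the original R code.

```python
import re
from collections import Counter

# Matches whole hashtags, so "#yeson37" does not match inside "#yeson370".
HASHTAG = re.compile(r"#\w+")


def hashtag_counts(texts, tags=("#yeson37", "#noon37")):
    """Count tweets containing each tag; a tweet with both counts toward both."""
    counts = Counter()
    for text in texts:
        found = {tag.lower() for tag in HASHTAG.findall(text)}
        for tag in tags:
            if tag in found:
                counts[tag] += 1
        if all(tag in found for tag in tags):
            counts["both"] += 1
    return counts


# Made-up examples, including one tweet that uses both hashtags.
sample = [
    "Vote #YesOn37!",
    "#NoOn37 is wrong, vote #yeson37",
    "No new labels #noon37",
    "#yeson37 #yeson37 again",
]
counts = hashtag_counts(sample)
print(counts)
```

The separate "both" tally gives a rough sense of how many #noon37 tweets may really come from supporters, although reading the tweets is still the only way to be sure.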
|Username||Tweets||Example of poor information|
|CARightToKnow||339||Official account: multiple tweets or retweets that imply health risks, e.g. “RT @MosaicMatter: I refuse to let Big Biotech damage my daughter’s childhood. I will vote #YesOn37 tomorrow.”|
|iamgreenbean||264||A retweet of the official account: “RT @CARightToKnow: Could GM foods be responsible for record low birth rate in the US? #LabelGMOs #YesOn37 t.co/MWf1HdGD”|
|Earthnik||233||This links to a YouTube video that says GM foods are poison and don’t work. The science says otherwise: “Seriously … here’s the actual truth about GMOs t.co/NGPTIVkb #YesOn37 #LabelGMOs”|
|bookieboo||215||Many tweets that state GM foods are harmful to health/children. It is clear why she thinks this, though (Jeffrey Smith is about the worst source you can find): “I’m going to be tweeting what Jeffrey Smith says. Go to t.co/FjC3pMQT to see Genetic Roulette #Mamavation #Yeson37”|
|OrganicLiveFood||205||Yikes: “Can consuming #GMO wheat cause serious damage to liver and top 10 tips and #herbals for cleansing #liver #yeson37 t.co/Xtj5WbcW”|
We can also look at the most retweeted tweets. Here are the top 20. Most are from dubious sources, some of which I’ve noted in the right-hand column.
|RT @ninadobrev: If youre a Mom DEMAND GMO labels for you & your family. #VoteYesProp37 (&RT if youre a parent who demands it too …||442||Celebrity|
|RT @dannymasterson: BTW. #monsanto is spending a million $ a day to fight #prop37 . Chemical companies should not control our food||267||Celebrity|
|RT @mercola: It breaks my heart to hear that #Prop37 is losing the vote t.co/IvvOgcnR||159||Quack & majority funder of Prop 37|
|RT @charliesheen: GO NINA! Winning!! RT @ninadobrev If youre a Parent DEMAND GMO labels for you & your family. RT #VoteYesProp37 htt …||142||Celebrity|
|RT @MotherJones: “In short, #Prop37 got crushed under fat stacks of cash: its supporters raised $8.7 mil vs. $45.6 mil for opponents”: h …||139||Many anti-GMO articles|
|RT @mouselink: And meanwhile in #California, it looks like @MonsantoCo has won out over the health of the entire human race. #prop37||139||?|
|RT @AmberLyon: California voters: “The Video Monsanto Does Not Want You to See” #Prop37 t.co/uRYhyhz3||131||Journalist|
|RT @vanderjames: Hey CA – #Prop37 would require that food with GMO’s be labeled. Vote Yes on 37 to know what’s in your food. #Simple htt …||125||Celebrity|
|RT @SophiaBush: CA Voters – regardless of who you’re voting for in the POTUS race, PLEASE Vote YES on Prop37. We have a right to know wh …||124||Celebrity|
|RT @mercola: The more you can avoid genetically modified foods, the healthier you and your family will be. Vote “Yes” on #Prop37! http:/ …||122||Celebrity|
|RT @BarbraStreisand: On Election Day I support the right to know with a #YesOn37 vote to #LabelGMOs. Get informed & rock the vote: h …||119||Celebrity|
|RT @mercola: Today, #California will be voting on the most important health decision in history: #Prop37. t.co/oiZH5JbH #yeson37 …||117||Quack & majority funder of Prop 37|
|RT @AbbyMartin: It’s sad how much a disinfo $ campaign can influence people. 90% in US WANT to label #GMOs, yet CA polls show #Prop37 NO …||116||?|
|RT @vanderjames: Huge food corps have spent $40 million to scare you into thinking #Yeson37 will raise your grocery bills. It won’t. ht …||113||Celebrity|
|RT @kennyflorian: RT & don’t support these “All Natural Companies” who helped defeat #Prop37 #NoToGMO #SellOuts t.co/mES42inH||112||MMA Fighter|
|RT @mercola: Your child is not a lab rat. And that’s what eating GM foods are turning them into. t.co/0tSA0uYK #gmos #prop37||107||Quack & major funder of Prop 37|
|RT @CARightToKnow: #YesOn37 is a true Peoples Movement. 90% of consumers want to #LabelGMOs so they can make informed choices: t …||104||Official campaign|
|RT @CARightToKnow: Food labels are important. Information is power and that power belongs in the hands of consumers. #LabelGMOs #YesOn37 …||102||Official campaign|
|RT @d_seaman: Why would you NOT want to know if GMOs are in your food? The corporate smear campaign against #Prop37 should be illegal. H …||99||?|
|RT @OrganicLiveFood: Syngenta’s #GMO cow food killed many cows & protein used 4 cow food maize is also used in our food #YesOn37 htt …||98||Organic restaurant owner|
Inflating the Campaign
Because of evidence suggesting that the official campaign promoting Prop 37 had purchased followers, I was on alert for other abnormalities. It quickly became obvious that a large number of fake accounts (with zero follower and following counts) were tweeting the same thing over and over on the #yeson37 hashtag and linking to the campaign’s website. So after expanding the links in all tweets, I counted how many contained #yeson37, contained a link to the campaign website (carighttoknow.org), and came from accounts with 0 followers: 10,209 from 965 different accounts! It appears that someone (the campaign denies it) paid for a huge number of fake accounts to tweet the website.
If you look through the tweets, they almost look normal, but the screennames are all names with random numbers at the end, and the tweets are highly repetitive and include various keywords or hashtags that were trending at the time. This is likely for 2 reasons: 1) to get the website out on various popular hashtags to increase awareness, and 2) Twitter doesn’t allow you to tweet the same thing multiple times within a short time period, so adding trending hashtags would slightly change each tweet. Very shady stuff, and all of it is grounds for account suspension.
Annoyingly, it made more work for me. So I removed all tweets with follower and following counts of 0 (10,527), leaving 22,578. I used these to explore what links were being tweeted and for associations. Here is additional proof: by plotting the time of each tweet for each screenname, we see that most only tweet a few times and are then suspended. Below that plot is a plot of non-fake accounts.
I wrote some code to expand all links, and running a frequency on them revealed pretty poor top information sources. Here are the top 50. Many suggest health risks from GMOs, ignore the scientific consensus, or propagate erroneous stories. For the top 15, I added some notes.
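The three-part test for suspicious tweets (the #yeson37 hashtag, a link to carighttoknow.org, and zero followers) is simple to express in code. The sketch below is a Python illustration with invented accounts; the `Tweet` fields and the `looks_fake` helper are hypothetical names, and it assumes the short links have already been expanded.

```python
from dataclasses import dataclass

CAMPAIGN_DOMAIN = "carighttoknow.org"


@dataclass
class Tweet:
    # Hypothetical record layout; the real CSV columns may differ.
    username: str
    text: str
    expanded_urls: list
    followers: int
    following: int


def looks_fake(tweet):
    """Apply the three criteria from the post to a single tweet."""
    has_tag = "#yeson37" in tweet.text.lower()
    links_campaign = any(CAMPAIGN_DOMAIN in url.lower() for url in tweet.expanded_urls)
    return has_tag and links_campaign and tweet.followers == 0


# Made-up tweets: two spam-like accounts, one real user, one new user.
tweets = [
    Tweet("maria8841", "Know your food #yeson37 #xfactor", ["http://www.carighttoknow.org/"], 0, 0),
    Tweet("maria8841", "Know your food #YesOn37 #nfl", ["http://www.carighttoknow.org/"], 0, 0),
    Tweet("john2207", "Label it #yeson37", ["http://www.carighttoknow.org/facts"], 0, 0),
    Tweet("realperson", "Voting today #yeson37", ["http://www.carighttoknow.org/"], 120, 80),
    Tweet("newbie", "First tweet! #prop37", [], 0, 0),
]

flagged = [t for t in tweets if looks_fake(t)]
flagged_accounts = {t.username for t in flagged}

# The broader cleanup: drop every tweet whose follower and following counts are both 0.
remaining = [t for t in tweets if not (t.followers == 0 and t.following == 0)]

print(len(flagged), len(flagged_accounts), len(remaining))
```

Note that the broader cleanup also drops the genuine but brand-new account in the sample, which is the trade-off of filtering on zero counts alone.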
I ran some word associations to see what terms appeared together most frequently. Because I don’t have much RAM and R was struggling, I took a random sample of 5,000 tweets of the 44,561 (full dataset minus fake accounts) for this.
First I explored words that appeared at least 150 times:
 “california” “companies” “democracy” “food” “foods” “gmo” “gmos” “health” “label” “labelgmos” “labeling”
 “labels” “monsanto” “movement” “organic” “people” “please” “prop” “prop37” “support” “video” “vote”
 “voters” “yeson37”
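A rough Python equivalent of this step, drawing a random sample and keeping the terms that reach a minimum count, might look like the following. The tokenizer (lowercase runs of letters and digits), the function name, and the sample tweets are assumptions; the R analysis may have cleaned the text differently.

```python
import random
import re
from collections import Counter


def frequent_terms(texts, min_count, sample_size=None, seed=0):
    """Return the sorted terms that appear at least min_count times.

    If sample_size is given and smaller than the dataset, a random
    sample of that many tweets is used instead of the full list.
    """
    texts = list(texts)
    if sample_size is not None and sample_size < len(texts):
        texts = random.Random(seed).sample(texts, sample_size)
    counts = Counter()
    for text in texts:
        # Assumed tokenizer: lowercase runs of letters and digits.
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(term for term, n in counts.items() if n >= min_count)


# Made-up tweets; the post used a sample of 5,000 and a threshold of 150.
sample = ["Label GMO food", "GMO food vote", "Vote yes", "gmo"]
terms = frequent_terms(sample, min_count=2)
print(terms)
```

Fixing the seed keeps the sample reproducible, which helps when comparing runs on a machine that cannot hold the full dataset.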
Here are some associations (values are how often the words occur together, x100 for %):
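One simple way to get numbers of this kind is sketched below in Python. The post does not say exactly which association measure was used, so this sketch assumes the plainest reading of "how often the words occur together": the share of tweets containing a given term that also contain each other word. The function names and sample tweets are made up.

```python
import re


def tokens(text):
    """Return the set of distinct lowercase words in a tweet (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def associations(texts, term, min_share=0.5):
    """Map each co-occurring word to the share of `term` tweets that contain it."""
    docs = [tokens(text) for text in texts]
    with_term = [doc for doc in docs if term in doc]
    if not with_term:
        return {}
    together = {}
    for doc in with_term:
        for word in doc - {term}:
            together[word] = together.get(word, 0) + 1
    total = len(with_term)
    return {w: n / total for w, n in together.items() if n / total >= min_share}


# Made-up tweets: "health" appears in 2 of the 3 tweets that mention "gmo".
sample = ["gmo health risk", "gmo health", "gmo label", "vote label"]
result = associations(sample, "gmo", min_share=0.5)
print(result)
```

A correlation-based measure would give different values, so numbers from this sketch are not directly comparable to the ones reported here.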
So, there is some evidence that people who wrote GMO in their tweets tend to think GMOs are harmful to health, and not surprisingly Monsanto was a popular word. We can dig a bit further:
I ran each tweet through the sentiment analysis method described here. It scores each tweet by the net number of positive and negative words that it matches from a list here. The average was just above neutral, largely because about half the tweets were neutral. Interestingly, as sentiment strength increased, the ratio of positive to negative tweets fell: mildly worded tweets leaned positive, while strongly worded tweets leaned negative.
|Average (SD)||Neutral Tweets||Ratio of +1 to -1||Ratio of +2 to -2||Ratio of +3 to -3||Ratio of +4 to -4|
|0.16 (1.02)||26,394 (48%)||1.91||1.34||0.94||0.54|
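The scoring and the ratios in the table can be sketched as follows. This is a minimal Python illustration, not the method linked above: the two word lists are tiny hypothetical stand-ins for the real lexicon, repeated words are counted each time they appear, and each ratio compares tweets scoring exactly +k with tweets scoring exactly -k, which is an assumption about how the table was built.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon; the real word lists are far larger.
POSITIVE = {"good", "healthy", "support", "win"}
NEGATIVE = {"bad", "poison", "scare", "harm"}


def sentiment_score(text):
    """Net score: number of positive word matches minus negative word matches."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def polarity_ratios(scores, max_level=4):
    """Ratio of tweets scoring +k to tweets scoring -k, for k = 1..max_level."""
    counts = Counter(scores)
    ratios = {}
    for level in range(1, max_level + 1):
        if counts[-level]:  # skip levels with no negative tweets
            ratios[level] = counts[level] / counts[-level]
    return ratios


# Made-up tweets for illustration only.
sample = ["good healthy food", "poison", "bad scare", "support", "win", "label it"]
scores = [sentiment_score(text) for text in sample]
neutral_share = scores.count(0) / len(scores)
ratios = polarity_ratios(scores)
print(scores, neutral_share, ratios)
```

A ratio above 1 means positive tweets outnumber negative ones at that strength, and a ratio below 1 means the reverse, which is the pattern in the last two columns of the table.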
This is easier to visualize:
It is difficult to draw overarching conclusions without more manual classification of tweets, tweeters, and sources, and those that I examined (the top-ranked of each) may represent only a small portion of the overall Prop 37 activism. Still, they paint a picture of misinformation. Although some in favor of Prop 37 just don’t want corporations controlling the food supply, the top tweeters and sources of information clearly think there are health risks to GM foods. The official “Yes on 37” campaign did nothing, as far as I could tell, to correct this thinking and in fact promoted it at times.
So that is just the tip of the iceberg of what could be done with this data, I’m sure, and if I come across new ways of digging around I will update this post. Let me know if you have any ideas. I will also note that Becca Harrison is working on some qualitative analysis of Prop 37 tweets, so those results will be more interesting!