We are currently at the International Journalism Festival in Perugia, Italy, and we wanted to use the occasion to showcase how we find and choose most of the fake news stories that we debunk every day here at Lead Stories.
The secret sauce behind our method is the Trendolizer engine that we developed in-house. This tool indexes between 300,000 and 400,000 new links every day from a variety of sources and measures how well they are doing on social media such as Facebook, Twitter and Pinterest. By repeatedly measuring the number of likes, tweets, video views, pins and so on over time, the tool can calculate the current rate of increase for each measurement associated with a link. Via the Trendolizer dashboard we can then get a quasi-live view of what is trending on the internet right now.
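As a rough illustration of the "rate of increase" idea (this is not the actual Trendolizer code, and the function name and sample numbers are made up for the example), the rate can be estimated from the two most recent measurements of a counter:

```python
from datetime import datetime, timedelta


def rate_per_hour(samples):
    """Return the change per hour between the two most recent samples.

    `samples` is a list of (timestamp, count) tuples sorted by time.
    The name and data shape are assumptions made for this sketch.
    """
    if len(samples) < 2:
        return 0.0  # a single measurement gives no rate yet
    (t_prev, c_prev), (t_last, c_last) = samples[-2], samples[-1]
    hours = (t_last - t_prev).total_seconds() / 3600
    if hours <= 0:
        return 0.0  # guard against duplicate or out-of-order timestamps
    return (c_last - c_prev) / hours


# Made-up like counts for one link, measured every 30 minutes.
start = datetime(2017, 1, 1, 12, 0)
likes = [
    (start, 100),
    (start + timedelta(minutes=30), 250),
    (start + timedelta(minutes=60), 700),
]
print(rate_per_hour(likes))  # 900.0
```

A real system would probably smooth over more than two samples, but the principle is the same: repeated measurements turn a raw count into a speed.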
Of course results can be further filtered and sorted: which video clips with the word 'Trump' in the title/description are currently gaining the most views? Which tweets found in the past 24 hours had the most retweets? Which stories on medium.com got the most likes this week? What were the most trending stories about April Fool's Day?
More importantly, results can also be limited to just links found in certain sources, and it is possible to add one's own sources to the system.
Which is exactly what we've been doing over the past few months with hundreds of fake news websites, hyperpartisan sites, satirical websites, partisan websites, clickbait sites and so on. By limiting results to just those sites and adding some other filters and sorting options we can quickly spot which fake news articles are taking off and which ones aren't worth debunking. We've even used Trendolizer to set up alerts in Slack that notify us every time a relatively new article from our ever-expanding list of fake news sites starts getting more than a certain number of likes per hour, so we can be the first fact-checking website to debunk it.
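The alert rule described above boils down to two conditions: the article is recent, and its likes per hour exceed a threshold. Here is a minimal sketch of that logic. The function names, the six-hour age limit and the 500 likes/hour threshold are all hypothetical values picked for the example, not our real settings. The message is shaped as a dictionary with a "text" field, which is the JSON body a Slack incoming webhook accepts; actually posting it is left out.

```python
from datetime import datetime, timedelta


def should_alert(first_seen, now, likes_per_hour,
                 max_age_hours=6, min_likes_per_hour=500):
    """Return True for a recent article that is gaining likes quickly.

    Both thresholds are illustrative defaults, not real settings.
    """
    is_recent = (now - first_seen) <= timedelta(hours=max_age_hours)
    return is_recent and likes_per_hour > min_likes_per_hour


def build_slack_message(url, likes_per_hour):
    """Build a payload dict that could be sent to a Slack incoming webhook."""
    return {"text": f"Trending: {url} ({likes_per_hour:.0f} likes/hour)"}


now = datetime(2017, 1, 1, 12, 0)
first_seen = now - timedelta(hours=2)
if should_alert(first_seen, now, likes_per_hour=812.4):
    print(build_slack_message("http://fake.example/story", 812.4))
```

Keeping the decision (should_alert) separate from the message formatting makes it easy to tune the thresholds per source without touching the notification code.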
Very often this allows us to spot and debunk fake articles that have only been online for one or two hours. Sometimes we can even catch the authors of these articles red-handed while they are still seeding them on Facebook, and we can be the first ones to leave a comment under their posts linking back to our debunking article.
But there's more: since we are downloading each article anyway while indexing it (to get the title, description, thumbnail etc.), for the past few weeks we have also been using this opportunity to extract all kinds of extra information such as advertising, trackers, embedded Facebook page widgets, Google Analytics IDs, domain name info, the IP address of the site etc. Through our prototype domain fingerprinting tool (which will be integrated in the main Trendolizer dashboard soon) we are now able to quickly unmask the hidden connections between entire networks of fake news websites that are using the same advertising ID or that are hosted on the same IP address. This also helps us spot new fake news sites that we can then add to our ever-growing list of sources.
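To make the fingerprinting idea concrete, here is a small sketch (again, not the prototype itself). It assumes each domain has already been reduced to a dictionary of identifiers, and it reports every identifier that appears on more than one domain. The helper names, the example domains, the IDs and the IP addresses are all invented; the regular expression matches the classic "UA-number-number" shape of Google Analytics property IDs.

```python
import re
from collections import defaultdict

# Classic Google Analytics property IDs look like "UA-1234567-1".
GA_ID = re.compile(r"\bUA-\d+-\d+\b")


def extract_analytics_ids(html):
    """Return the set of Google Analytics IDs found in a page's HTML."""
    return set(GA_ID.findall(html))


def shared_identifiers(fingerprints):
    """Map each (kind, value) seen on 2+ domains to the sorted domain list.

    `fingerprints` maps a domain to a dict such as
    {"analytics_id": "...", "ip": "..."} (a shape assumed for this sketch).
    """
    index = defaultdict(set)
    for domain, identifiers in fingerprints.items():
        for kind, value in identifiers.items():
            if value:  # skip identifiers that could not be extracted
                index[(kind, value)].add(domain)
    return {key: sorted(domains)
            for key, domains in index.items() if len(domains) > 1}


# Invented data: two of the three sites share a hosting IP.
fingerprints = {
    "site-a.example": {"analytics_id": "UA-1111111-1", "ip": "192.0.2.10"},
    "site-b.example": {"analytics_id": "UA-3333333-1", "ip": "192.0.2.10"},
    "site-c.example": {"analytics_id": "UA-2222222-1", "ip": "192.0.2.99"},
}
print(shared_identifiers(fingerprints))
```

A shared IP address on its own can simply mean shared hosting, so in practice a match like this is a lead to investigate rather than proof of a connection.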
(And since many fake news sites shamelessly copy-paste entire articles from each other, it is often worthwhile to use the "Story search" feature to do a quick search for a key phrase from the title of a trending fake news story to see if there are other sites posting the exact same title, so these sites can potentially be added to the list too.)
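The same kind of lookup can be approximated outside the dashboard. The sketch below is not how "Story search" is implemented; it simply normalizes titles (lowercase, punctuation stripped) and lists the domains whose titles contain a given key phrase. The function names, domains and titles are invented for the example.

```python
import re


def normalize(text):
    """Lowercase the text and collapse punctuation and spaces to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def sites_with_phrase(articles, phrase):
    """Return sorted domains whose article titles contain `phrase`.

    `articles` is an iterable of (domain, title) pairs.
    """
    needle = normalize(phrase)
    if not needle:
        return []  # an empty phrase would otherwise match every title
    return sorted({domain for domain, title in articles
                   if needle in normalize(title)})


# Invented titles: two sites carry the same story.
articles = [
    ("site-a.example", "Mayor BANS All Bicycles!"),
    ("site-b.example", "SHOCK: Mayor bans all bicycles"),
    ("site-c.example", "Local bakery wins award"),
]
print(sites_with_phrase(articles, "bans all bicycles"))
```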
Interested in seeing a full demo? Want to use Trendolizer in your own news organisation? Or maybe you want to use the Trendolizer API to import our data into your own application? You can find me at the International Journalism Festival if you happen to be there: just tweet at @mschenk to set up a meeting. Not there? Just send an email to [email protected] and let's talk!