Sunday 8 July 2018


Data journalism is already more than fifty years old. It started in the sixties as precision journalism with Phil Meyer, then became CARR (computer-assisted research and reporting), and is now called data journalism. The shortest definition of data journalism is 'social science done on deadline' (Steve Doig). We use the tools of the social sciences to analyze data and include the results in our storytelling.
In the beginning, some 10-15 years ago, practicing data journalism required extra skills and training. Scraping data, cleaning it up and analyzing it in Excel, making graphs and maps, getting the data into the story: all of this needed extra journalism training. Data journalism therefore became a specialization within journalism.

The field is changing fast, and data journalism is becoming a do-it-yourself toolkit that everybody can use with a minimum of skills and understanding. Take a tool like Flourish (https://app.flourish.studio/), for example: put the data in, push a button, and get a graph or a map. Or the latest: Workbench, a project from the Columbia J-school in New York. Clean, scrape, analyze and visualize data without coding, with all the data journalism tools integrated in one package. Sign up and get started: http://workbenchdata.com/.

Reflecting on data journalism on his Online Journalism Blog, Paul Bradshaw distinguishes two approaches to data journalism training: teaching slow or teaching fast. Teaching data journalism fast works as follows: “For many years I began my introductory data journalism classes with basic spreadsheet techniques, followed by visualization sessions to show them how to bring some of the results to life. In 2016, however, I decided to try something different: what if, instead of taking students through the process chronologically, we started at the end - and worked backwards from there? The class worked like this: students were given a spreadsheet of several tables already ready to be turned into a chart”. The new tools just mentioned not only make data journalism easy, but also clear the way for thinking about the story to be produced, rather than about the technology and number crunching behind it.


When I first switched on the Internet at the School of Journalism at the end of the 1980s, I was impressed by the idea of electronic communication, ranging from e-mail to IRC chat.
This would enhance communication and understanding, and contribute to democracy. Now the opposite is the case. “At the heart of their disenchantment is that the internet has become much more ‘centralised’ (in the tech crowd’s terminology) than it was even ten years ago” … “the system was ‘biased in favour of decentralisation of power and freedom to act’”, writes the Economist.

From de-centralized to centralized
Instead of having direct one-on-one communication, decentralized and uncontrolled, we are working on controlled, centralized systems. “These days the main way of getting online is via smartphones and tablets that confine users to carefully circumscribed spaces, or ‘walled gardens’, which are hardly more exciting than television channels”. It almost looks as if the times before the Internet have returned. Is Facebook so different from what was once Compuserve?

The decentralized infrastructure of the Internet is still there. At the basic level the net still runs on TCP/IP. “The connections to transfer information still exist, as do the protocols, but the extensions the internet has spawned now greatly outweigh the original network”. It is not the basic level but the levels higher up that are centralized and controlled: consumer websites and all those apps. Take the social networks, for example: we work on the machines of Facebook (comparable to the Compuserve mainframe). “The best way to picture all this is as a vast collection of data silos with big pipes between them, connected to all kinds of devices which both deliver services and collect more data”.

Data business
How could that happen? The answer: data! “The Google search engine attracts users, which attracts suppliers of content (in Google’s case, websites that want to be listed in its index), which in turn improves the user experience, and so on. Similarly, the more people use Google’s search service, the more data it will collect, which helps to make the results more relevant.” The same holds for Facebook and Instagram. Data and targeted advertising are the basis of the business model that turned the Internet into a totally different beast. “Having tried to sell its technology to companies, it went for advertising, later followed by Facebook and other big internet firms. That choice meant they had to collect ever more data about their users. The more information they have, the better they can target their ads and the more they can charge for them.”

Take back control
What can we do to take back our original control over our communication on the internet? Below I give a summary of four possible solutions, based on the literature referred to in the links.

Monday 11 June 2018

Friday 8 June 2018

IS DATA JOURNALISM UNDER ATTACK? (Opening Media Lab speech, Peter Verweij)

Dar es Salaam, June 7, 2018, Tanzania Media Fund (TMF)

Lianne Houben (in the middle), Deputy Head of Mission
at the Embassy of the Netherlands in Dar es Salaam,
opening the new TMF media lab
Photo Josh Laporte EJC

Of course you are all on Facebook, right? So you all gave Zuckerberg permission to hoover up all your data in order to sell targeted advertising. In exchange you can post messages and pictures to the world and to your friends. Zuckerberg's line: by creating better communication we create a better world. This ideology comes under attack once we understand the true business model behind Facebook: not only making huge profits, but also analyzing and combining users' data in an attempt to influence our thinking and acting through advertising and information. Book a flight, and within minutes you are advised to book a car and a hotel at your destination. And it is not only Facebook, but Amazon and Google as well. They all live off the use and misuse of your data. After Cambridge Analytica, Facebook took the full blow; the others are temporarily off the hook. The result is clear: the very idea of data is under attack, because of privacy, advertising, manipulation and misuse. There is something fishy about data.

Saturday 10 February 2018


Earthquakes in the province of Groningen are induced by the extraction of natural gas, which has been going on since the sixties. The KNMI has recorded and collected the data on these quakes. Inspired by Maarten Lambrechts, I loaded the data into a Flourish template, which gives the following time chart:

Friday 9 February 2018


Flourish is an awesome tool for creating charts. Its output is almost art, and this could move data journalism away from its original goal: being a kind of 'sociology done on deadline', aiming at 'improving reporting by using the tools of science'. Although a chart can be made fast, easily and beautifully, the question remains: what does it show, and what does it mean?
Below I show how to use R and RStudio to analyze the same dataset.

Loading the data set into a data frame h
Showing the structure of the data set:
'data.frame':   165 obs. of  8 variables:
 $ year                : int  2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
 $ country             : Factor w/ 11 levels "Angola","Botswana",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ life.expec          : num  46.6 47.4 48.1 48.8 49.4 ...
 $ gdp.cap             : num  606 574 776 850 1136 ...
 $ code                : Factor w/ 11 levels "AGO","BWA","CMR",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Total.as.percGDP    : num  2.79 5.38 3.63 4.41 4.71 4.1 4.54 3.38 3.84 4.37 ...
 $ govperc.total.exp   : num  60.2 52.2 46.4 46.4 51.1 ...
 $ privat.perc.of.total: num  39.8 47.8 53.6 53.6 48.9 ...
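
The two steps above can be sketched in R roughly as follows. This is only a minimal illustration, not the original script: the file name in the commented-out `read.csv()` call is hypothetical, and to keep the example self-contained the data frame is rebuilt from the first five Angola rows exactly as they are printed (rounded) in the `str()` output above. The correlation and linear model at the end are an assumed first analysis step, not something the original post confirms.

```r
# With the full file one would load it directly, for example:
# h <- read.csv("health.csv")   # hypothetical file name, not from the original post

# Self-contained stand-in: the first five Angola rows, copied from the
# str() output above (values as rounded there).
h <- data.frame(
  year       = 2000:2004,
  country    = factor(rep("Angola", 5)),
  life.expec = c(46.6, 47.4, 48.1, 48.8, 49.4),
  gdp.cap    = c(606, 574, 776, 850, 1136)
)

# Show the structure: number of observations, column names and types
str(h)

# Basic descriptive statistics per column
summary(h)

# Assumed first analysis step: Pearson correlation between
# GDP per capita and life expectancy
r <- cor(h$gdp.cap, h$life.expec)
print(r)

# Simple linear model: life expectancy as a function of GDP per capita
fit <- lm(life.expec ~ gdp.cap, data = h)
print(coef(fit))
```

With only five rows from one country the numbers mean little; the point is the workflow. A chart answers 'what does it look like', while `str()`, `summary()`, `cor()` and `lm()` begin to answer 'what does it show'.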

Thursday 8 February 2018


You don't have to be a highly skilled data journalist to create interesting graphs and charts. There are a large number of websites where you can drop your data and, within seconds, retrieve awesome graphics to embed on your news blog or website. In my training I have been working with, for example, Datawrapper, Plotly and Tableau. Recently a tweet by Alberto Cairo drew my attention to Flourish. Amazing! Flourish, based on a cooperation with Google News Lab, easily beats the competition. And it is free, of course. Create an account, log in, choose a template and you are in business.
I played around with it, using some data from the World Bank. I selected a number of sub-Saharan countries and downloaded life expectancy and GDP per capita from 2000 to 2014. Here is my creation, done in a few minutes.