Monday 26 June 2017

IT IS TIME DATA JOURNALISTS LEARN TO CODE (part 2)

Interactive graphics are vital to data journalism stories. In my first blog post on this subject I explored the possibilities of D3, JavaScript, R and plotly. If you want to avoid D3 and JavaScript completely and use only Python, plotly has developed an interesting new library called Dash. I have been exploring this option using data about Dutch municipalities.
From an analysis in R I know that there is a correlation between house values and average income across municipalities. When I check the partial correlation, using political party as an intervening variable, the correlation does not change dramatically. Can we produce an interactive graph that shows this conclusion?
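The analysis itself was done in R and is not shown here. For readers who want to stay in Python, the following is a minimal sketch of the same idea, assuming the column names used in the script further down. The helper name `partial_corr_by_group` is hypothetical and the numbers are invented for illustration; they are not the municipality data. Because the control variable is categorical, the sketch removes each party's mean from both variables and correlates what is left, which is equivalent to regressing both variables on party dummies.

```python
import numpy as np
import pandas as pd


def partial_corr_by_group(df, x, y, group):
    """Correlation of x and y after removing each group's mean from both."""
    # Subtracting the group mean equals taking residuals from a
    # regression on group dummy variables.
    x_res = df[x] - df.groupby(group)[x].transform("mean")
    y_res = df[y] - df.groupby(group)[y].transform("mean")
    return float(np.corrcoef(x_res, y_res)[0, 1])


# Invented numbers for illustration only, not the real municipality data.
toy = pd.DataFrame({
    "PARTIJ": ["A", "A", "A", "B", "B", "B"],
    "GEM_INKOMEN": [20.0, 22.0, 24.0, 30.0, 32.0, 34.0],
    "GEM_WOZ": [150.0, 170.0, 180.0, 250.0, 260.0, 290.0],
})

raw = toy["GEM_INKOMEN"].corr(toy["GEM_WOZ"])
partial = partial_corr_by_group(toy, "GEM_INKOMEN", "GEM_WOZ", "PARTIJ")
print(f"raw r = {raw:.3f}, partial r = {partial:.3f}")
```

If the two values are close, the control variable explains little of the relationship, which is the pattern described above.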

Using the Dash examples I wrote a new Python script. Here it is (gemd.py):

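import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go
import pandas as pd

# Load the municipality data. The file name is an assumption; the script
# expects the columns GEMEENTE (municipality), PARTIJ (political party),
# GEM_INKOMEN (average income) and GEM_WOZ (average house value).
df = pd.read_csv('gemeenten.csv')

app = dash.Dash()

app.layout = html.Div([
    dcc.Graph(
        id='Inkomen versus woz',
        figure={
            # One scatter trace per party, so each party gets its own
            # colour and legend entry.
            'data': [
                go.Scatter(
                    x=df[df['PARTIJ'] == i]['GEM_INKOMEN'],
                    y=df[df['PARTIJ'] == i]['GEM_WOZ'],
                    # Municipality name shown when hovering over a marker.
                    text=df[df['PARTIJ'] == i]['GEMEENTE'],
                    mode='markers',
                    opacity=0.7,
                    marker={
                        'size': 15,
                        'line': {'width': 0.5, 'color': 'white'}
                    },
                    name=i
                ) for i in df.PARTIJ.unique()
            ],
            'layout': go.Layout(
                xaxis={'title': 'INKOMEN'},
                yaxis={'title': 'WOZ'},
                margin={'l': 40, 'b': 40, 't': 10, 'r': 10},
                legend={'x': 0, 'y': 1},
                hovermode='closest'
            )
        }
    )
])

if __name__ == '__main__':
    # Start the local development server.
    app.run_server()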

When we run the script in a terminal using

python gemd.py

we can inspect the result in the browser at 127.0.0.1:8050.
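The key step in the script is the list comprehension that builds one scatter trace per party. To check that grouping without starting a server, here is a small sketch that builds the same traces as plain dictionaries, a form Plotly figures also accept. The function name `traces_per_party` and the sample rows are hypothetical, invented only to show the shape of the result.

```python
import pandas as pd


def traces_per_party(df):
    """Return one scatter-trace dict per distinct value in the PARTIJ column."""
    traces = []
    # sort=False keeps the parties in order of first appearance,
    # matching df.PARTIJ.unique() in the script.
    for party, rows in df.groupby("PARTIJ", sort=False):
        traces.append({
            "type": "scatter",
            "mode": "markers",
            "name": party,
            "x": rows["GEM_INKOMEN"].tolist(),
            "y": rows["GEM_WOZ"].tolist(),
            "text": rows["GEMEENTE"].tolist(),
        })
    return traces


# Invented rows for illustration only.
sample = pd.DataFrame({
    "GEMEENTE": ["Town1", "Town2", "Town3"],
    "PARTIJ": ["A", "B", "A"],
    "GEM_INKOMEN": [20.0, 30.0, 22.0],
    "GEM_WOZ": [150.0, 250.0, 170.0],
})

for trace in traces_per_party(sample):
    print(trace["name"], len(trace["x"]))
```

Printing the number of points per trace is a quick way to spot a party that is missing or misspelled in the data before looking at the graph.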