Flourish is an awesome tool for creating charts. Its output is almost art, and that could pull data journalism away from its original goal: being a kind of 'sociology done on deadline', aiming at 'improving reporting by using the tools of science'. A chart can be made quickly, easily and beautifully, but the question remains: what does it show, and what does it mean?
Below I show how to use R and RStudio to analyze the same dataset.
setwd("/home/peter/Desktop/rdata")
Loading the data set into a data frame h:
h <- read.csv("health2.csv")
Showing the structure of the data set:
str(h)
Making a chart for the relationship between GDP and life-expectancy using library Lattice.
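The plotting call itself is not shown here, so the following is only a minimal sketch of how such a grid could be drawn with lattice. It assumes the columns are named `country`, `gdp.cap` and `life.expec` (the last two appear later in this post); `h_demo` is a hypothetical toy data frame with invented values that stands in for `health2.csv`.

```r
library(lattice)

# Toy stand-in for health2.csv; countries "A" and "B" and all values are invented.
h_demo <- data.frame(
  country    = rep(c("A", "B"), each = 4),
  gdp.cap    = c(1000, 1100, 1250, 1300, 700, 760, 800, 870),
  life.expec = c(60.2, 61.0, 61.5, 63.1, 58.9, 60.3, 60.8, 62.0)
)

# One scatter panel per country: life expectancy against GDP per capita.
p <- xyplot(life.expec ~ gdp.cap | country, data = h_demo,
            xlab = "GDP per capita", ylab = "Life expectancy")
print(p)
```

With the real data, `h_demo` would simply be replaced by `h`.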
This grid of scatter diagrams doesn't differ much from the Flourish chart.
But the relationships can be interpreted better using correlations.
library("plyr", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.0")
Now the original data set h is split up into subgroups (one per country), and then a correlation is calculated for each subgroup.
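A minimal sketch of this split-and-correlate step, assuming `ddply` from plyr is used for it (the original call is not shown). `h_demo` is again a hypothetical toy data frame with invented values, and the output column name `r` is my own choice.

```r
library(plyr)

# Toy stand-in for health2.csv; countries "A" and "B" and all values are invented.
h_demo <- data.frame(
  country    = rep(c("A", "B"), each = 4),
  gdp.cap    = c(1000, 1100, 1250, 1300, 700, 760, 800, 870),
  life.expec = c(60.2, 61.0, 61.5, 63.1, 58.9, 60.3, 60.8, 62.0)
)

# Split by country, compute the Pearson correlation within each group,
# and combine the results into one data frame with a row per country.
cors <- ddply(h_demo, "country", summarise,
              r = cor(life.expec, gdp.cap))
print(cors)
```

This gives one correlation coefficient per country, which is easier to compare than a grid of panels.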
str(angola)
cor(angola$life.expec, angola$gdp.cap)
library(ggplot2)
ggplot(angola, aes(x = gdp.cap, y = life.expec)) + geom_point() + stat_smooth()
Now let's finally look at the year 2014 for all countries.
plot(year$country, year$life.expec)
abline(h = mean(year$life.expec), col = "red")
Are the differences in life expectancy real?
tan <- h[h$country == 'Tanzania', c("life.expec")]
ken <- h[h$country == 'Kenya', c("life.expec")]
t.test(tan, ken)
The null hypothesis H0 is not rejected: p > 0.05, and the 95% confidence interval for the difference in means includes zero.
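To make that reading of the test output explicit, here is a small sketch that pulls the p-value and the confidence interval out of the object returned by `t.test`. The vectors `a` and `b` are hypothetical toy samples with invented values, not the Tanzania and Kenya series.

```r
# Toy samples; all values are invented for illustration.
a <- c(60.1, 61.3, 62.0, 62.8, 63.5)
b <- c(59.4, 60.2, 61.1, 61.9, 62.6)

# Welch two-sample t-test (the default when var.equal is not set).
res <- t.test(a, b)

res$p.value    # p-value of the two-sided test
res$conf.int   # 95% confidence interval for the difference in means

# H0 is rejected at the 5% level only if p < 0.05, which is the same as
# saying that the 95% confidence interval does not contain zero.
reject <- res$p.value < 0.05
ci_excludes_zero <- res$conf.int[1] > 0 || res$conf.int[2] < 0
```

The two checks always agree, so either one can be reported.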
Translate scientific notation:
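One way to do this in base R, as a sketch: `format` with `scientific = FALSE` prints a number in fixed notation. The value `p_small` is a hypothetical example, not a result from the analysis above.

```r
# Hypothetical small value, of the kind R prints as 1.5e-05.
p_small <- 1.5e-05

# Fixed (non-scientific) notation as a character string.
fixed <- format(p_small, scientific = FALSE)
print(fixed)

# Alternatively, discourage scientific notation for the whole session:
# options(scipen = 999)
```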