Friday, 1 March 2019


Although Kaggle focuses on data science and machine learning, data journalists can find a lot of interesting knowledge and ideas on this platform.

The easiest way to get data nuggets out of this goldmine is to use Google Data Studio (see also: https://d3-media.blogspot.com/2019/02/and-even-more-viz-tools.html) and/or Google Sheets (Google Drive). Log in to Kaggle and find a data set by keyword or tag. Click the three dots in the top right corner and choose one of the two options: Sheets or Data Studio.

The second option for getting data is the API (see: https://www.kaggle.com/docs/api). Choose "Copy API command" in the menu and run that command in the terminal. A zip file with the data in .csv format will be downloaded. From here you can analyze and visualize, for example by importing the data set into RStudio. But a lot of the work has already been done.
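Before importing anything, it can help to peek inside the downloaded zip. The sketch below is only an illustration using Python's standard library: it assumes the archive contains at least one UTF-8 encoded .csv file, and the function name `preview_csv_from_zip` is made up for this post, not part of the Kaggle API.

```python
import csv
import io
import itertools
import zipfile


def preview_csv_from_zip(zip_path, max_rows=5):
    """Return (file name, header, first rows) of the first .csv in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Collect every member whose name ends in .csv (case-insensitive).
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError(f"no .csv file found in {zip_path}")
        # Read straight from the archive without unpacking it to disk.
        # Assumption: the file is UTF-8 encoded; adjust if yours is not.
        with archive.open(csv_names[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text)
            header = next(reader)
            rows = list(itertools.islice(reader, max_rows))
    return csv_names[0], header, rows
```

Calling `preview_csv_from_zip("olympics.zip")` (a hypothetical file name; use whatever the API command downloaded) gives you the column names and a few rows, which is usually enough to decide whether the data set is worth a closer look.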

Let's look for a kernel. Again, search for olympics and set the language to R. Click on a kernel you like and explore the code and the visualizations. In the end you can download the code as a Jupyter notebook or use the API (change the file extension from .irnb to .ipynb). If you have the data ready, you can run the code step by step.
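Changing the file extension can of course be done by hand, but as a small sketch it could also be scripted. The helper name `irnb_to_ipynb` and the file name in the usage note are hypothetical; the code only renames the file and does not touch its contents.

```python
from pathlib import Path


def irnb_to_ipynb(path):
    """Rename a downloaded .irnb kernel file to .ipynb and return the new path."""
    source = Path(path)
    if source.suffix != ".irnb":
        raise ValueError(f"expected a .irnb file, got {source.name}")
    target = source.with_suffix(".ipynb")
    # Refuse to overwrite a notebook that is already there.
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    source.rename(target)
    return target
```

For example, `irnb_to_ipynb("olympics-kernel.irnb")` would leave you with `olympics-kernel.ipynb`. Keep in mind that running R code in Jupyter also requires an R kernel to be installed.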

Now let's go for the advanced stuff and dig into data science. Could I run a kernel on the Kaggle platform itself? Here we leave data journalism behind and make full use of the possibilities of Kaggle.
Watch the following short videos: "Getting Started on Kaggle: Python coding in Kernels" and "How to Make a Data Science Project with Kaggle".
