Although Kaggle focuses on data science and machine learning, data journalists can get a lot of interesting knowledge and ideas out of this platform.
Data
The easiest way to get data nuggets out of this goldmine is to use Google Data Studio (see also: https://d3-media.blogspot.com/2019/02/and-even-more-viz-tools.html ) and/or Google Sheets (Google Drive). Log in to Kaggle and find a data set by keyword or tag. Click the three dots in the top right corner and choose one of the two options: Sheets or Data Studio.
The second option for getting data is the API (see: https://www.kaggle.com/docs/api ). Choose "Copy API command" in the menu and run that command in the terminal. A zip file containing the data in .csv format will be downloaded. From here you can analyze and visualize the data, for example by importing the data set into RStudio. But a lot of work has already been done.
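If you would rather take a first look at the download with a script than with RStudio, the sketch below shows one possible way to read the .csv files straight out of the zip file. It uses only the Python standard library. The function name `read_csvs_from_zip` and the file name `olympics.zip` are made up for this example, and the code assumes the .csv files are UTF-8 encoded, which may not hold for every data set.

```python
import csv
import io
import zipfile


def read_csvs_from_zip(zip_path):
    """Return {file name: list of row dicts} for every .csv file in the zip."""
    tables = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip anything in the archive that is not a .csv file.
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Assumption: UTF-8 text; "utf-8-sig" also drops a leading BOM.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                # The first row of each file is used as the column names.
                tables[name] = list(csv.DictReader(text))
    return tables


# Example use (the file name is hypothetical, use the path of your download):
# tables = read_csvs_from_zip("olympics.zip")
# for name, rows in tables.items():
#     print(name, len(rows), "rows")
```

Each row comes back as a dictionary keyed by column name, with all values as text, so numbers still need to be converted before you calculate with them.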
Kernel
Let's look for a kernel. Again, search for olympics and set the language to R. Click on the kernel you like and explore the code and visualizations. Finally, you can download the code as a Jupyter notebook or fetch it with the API (change the file extension from .irnb to .ipynb). If you have the data ready, you can run the code step by step.
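Changing the extension can be done by hand in your file manager. As a small sketch for doing it in a script, assuming the kernel was saved with an .irnb extension, something like the following would work. The function name `rename_irnb_to_ipynb` is invented for this example.

```python
from pathlib import Path


def rename_irnb_to_ipynb(path):
    """Rename a downloaded .irnb file to .ipynb and return the new path."""
    source = Path(path)
    if source.suffix != ".irnb":
        raise ValueError(f"expected an .irnb file, got: {source.name}")
    target = source.with_suffix(".ipynb")
    # Refuse to overwrite a notebook that is already there.
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    source.rename(target)
    return target


# Example use (the file name is hypothetical):
# rename_irnb_to_ipynb("olympics-kernel.irnb")
```

Only the file name changes, not the content. To actually run an R notebook, Jupyter also needs an R kernel installed.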
Advanced
Now let's go for the advanced stuff and dig into data science. Can you run a kernel on the Kaggle platform itself? Here we leave data journalism behind and make full use of Kaggle's possibilities.
Watch the following short videos: "Getting Started on Kaggle: Python coding in Kernels" and "How to Make a Data Science Project with Kaggle".