Saturday, April 25, 2020

Colombia Covid19 Pipeline and Datasets

Hi everybody,

In this post, I want to share my self-learning process in data science. For the past two weeks I have been working on the Covid19 dataset from Colombia, and I now have a dataset that I want to share with the open community.

The Project: Colombia Covid19 Pipeline

A pipeline that pulls the daily Covid19 case reports for Colombia published by the Instituto Nacional de Salud - INS and builds datasets from them.

Context


The number of new cases is increasing day by day around the world. This dataset has information about reported cases from Colombia's 32 departments.


Here you can find the result of my self-learning process in data science. This dataset contains the daily report from Instituto Nacional de Salud - INS on Covid19 cases reported in Colombia, as well as the historical report from INS on Covid19 samples processed in Colombia.

Content

This dataset uses the INS Covid19 report as its data source. I cleaned the data source and filled the NaN values, then generated this dataset with additional attributes such as day of the week, year, and month of the year.

covid19co.csv -> Daily report, Cases reported in Colombia

covid19co_samples_processed.csv -> Daily report, Samples processed in Colombia
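To make the cleaning step more concrete, here is a minimal sketch of the idea: fill the missing values and derive day of the week, year, and month from the report date. It is not the code from the pipeline repository. The column names (`report_date`, `department`, `age`), the sample rows, the `"Unknown"` fill value, and the `enrich` helper are all assumptions made up for illustration, and it uses only the Python standard library, so missing values show up as empty strings rather than NaN.

```python
import csv
import io
from datetime import date

# Hypothetical sample; the real INS report has its own column headers.
RAW = """report_date,department,age
2020-04-24,Antioquia,34
2020-04-25,,
"""

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def enrich(row, fill_value="Unknown"):
    """Fill empty fields and add date-derived attributes to one record."""
    cleaned = {key: ((value or "").strip() or fill_value)
               for key, value in row.items()}
    # Assumes report_date is always present and in ISO format (YYYY-MM-DD).
    day = date.fromisoformat(cleaned["report_date"])
    cleaned["day_of_week"] = DAY_NAMES[day.weekday()]
    cleaned["year"] = day.year
    cleaned["month"] = day.month
    return cleaned


rows = [enrich(row) for row in csv.DictReader(io.StringIO(RAW))]

for row in rows:
    print(row)
```

With a library such as pandas the same idea would probably be expressed with a fill operation and date accessors on a whole column at once; the row-by-row version here just keeps the example free of dependencies.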

This dataset is updated by an automatic pipeline. You can find the GitHub code repository here: Colombia Covid19 Pipeline

Acknowledgements

The dataset is obtained from the Instituto Nacional de Salud - INS daily Covid19 report for Colombia. You can get the official dataset here: INS - Official Report

Inspiration

What questions do you want to see answered?

You can view and collaborate on the analysis here: colombia_covid_19_analysis Kaggle Notebook Kernel.




Work in progress ...

