2018 is already here! What a year 2017 has been!
For someone who started learning data science later in the year, it feels like the year has been short. The R learning curve may seem steep, but continuous exposure to different tools and libraries/packages can make the experience simpler.
In this article, I share with you R packages under different branches of data science that have made my learning journey worthwhile so far.
Data Visualization
This is a very instrumental part of data science. For a data science newbie, the ability to create great visualizations gives you hope that you are on the right track. With great data visualizations comes a sense of appreciation for your work, especially from non-data scientists. The following packages will come in handy while visualizing in R.
1. ggplot2
This R package makes the work of visualization much easier. It implements the grammar of graphics: it takes care of plotting details, offers many graphical options and handles the layering of graphs well.
It is available on CRAN. Here is a great ggplot2 cheat sheet to get you started: ggplot2 cheat sheet
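To make the idea of layering concrete, here is a minimal sketch that builds a plot one layer at a time. It assumes ggplot2 is installed and uses the built-in mtcars data set, which is my choice for illustration and not something tied to this article.

```r
library(ggplot2)

# Start with the data and the aesthetic mappings (which columns go where)
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  # Layer 1: one point per car
  geom_point(size = 2) +
  # Layer 2: a straight-line fit per cylinder group, without the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Labels are added like any other component
  labs(
    title  = "Fuel efficiency vs. weight",
    x      = "Weight (1000 lbs)",
    y      = "Miles per gallon",
    colour = "Cylinders"
  )

print(p)
```

Each `+` adds one more piece to the plot, so you can start with the points alone and add the rest once that works.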
shiny is available on CRAN.
Data Wrangling
One of the goals of every data scientist should be maximizing data analysis time. To achieve this, one needs to ensure the data they are working with is as clean as possible and can be manipulated easily. Data wrangling is the process of cleaning up data, removing redundancy and organizing it in a way that makes analysis much easier. The following packages are great and simple data wrangling tools.
1. tidyr
From the tidyr website, tidy data is defined as data where
2. dplyr
While dealing with data, there are common manipulations that have to be carried out, and dplyr helps with these by providing verb functions for each of them. These let you filter your data and group it so that summaries reveal deeper meaning. dplyr is available on CRAN.
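As a rough illustration of these verbs, the sketch below filters, groups and summarises the built-in mtcars data set. The data set, the `hp > 100` cut-off and the column names `mean_mpg` and `n_cars` are assumptions made for the example.

```r
library(dplyr)

summary_by_cyl <- mtcars %>%
  filter(hp > 100) %>%               # keep only cars with more than 100 horsepower
  group_by(cyl) %>%                  # one group per number of cylinders
  summarise(
    mean_mpg = mean(mpg),            # average fuel efficiency in each group
    n_cars   = n()                   # number of cars in each group
  ) %>%
  arrange(desc(mean_mpg))            # most efficient group first

print(summary_by_cyl)
```

Reading the pipeline top to bottom gives the order of operations, which is a large part of why dplyr code is easy to follow for a beginner.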
Data Mining
This is one of the biggest challenges for data science newbies. Although many websites offer free, open data sets, it is also a rewarding feeling for a data science newbie to learn how to extract a data set from the numerous sources of information on and off the web. The following libraries will do the magic:
1. httr
This package enables you to access data via modern web APIs. It provides functions named after the HTTP verbs, responses (often JSON) can be parsed into R objects, and it supports OAuth. This makes it easy for a newbie to work with APIs in R. This package is available on CRAN.
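Here is a minimal sketch of what a request could look like. The helper name `fetch_json` and the URL in the commented-out call are hypothetical placeholders, and I parse the JSON explicitly with the jsonlite package, which is one reasonable way to do it and not the only one.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: request a JSON endpoint and return it as an R object
fetch_json <- function(url, query = list()) {
  resp <- GET(url, query = query)    # GET() mirrors the HTTP verb of the same name
  stop_for_status(resp)              # turn HTTP errors (4xx/5xx) into R errors
  body <- content(resp, as = "text", encoding = "UTF-8")
  fromJSON(body)                     # JSON text -> list or data frame
}

# Example call (placeholder URL, not a real endpoint):
# result <- fetch_json("https://api.example.com/items", query = list(page = 1))

# The parsing step on its own, using a made-up JSON string:
parsed <- fromJSON('{"package": "httr", "verbs": ["GET", "POST"]}')
print(parsed$verbs)
```

Keeping the request and the parsing in one small function means the rest of your analysis only ever sees ordinary R objects.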
2. rvest
An R package for web scraping. It reads HTML documents from URLs, selects parts of the document using CSS selectors and parses HTML tables into data frames in R. This package is available on CRAN.
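The sketch below walks through those three steps on a small, made-up HTML snippet so that it runs without a network connection; in practice you would pass a URL to `read_html()` instead. It assumes rvest 1.0 or later, where `html_element()` is available.

```r
library(rvest)

# Made-up HTML for illustration only; replace the string with a URL to scrape a real page
page <- read_html('
<html><body>
  <h1 class="title">R packages</h1>
  <table id="pkgs">
    <tr><th>package</th><th>purpose</th></tr>
    <tr><td>httr</td><td>web APIs</td></tr>
    <tr><td>rvest</td><td>web scraping</td></tr>
  </table>
</body></html>')

# Select one element with a CSS selector and extract its text
heading <- page %>% html_element("h1.title") %>% html_text2()

# Select the table by its id and parse it into a data frame
pkg_table <- page %>% html_element("#pkgs") %>% html_table()

print(heading)
print(pkg_table)
```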
The first days of data science can be a bit confusing, but focusing on each one of these branches can help you understand data science step by step.
I wish you a great learning experience in 2018. Don't stop learning.
Feel free to reach out to me via Twitter @lornamariak. I am happy to help and give some hype/support. Happy coding!