DEV Community

Rohit Farmer


Learning R: Rant 1

I learned machine learning and data science with Python last year at my previous job, and this year I am doing more or less the same work in R at my current job. Learning R is a pain right now because, coming from Python, R looks like a giant mess. There are any number of packages, and many of them do more or less the same thing. For example, right now I am learning about data frames and tables in R. You have the traditional R data frame, which is not very easy to work with. So there is an enhanced version of it from the data.table package. The former gives you a data frame and the latter gives you a data table. On top of those two, there is something else called a tibble, which is produced by the packages in the tidyverse. Why isn't there just one package, like pandas for data frames and NumPy for n-dimensional arrays?

Top comments (2)

Dave Parr • Edited

Might be a bit late, but my advice:

Use tidyverse.

If you need to, you can learn the differences later: under the hood, a tibble is a data.frame with bells and whistles. A data.table is just a data.frame too, but with different bells and whistles. I hope that's helpful rather than confusing :)

Another way of looking at it: data.frame is baked into the language at a fundamental level, even more so than NumPy and pandas are in Python, where both are third-party packages. It's native to base R. These other packages extend it in specific, opinionated, and usually pretty useful ways.
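
If you want to see that relationship for yourself, here is a minimal sketch. It assumes the tibble and data.table packages are installed, and the toy data and the variable names `df`, `tb`, and `dt` are just made up for illustration:

```r
# Assumes install.packages(c("tibble", "data.table")) has been run.
library(tibble)
library(data.table)

# The same toy data in all three flavours
df <- data.frame(x = 1:3, y = c("a", "b", "c"))  # base R data frame
tb <- as_tibble(df)                               # tidyverse tibble
dt <- as.data.table(df)                           # data.table

# Each enhanced type keeps "data.frame" at the end of its class vector
class(df)  # "data.frame"
class(tb)  # "tbl_df" "tbl" "data.frame"
class(dt)  # "data.table" "data.frame"

# So the base R check is TRUE for all three
is.data.frame(df)
is.data.frame(tb)
is.data.frame(dt)

# And you can always drop back to a plain data frame
plain_from_tb <- as.data.frame(tb)
plain_from_dt <- as.data.frame(dt)
class(plain_from_tb)  # "data.frame"
class(plain_from_dt)  # "data.frame"
```

Because both types inherit from data.frame, most functions written for a plain data frame will accept a tibble or a data.table, although printing and subsetting behave a little differently in each.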

Matt Curcio

I love and hate R for all the reasons you mentioned. Haha