2 points by gipp 3268 days ago | link | parent

As someone who's never spent much time using R, I'm always curious why the strength of CRAN over PyPI is so often mentioned as one of R's main advantages. I don't recall ever wanting to try an approach and not finding something relevant on PyPI (99% of the time some combination of statsmodels, pandas, pymc and/or sklearn gets it done easily).

Can someone give me some examples of where there are "no module replacements for the 100s of essential R packages"? The idea of Python's massive ecosystem somehow being a negative is strange to me.



2 points by TheCartographer 3265 days ago | link

I'm not sure about "100s of essential R packages." If there are 100 essential R packages, then to me this would suggest R isn't doing what it is supposed to do and the users are writing functionality to get around that. I think there are probably 5, maybe 10 essential R packages (devtools, ggplot2, reshape2, pick-your-poison-ODBC-driver package, a few others).

What R brings to the table, just like any other FOSS ecosystem, is its community. And the community for R is academics and other high-level statisticians and researchers. And it's a big community: 6666 packages in the CRAN repository as of today, plus stuff on other repo systems like GitHub and spinoffs like Bioconductor.

The majority of those packages are of limited use to the general user. Where their strength lies is in the specialist implementations of specific algorithms/analytics/tools etc.

So whether it's a standard analytic technique for a specific field of study, or a cutting-edge technique that is just becoming a topic of research in the literature, someone has probably implemented it in R already.

Python will always be more effective for general use, data manipulation, I/O, etc. It's a great Swiss Army knife.

R is a poor Swiss Army knife, but a great scalpel. If you would rather use someone's N-space vector decomposition or cutting-edge classification algorithm than go to the trouble of implementing your own, R is awesome. There's already a package that implements Author et al. 2010.

For more general work, Python is king.

-----



