DataTau
R vs Python: Head to head data analysis (dataquest.io)
23 points by vikp 3116 days ago | 5 comments


2 points by SixSigma 3111 days ago | link

Unfortunate misuse of the term "Functional" to mean "Procedural"; it damages my trust in the author's knowledge.

-----

2 points by mikeskim 3116 days ago | link

I think people are moving to Python in the competitive predictive analytics community because of neural nets (Theano, nolearn, Lasagne, neon, Keras, etc.). I have yet to see any useful deep learning packages in R. At this point, using R almost requires basic proficiency in libraries like data.table, which makes the learning curve even steeper for non-native users.

-----

2 points by debrouwere 3116 days ago | link

> using R almost requires basic proficiency in libraries like data.table

I think a lot of people just use plain old data frames, together with plyr 2 – which is really not any harder to work with than Pandas' `DataFrame#groupby` functionality, though I will grant that having to look for it in a separate library is a bit annoying for newcomers.
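For anyone who hasn't seen the pandas side of that comparison, here is a minimal sketch of a split-apply-combine step with `groupby`. The data frame and its column names (`team`, `points`) are made up purely for illustration.

```python
import pandas as pd

# Toy data; the column names and values are hypothetical.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "points": [10, 20, 30, 50],
})

# Split the rows by team, then take the mean of points within each group.
mean_points = df.groupby("team")["points"].mean()

print(mean_points)
```

The result is a Series indexed by `team`, roughly the role a grouped summary plays on the R side.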

OTOH, in Python you will find that everything in the NumPy/SciPy universe works one way (vectorized functions, NA objects, its own float type and so on) and everything in plain Python another way. Not exactly ideal either.
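A rough sketch of that split, assuming only NumPy is installed; the variable names are arbitrary and the values are toy data.

```python
import numpy as np

values = [1.0, 2.0, 3.0]
arr = np.array(values)

# NumPy: arithmetic is vectorized, so * 2 doubles every element.
doubled_arr = arr * 2

# Plain Python: the same operator repeats the list instead.
doubled_list = values * 2

# NumPy has its own scalar types. np.float64 happens to subclass the
# built-in float, but np.float32 does not.
float64_is_float = isinstance(arr[0], float)
float32_is_float = isinstance(np.float32(1.0), float)

# Missing values: NumPy uses NaN, which propagates through a sum
# unless a NaN-aware function such as np.nansum is used.
with_nan = np.array([1.0, np.nan, 3.0])
nan_sum = with_nan.sum()
safe_sum = np.nansum(with_nan)

print(doubled_arr, doubled_list)
print(float64_is_float, float32_is_float)
print(nan_sum, safe_sum)
```

The same `*` operator means two different things depending on whether the operand is a list or an array, which is the kind of inconsistency meant here.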

-----

1 point by whitebear 3114 days ago | link

Several of the R chunks are unnecessarily complex; there's base::colMeans, stats::kmeans, graphics::pairs, etc. R's got the rvest package now, which is comparable to BeautifulSoup. Having used both extensively for database-related work, I find "saving to databases" is as easy in R as it is in Python, and maybe even easier with the ORM-like support in the dplyr package. Python's been playing catch-up in the data analysis sphere lately, but with every gain that Python makes, R is already going to be that much further ahead. However, I second the other commenter's remark that Python is ahead in the deep learning domain.

-----

1 point by lackadaisically 3115 days ago | link

R seems much easier for quick statistics, modeling, and rapid prototyping.

Python would be better in general, minus some statistical libraries, and it loses some of the ease of the built-in stat functions that R has.

I could see myself investing much more in R for my statistics career if I were mainly a statistician who does data cleaning and data modeling.

Python would be more well-rounded if I were in a position that is more computer science/programming. Or if I wanted more speed, I would move my code from R to Python.

BTW, I'm going to choose R for now for my statistics graduate major; there are many R books and stats books that use R.

I'll move to Python when/as needed.

-----



