3 points by kiyoto 3337 days ago | link | parent

Disclaimer: I am an active R user who once wrote a lifetime's worth of Python in finance.

First, I am with you: R's syntax is idiosyncratic and rather horrible in places. Python is much easier on the eyes and more consistent syntactically. Also, dealing with strings is confusing at best in R and a piece of cake in Python.

That said, R's semantics are pretty powerful, and ggplot2/dplyr (or any R library by Hadley Wickham) take full advantage of R's expressiveness. The "%>%" operator is surely ugly, but as far as I know (happy to be proven wrong), that kind of operator is not even implementable in Python.



1 point by rlayton 3337 days ago | link

(Never used R) What does %>% do? My best guess would be that it performs the modulo operation, but that doesn't fit with your comment.

-----

4 points by kiyoto 3337 days ago | link

It's the "pipe" operation. So, if you do

data %>% group_by(column)

That's the same as

group_by(data, column)

Essentially, this allows computations to be written with little nesting, like

data %>% group_by(column) %>% summarise(f = length(another_column)) %>% filter(f > 20)

A similar idea in other languages is method chaining, which is how pandas achieves much the same effect.

I personally like "%>%" better than method chaining, probably because I think in a more functional than object-oriented way. But I now feel like I am opening a different can of worms.
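
For comparison, here is a rough sketch of what the method-chaining version of the pipeline above could look like in plain Python. The `Table` class and its `group_by`, `summarise` and `filter` methods are hypothetical stand-ins written for this example; they are not the pandas API. The threshold is lowered to 1 so the toy data produces a result.

```python
from collections import defaultdict


class Table:
    """Hypothetical minimal table: a list of dict rows (not pandas)."""

    def __init__(self, rows, group_column=None):
        self.rows = list(rows)
        self.group_column = group_column

    def group_by(self, column):
        # Only records the grouping column; the grouping itself happens in summarise().
        return Table(self.rows, group_column=column)

    def summarise(self, **aggregates):
        # Bucket rows by the grouping column, then apply each aggregate to every bucket.
        groups = defaultdict(list)
        for row in self.rows:
            groups[row[self.group_column]].append(row)
        summaries = []
        for key, members in groups.items():
            summary = {self.group_column: key}
            for name, func in aggregates.items():
                summary[name] = func(members)
            summaries.append(summary)
        return Table(summaries)

    def filter(self, predicate):
        return Table([row for row in self.rows if predicate(row)], self.group_column)


data = Table([
    {"column": "a", "another_column": 1},
    {"column": "a", "another_column": 2},
    {"column": "b", "another_column": 3},
])

# Each method returns a new Table, so the calls read left to right like a pipe.
result = (
    data.group_by("column")
        .summarise(f=len)
        .filter(lambda row: row["f"] > 1)
)
print(result.rows)  # [{'column': 'a', 'f': 2}]
```

Note that every step has to be a method of `Table`, which is exactly the part that the pipe version avoids.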

-----

1 point by ubercode5 3336 days ago | link

I am with you there: piping is a very powerful operation and makes more sense from a purely functional perspective.

Method chaining isn't too terrible, but it also means those functions need to be attached to the object, which makes them hard to extend in a reusable way if you aren't the author. Maybe we should petition the Python community for piping :).

The even uglier option would be function nesting, a(b(c(data))), which feels like reading reverse Polish notation.
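
To make the contrast concrete, here is a minimal sketch of a pipe-like style in Python next to the nested version. It relies on operator overloading through `__ror__`; the `Pipe` wrapper and the helper functions `drop_negatives` and `scale` are made up for this example. It only approximates the left-to-right flow of "%>%" and does nothing like R's handling of bare column names.

```python
class Pipe:
    """Hypothetical wrapper so that `value | Pipe(func, *args)` calls func(value, *args)."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __ror__(self, value):
        # Python falls back to this when the left operand cannot handle `|` itself.
        return self.func(value, *self.args, **self.kwargs)


def drop_negatives(values):
    return [v for v in values if v >= 0]


def scale(values, factor):
    return [v * factor for v in values]


data = [3, -1, 4, -1, 5]

# Nested calls read inside out.
nested = sum(scale(drop_negatives(data), 2))

# The piped version reads left to right and works with functions you did not write.
piped = data | Pipe(drop_negatives) | Pipe(scale, 2) | Pipe(sum)

print(nested, piped)  # 24 24
```

One caveat: if the left-hand object defines its own `|` and accepts arbitrary operands, `__ror__` is never reached, so this trick is not a general replacement for a real pipe operator.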

-----



