1 point by grahama 3082 days ago | link | parent

xgboost is a messy install?

  git clone https://github.com/dmlc/xgboost
  cd xgboost
  bash build.sh
  cd python-package
  pip3 install -e .

And fwiw, if you are having trouble with virtualenvs, I don't think Docker is going to be any easier. Both are annoying and frustrating when starting out. Also, reading this post, it seems like there are a lot of not-entirely-true statements that either make containers sound better than they are or suggest the author doesn't have a thorough understanding of what Docker actually is.


1 point by bkd9 3074 days ago | link

Huh, I just tried this and it worked. This was not my experience in August-- I spent hours installing xgb and one member of my team never got it to work. It looks like xgb has matured a lot since I first tried it, and has become much more accessible, which is great!

I stand by my point that some libraries are more difficult to install than a quick pip. I just got through installing vispy, which requires a backend that was tricky to get working.

-----

1 point by vikp 3082 days ago | link

On the package installation front, the "best case" install is always great, but there are strange platform-specific and other inconsistencies that can cause hard-to-debug issues. Just google "xgboost installation error" to see.

In our experience helping people new to data science get started, package installation is a non-trivial hurdle. Docker has helped reduce the number of error cases substantially. At the very least, it reduces the number of installation issues to debug to 1 -- Docker itself. Docker is also evolving rapidly, and you may have used a previous version.

As for making containers sound better than they are or not having a thorough understanding: this article is targeted at those new to Docker and makes some simplifications. If you want to highlight specific inaccuracies, I would love to discuss them, but otherwise this comes across as FUD.

-----

1 point by grahama 3081 days ago | link

Oh sorry, I can definitely see how it came across like that in retrospect. In terms of Docker being easier or better than VMs, I think this is the best explanation of how people often misunderstand what Docker and Vagrant are (although it may be outdated, even though it's only a year old): http://stackoverflow.com/a/21314566/4696622

I've used both, and I generally stick with Vagrant. Unless you are already on a Linux machine, you will need a VM anyway, at which point the startup times are pretty much equivalent, and an actual VM is a lot more useful for dev/toy stuff than a mishmash of Docker containers (well, from MY experience of course).

I think both are cool, and I'm sure someone much smarter than me can explain when to use one over the other, but there really are a lot of people who discuss the pros and cons while having little in-depth experience with either (not saying that about this post; it's aimed more at HN comments I've seen). I also think knowing how to configure virtualenvs/conda is important, because that is becoming a really fundamental part of understanding Python (venv, virtualenv, or whatever other options there currently are).
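For anyone new to this, here is a minimal sketch of the venv route using only the standard library's `venv` module. The directory name `demo-venv` is just a placeholder (not something from the post), and this assumes a `python3` with `venv` support is on your PATH:

```shell
# Hypothetical location for the environment; any writable path works.
VENV_DIR="${VENV_DIR:-./demo-venv}"

# Create an isolated environment with the standard-library venv module.
python3 -m venv "$VENV_DIR"

# Activate it for the current shell session.
. "$VENV_DIR/bin/activate"

# Inside a venv, sys.prefix points at the environment rather than the
# base interpreter, so this comparison prints True.
python -c 'import sys; print(sys.prefix != sys.base_prefix)'

# Leave the environment again.
deactivate
```

Once activated, `pip install` puts packages into that directory instead of the system Python, which is what keeps one project's dependencies from clobbering another's.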

-----



