1 point by vikp 3082 days ago | link | parent

On the package installation front, the best-case install is always great, but platform differences and other inconsistencies can cause hard-to-debug issues. Just google "xgboost installation error" to see.

In our experience helping people new to data science get started, package installation is a non-trivial hurdle. Docker has substantially reduced the number of error cases. At the very least, it cuts the installation issues you have to debug down to one: Docker itself. Docker is also evolving rapidly, so you may have used an older version.

As for making containers sound better than they are, or not having a thorough understanding: this article is targeted at people new to Docker and makes some simplifications. If you want to highlight specific inaccuracies, I would love to discuss them, but without specifics this comes across as FUD.



1 point by grahama 3082 days ago | link

Oh sorry, I can definitely see in retrospect how it came across like that. As for Docker being easier or better than VMs, I think this is the best explanation of how people often misunderstand what Docker and Vagrant are (although it may be outdated, even though it's only a year old): http://stackoverflow.com/a/21314566/4696622 I've used both, and I generally stick with Vagrant: unless you are already on a Linux machine, you need a VM anyway, at which point the whole 'startup time' argument is pretty much a wash, and an actual VM is a lot more useful for dev/toy work than a mishmash of Docker containers (well, in MY experience, of course).

I think both are cool, and I'm sure someone much smarter than me can explain when to use one over the other, but there really are a lot of people who discuss the pros and cons with little in-depth experience of either (not saying that about this article; it's aimed more at HN comments I've seen). I also think knowing how to configure virtualenvs/conda is important, because environment management (venv, virtualenv, or whatever other options there currently are) is becoming a really fundamental part of understanding Python.
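
For anyone new to the venv side of this, here is a minimal sketch using only the standard library's `venv` module. The helper name `create_env` and the directory name `demo-env` are made up for illustration, and it skips pip (`with_pip=False`) so it runs offline; it is one possible way to do it, not the only one.

```python
import os
import tempfile
import venv


def create_env(env_dir):
    """Create a virtual environment at env_dir and return the path to its config file."""
    # with_pip=False keeps this fast and offline; pass True to bootstrap pip as well.
    # clear=True wipes env_dir first if it already exists.
    builder = venv.EnvBuilder(with_pip=False, clear=True)
    builder.create(env_dir)
    # venv writes a pyvenv.cfg file at the root of every environment it creates.
    return os.path.join(env_dir, "pyvenv.cfg")


if __name__ == "__main__":
    # Hypothetical location: a throwaway temp directory, removed on exit.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(os.path.join(tmp, "demo-env"))
        print(os.path.isfile(cfg))
```

From a shell, `python -m venv demo-env` does roughly the same thing, with pip included by default.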
