Hi, one of the curators here.
We started Deep Learning Weekly in August of last year and currently have over 2000 subscribers.
Apart from 'newsy' items, we keep an eye out for interesting open source projects and research papers.
Any feedback is very much appreciated.
Hi DataTau! I recently downloaded California's SWITRS database listing all the accidents reported to the police and have started combing through it. This post is my first look trying to answer the question: "When do people crash?"
It covers day of the year and day of the week. I plan to do time of day later (I'm especially interested in looking at time relative to sunrise/sunset, but that'll take a bit more work).
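A rough sketch of how the day-of-week and day-of-year counts could be computed, assuming the crash dates have already been parsed into Python `date` objects. The dates below are made up for illustration (they are not SWITRS records), and the function names are hypothetical.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Made-up crash dates for illustration only, not real SWITRS data.
crash_dates = [
    date(2015, 1, 1),
    date(2015, 1, 2),
    date(2015, 1, 3),
    date(2015, 1, 9),
    date(2015, 7, 4),
]


def crashes_by_weekday(dates):
    """Count crashes per weekday name (date.weekday() returns 0 for Monday)."""
    return Counter(WEEKDAYS[d.weekday()] for d in dates)


def crashes_by_day_of_year(dates):
    """Count crashes per (month, day), pooling all years together."""
    return Counter((d.month, d.day) for d in dates)


weekday_counts = crashes_by_weekday(crash_dates)
day_counts = crashes_by_day_of_year(crash_dates)

for name in WEEKDAYS:
    print(name, weekday_counts[name])
```

Keying on (month, day) rather than the ordinal day number keeps holidays like July 4 aligned across leap and non-leap years.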
I've done a number of these data analysis 'tests' as part of interviews for data scientist / analyst roles. They typically vary -- usually (hopefully!) the data set & questions are highly specialized for the company + the role you are applying for. So as for ways to practice... I'm not sure. If you're skilled in R / Python / Matlab / SQL or whatever the role requires, you should be fine. A few tips:
1. Be explicit: if you are making an assumption in your code, write why. Write out your logic & reasoning at every step.
2. Watch out for missing / 'bad' data. Write out how you would deal with messy data in real life.
3. Pay attention to the time limit. If you don't know, contact the recruiter to ask what's expected.
4. Interpret your results! If they ask you to, say, find the R^2 between x and y, give them the number and then tell them what that might mean.
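To make points 1, 2 and 4 concrete, here is a minimal sketch in plain Python. The data and the function names (`drop_missing`, `r_squared`) are hypothetical, and dropping incomplete rows is only one of several reasonable ways to handle missing values; the point is to state the choice in a comment.

```python
import math


def drop_missing(pairs):
    """Keep only (x, y) pairs where both values are present and not NaN.

    Assumption (stated explicitly, as in point 1): rows with a missing
    value are dropped rather than imputed.
    """
    clean = []
    for x, y in pairs:
        if x is None or y is None:
            continue
        if math.isnan(x) or math.isnan(y):
            continue
        clean.append((x, y))
    return clean


def r_squared(pairs):
    """Squared Pearson correlation between x and y."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return (sxy * sxy) / (sxx * syy)


# Made-up data with two 'bad' rows (a None and a NaN).
raw = [(1, 2.0), (2, 4.1), (3, None), (4, 7.9), (float("nan"), 1.0)]

clean = drop_missing(raw)
r2 = r_squared(clean)

# Point 4: report the number *and* what it means.
print(f"Dropped {len(raw) - len(clean)} of {len(raw)} rows with missing values.")
print(f"R^2 = {r2:.3f}: about {r2:.0%} of the variance in y is "
      f"explained by a straight-line fit on x.")
```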
Don't stress. Hopefully the 'test' is fun and well thought out. I've always found them extremely helpful in understanding the problems that I would be solving if I were to end up at the company. On the other hand, I've heard that some companies don't put a lot of time into making these assignments relevant, so for me, that would be a warning sign.
I am doing this course at the moment as an introduction to both Data Science and Python. I only had some basic Python knowledge beforehand, but I am still able to follow the course with some additional googling.
Hey guys! I just wrote an article on how to integrate RMarkdown with a static Pelican blog that I thought some of you might find interesting.
I've been using Pelican for a few years now to host my blog, but it's always been a bit of a pain converting any R analysis into a finalized blog post. One of my main goals for the year is building out my blog with more content, so I really wanted to streamline things, and I think this should go a long way.
I'm happy to answer any questions if anybody is considering doing something similar!
Garbage finding - there are obviously going to be latent variables related to background and socioeconomic status. This framework is much more likely to replicate some type of ugly profiling that's going on within the justice system than to determine criminality.
I can confirm that there is some kind of weird behavior with the RSS feed.
Even when it is fetched by Feedly, the titles don't get marked as read (by pressing the Mark as Read button) until I click on each title separately.
My rough guess is that the feed output doesn't conform well to common standards.
Thanks for the note!
I did mention colorblindness in the article.
In case you missed it here is the quote:
"Colors can be hard to distinguish from one another. If you have colors of similar hue, it can be difficult to tell them apart. In addition, you might prevent those who are color blind from understanding what you are communicating."
Can't believe nobody mentioned colourblind readers. I'm not into the political correctness thing at all but if you want to have as broad an audience as possible, this is the way to go.
From "The Elements of Statistical Learning":
"Our first edition was unfriendly to colorblind readers; in particular, we tended to favor red/green contrasts which are particularly troublesome. We have changed the color palette in this edition to a large extent, replacing the above with an orange/blue contrast."
For this post, we scraped various signals (e.g. technical maturity, popularity of the library, size of the community behind the library, social media mentions, etc.) for more than 50 open source libraries from the web.
We fed all of the above signals to a trained Machine Learning algorithm to compute a score and rank the top libraries.
I just finished this course. I would say that my skill level in Python would be average, but I was able to handle this course. I thoroughly enjoyed the programming assignments (the last one took me forever to figure out, but it was super rewarding when I finished).
For this post, we scraped various signals (e.g. online ratings/reviews, topics covered, author social influence in the field, year of publication, social media mentions, etc.) for more than 10 Recommender Systems books from the web.
We fed all of the above signals to a trained Machine Learning algorithm to compute a score and rank the top recommender systems books.
For this post, we scraped various signals (e.g. online ratings/reviews, topics covered, author social influence in the field, year of publication, social media mentions, etc.) for more than 10 Data Mining books from the web.
We fed all of the above signals to a trained Machine Learning algorithm to compute a score and rank the top books.
I went through both for my dissertation but settled on ESL.
It depends on your needs: AISL is fairly mathematical, while ESL focuses more on concepts, so it is better for developing your understanding of the algorithms.
I am a Junior Data Scientist based in London with a strong background in Engineering Sciences.
I have been self-learning Machine Learning and related skills throughout the past year, leveraging my background in Maths and Engineering Sciences.
Here, I share a list of machine learning books that are now on my bookshelf.
Most of these books have a free version available on their website and can be ordered from Amazon. I have included links to relevant HN discussions, since that is how I found out about most of these books.
The Azure products for data science are _not ready_ for serious usage. They are very unstable, with lots of fundamental features still to be figured out, such as how to install a package once and not have to install it again the next time you log in. There is no roadmap for improvements.