
It'll all depend on your budget, of course, but the GTX 1080 or the Titan seems to be a solid choice for deep learning applications.

This article has a lot more about GPU options:

http://timdettmers.com/2017/04/09/which-gpu-for-deep-learnin...

The CPU probably isn't quite so important if you have a good GPU, since the GPU is going to be doing most of the heavy lifting. I ended up getting an Intel Core i7.

The amount of RAM you need again depends on your application and the size of your datasets. More is better, but costs money.
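
Once the box is built, it's worth a quick sanity check that the GPU, not the CPU, will actually do the heavy lifting. A minimal sketch, assuming a CUDA build of PyTorch (other frameworks have equivalents):

    import torch

    if torch.cuda.is_available():
        # Confirm which card was picked up and how much VRAM it has;
        # VRAM is the GPU-side analogue of the RAM question above.
        print("GPU:", torch.cuda.get_device_name(0))
        props = torch.cuda.get_device_properties(0)
        print("VRAM: %.1f GB" % (props.total_memory / 1024 ** 3))
    else:
        print("No CUDA GPU found; training would fall back to the CPU.")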


I work on an open source data compiler designed to bring the convenience of packaging to data. The goal is to reduce data prep and get data scientists to the analysis faster.

PRs and feature requests are welcome on GitHub: https://github.com/quiltdata/quilt
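
The basic workflow is roughly this - treat it as a sketch rather than the authoritative docs; the package name is the demo one and the exact API may differ:

    import quilt

    # Fetch a versioned data package once; it's cached locally afterwards.
    quilt.install("uciml/iris")

    # Packages import like code, so the data prep happened before import.
    from quilt.data.uciml import iris
    df = iris.tables.iris()  # deserializes into a pandas DataFrame
    print(df.head())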


That's an interesting paper - thanks for sharing!

Actually, in the post I do mention that I chose months for the analysis, but a more granular view can be chosen, like days or weeks... it might be the case that for the business model you're interested in, it makes more sense to analyze churn in days - and you can do that! :)

Also, (as an example) at telecom operators, for pre-paid customers you don't know when the user churns because there is no subscription. The method I describe allows us to assign a churn date to customers based on inactivity. You can definitely recover customers who have been inactive for 3 months... although it's less likely than recovering customers inactive for 2 months... etc.
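
A minimal sketch of the inactivity idea in pandas - the column names, the cutoff date, and the 3-month window are all just for illustration:

    import pandas as pd

    events = pd.DataFrame({
        "customer_id": [1, 1, 2, 2, 3],
        "activity_date": pd.to_datetime([
            "2017-01-05", "2017-02-20", "2017-01-10",
            "2017-05-02", "2016-12-01"]),
    })

    last_seen = events.groupby("customer_id")["activity_date"].max()
    cutoff = pd.Timestamp("2017-06-01")  # "today" for the analysis

    # A customer is labelled churned after N periods of inactivity; swap
    # days/weeks/months here to match the granularity of your business.
    churned = (cutoff - last_seen) > pd.Timedelta(days=90)  # ~3 months
    print(pd.DataFrame({"last_seen": last_seen, "churned": churned}))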


What is the usefulness of the probability of a user having churned three months after he last visited the system? There is a method outlined in the article '"Counting Your Customers" the Easy Way: An Alternative to the Pareto/NBD Model' based on a similar idea, but with much more useful tools and more robust mathematics.
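
For anyone who wants to try it, that model (the BG/NBD) is implemented in the Python lifetimes package; a minimal example on its bundled CDNOW sample data:

    # pip install lifetimes
    from lifetimes import BetaGeoFitter
    from lifetimes.datasets import load_cdnow_summary

    data = load_cdnow_summary(index_col=[0])  # frequency / recency / T

    bgf = BetaGeoFitter(penalizer_coef=0.0)
    bgf.fit(data["frequency"], data["recency"], data["T"])

    # Probability that a customer with this purchase history is still
    # "alive" - the churn question, but with a model behind it.
    print(bgf.conditional_probability_alive(frequency=2, recency=30.0, T=200.0))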

A blog post in Spanish... yeah, sure.

The slides are really interesting - I should check out the video if there is one! Cheers
1 point by weirdtunguska 8 days ago | link | parent | on: Jupyter Notebook 5.0

Really nice features - just need a way to create a document without code and use variables in the markdown.

This is a post on sampling policies for Bayesian optimization by Research Engineer Harvey. He's here to answer any questions if you have them!

Pretty informative.

I applied to one of their programs a while ago, with a few years of industry experience plus an undergrad degree, and made it into the interviews before my plans changed.

It didn't seem like a very strict requirement, if you were otherwise competitive.


Good luck

Somehow your comment comes across as very innocent and cute :) Just don't worry about it, man - it wasn't directed at you, it is directed at physicists, and to us it makes sense.

We do indeed approach most problems as parametric regression, and usually there are some bounds on the parameter values from outside the data. Sometimes this is called fitting a curve.
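
A toy example of what I mean, with the bounds coming from the physics rather than from the data (all numbers made up):

    import numpy as np
    from scipy.optimize import curve_fit

    def model(t, amplitude, decay):
        return amplitude * np.exp(-decay * t)

    t = np.linspace(0, 10, 50)
    y = model(t, 2.0, 0.3) + 0.05 * np.random.randn(t.size)

    # Bounds come from outside the data: both parameters must be positive,
    # and suppose we know this process can't decay faster than 1.0.
    popt, pcov = curve_fit(model, t, y, p0=[1.0, 0.5],
                           bounds=([0.0, 0.0], [np.inf, 1.0]))
    print("amplitude=%.2f, decay=%.2f" % tuple(popt))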


Eh?

"Curve fitting" is not a term I'm familiar with in statistics.

From what the article is trying to say, it seems like it's really some kind of spline or non-parametric approach, but at the same time it's saying we know our parameters.

> You gather a set of data, you visualize it, create a fit and build a model around that fit so you can interpolate. Majority of the time, if not every time, you know exactly what parameters are in the dataset as they correspond to some physical event. Building fits help you extract a mathematical equation that will dictate how the event will act in the future given the parameters are the same. Since you know the parameters (and in the event you know how the event was setup), you can tailor your errors and uncertainties more carefully.

No, in statistics you don't know the parameters, so you're estimating them from the data via splines, kernel density estimation, random forests, etc. - hence it's either semi-parametric or non-parametric.

So this paragraph contradicts itself.
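
For contrast, here's what estimating from the data actually looks like - a smoothing spline, with nothing to "know" up front (a toy sketch):

    import numpy as np
    from scipy.interpolate import UnivariateSpline

    x = np.linspace(0, 10, 100)
    y = np.sin(x) + 0.2 * np.random.randn(x.size)

    # The spline's shape is estimated from the data; s only controls
    # smoothness, it isn't a parameter with physical meaning.
    spline = UnivariateSpline(x, y, s=x.size * 0.05)
    print(spline(5.0))  # interpolate at a new point, as in the quote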

> Regression is a far more loaded term and has a lot of connections to machine learning. Admittedly, curve fitting also sounds simpler. It’s not. Regression analysis is most commonly used in forecasting and building predictions. It deals with the relationship between the independent variable and the dependent variables and how the dependent variables change when the independent variable is changed.

You could just say that regression makes it easier to interpret the relationship between predictors and response, compared to, say, a random forest or a neural network.

1 point by MarkBallard 17 days ago | link | parent | on: Essay examples

Those photos also show that a 125-foot-wide plane supposedly made only a 16-foot-wide hole in the building. Three barrier walls into the Pentagon, an identical 16-foot-wide hole is present, supposedly having nothing to do with witness reports that a second explosion was heard. The only debris around the site was small enough to be carried out by hand. Official reports say that the plane evaporated in the heat of burning jet fuel, even though pictures show other flammable objects (tree branches, for example) intact around the crash site. This is the first time in aviation history that a plane evaporated from a jet fuel fire. Even pictures of the explosion show no plane. Even if the plane could have evaporated due to jet fuel fires, it certainly could not have evaporated instantaneously. Apparently, film is a tricky subject. Although the Sheraton Hotel, a nearby gas station, and highway monitoring cameras all caught the impact on video, none of the footage was released. In fact, as employees of the Sheraton and the gas station sat in shock watching the footage, government officials came in to confiscate the tapes. Officials also had the employees sign waivers of silence. Only five still frames were released from the footage, and none of them show a plane. Officials hid what happened at the Pentagon, but what they deemed acceptable to show in New York diminishes any lingering confidence in the integrity of our government.


6. Perform a home trepanation with a power drill and a shop vac.

As someone who works in R quite a bit: drilling a hole in your head before you try to learn the language is probably your best bet.

1 point by jahan 19 days ago | link | parent | on: Top 16 ML, NLP, and Data Mining Books

For this post, we have scraped various signals (e.g. reviews & ratings, topics covered in the book, author influence in the field, etc.) from the web for more than 100 Machine Learning, Data Mining, and NLP books.

We have combined all signals to compute a score for each book and rank the top books.
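
Roughly, the scoring step looks like this - the signal names and weights here are illustrative, not the exact ones we used:

    import pandas as pd

    books = pd.DataFrame({
        "title": ["Book A", "Book B", "Book C"],
        "avg_rating": [4.6, 4.1, 4.8],
        "num_reviews": [1200, 300, 90],
        "author_influence": [0.9, 0.4, 0.7],
    })

    signals = ["avg_rating", "num_reviews", "author_influence"]
    weights = {"avg_rating": 0.4, "num_reviews": 0.3, "author_influence": 0.3}

    # Min-max normalize each signal so the scales are comparable,
    # then take a weighted sum to get one score per book.
    lo, hi = books[signals].min(), books[signals].max()
    normed = (books[signals] - lo) / (hi - lo)
    books["score"] = sum(weights[s] * normed[s] for s in signals)
    print(books.sort_values("score", ascending=False)[["title", "score"]])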


Amazing... great initiative!


And it continues to grow!

Help your fellow man and dump your best growth hacking bookmarks today :)


I know very well how hard it is to search for industry-focused conferences or exhibitions. There are so many different sources of information. So many noisy, insignificant meet-ups and online webinars! That's why I've compiled a list of the best _offline_ world trade fairs and exhibitions to attend in 2017-2018. Places and dates. ~500 entries in 20 industries. Please check this out and don't forget to save it for later! Perfect for B2B custdev 🤔

Great article!

Dang, postdoc only?

I got an undergrad degree in CS and am finishing up a master's in stats here. Their graph didn't have stats - I guess stats is put under math. Unfortunately, the views from math and stats are quite different when you do "data science", and the vocabulary is different too. So I think the distinction between math (pure, applied) and stats (pure, applied) should be made, imo.

Shrug, oh well - there is more than one way to become a data scientist (especially a modeler).


Indeed. It's a cool idea, but it would be nice to also address the lack of critical thinking that probably led to the rise of fake news in the first place. Granted, that's less of a machine learning problem.

How did you make a business case for needing Big Data or Business Intelligence tools in general at your workplace?

Interesting! It seems to have borrowed the philosophy of Jeremy Howard's fast.ai. Good luck - I'll keep an eye on you.

That's obviously an entirely different question. Either way, it's happening.

That's exactly what the post is about. And you're right about the betting market - though my money is more on the entertainment industry. Think of the usual early adopters of emerging technology.

As long as there isn't an incentive to know what's true, nobody will agree on what constitutes "fake news." That's why political fake news poses such a problem: people post these stories as a form of tribal signaling, not as a source of information. It's not too hard to make a rough "ideology detector", but a true fake news detector is much harder.
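
A rough detector of that sort is just a text classifier. A minimal sketch - the tiny labelled set below is a stand-in for real training data:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = ["the state must regulate markets",
             "cut taxes and shrink government",
             "expand public healthcare now",
             "protect gun rights and free markets"]
    labels = [0, 1, 0, 1]  # two crude ideological buckets

    # Bag-of-words features plus a linear model: enough to pick up tribal
    # signaling vocabulary, but it says nothing about what's actually true.
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
    clf.fit(texts, labels)
    print(clf.predict(["government should fund healthcare"]))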

A real fake news detector (where "fake news" is defined as stories that are demonstrably false in their major point) is more likely to come from a betting market (see Robin Hanson). Look at areas where people are rewarded monetarily for being correct; that is where the action is. I would guess some of the best fake news detectors in the world right now are being developed at banks and hedge funds.

