Look at GA^2M (Lou, Caruana, Gehrke). It is an extension of GAM that allows you to create large, complex ensembles without sacrificing GAM's intelligibility.
My first thoughts as well! I'm going to be trying it out in R tonight, but it looks like there are some people working on an implementation in statsmodels and scikit-learn.
In a GAM, you estimate the non-linear functions for all variables in the model simultaneously. Moreover, GAMs allow for smoothing techniques such as regression splines, which lets you cast a GAM as one large penalized GLM. This has ties to Bayesian regression and mixed effects models.
In a GAM, you are not estimating a bunch of individual smoothers in isolation and then throwing them into a model.
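For intuition, here is a minimal numpy sketch of the "one big penalized GLM" view. Everything in it is a toy stand-in: a truncated power basis instead of a proper spline basis, and a plain ridge penalty instead of a derivative penalty. The point is just that both smoothers sit in one design matrix and are estimated in a single penalized fit, not one at a time.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1, x2 = rng.uniform(-2, 2, n), rng.uniform(-2, 2, n)
y = np.sin(x1) + 0.5 * x2**2 + rng.normal(0, 0.2, n)

def tpf_basis(x, knots):
    # toy truncated power basis of degree 3: x, x^2, x^3, (x - k)_+^3
    cols = [x, x**2, x**3] + [np.clip(x - k, 0, None)**3 for k in knots]
    return np.column_stack(cols)

knots = np.linspace(-1.5, 1.5, 8)
B1, B2 = tpf_basis(x1, knots), tpf_basis(x2, knots)

# one big design matrix: intercept plus both smoothers side by side
X = np.column_stack([np.ones(n), B1, B2])

# ridge-style penalty as a stand-in for the usual smoothness penalty
lam = 1.0
P = lam * np.eye(X.shape[1])
P[0, 0] = 0.0  # don't penalize the intercept

# both smooth functions estimated simultaneously in one penalized solve
beta = np.linalg.solve(X.T @ X + P, X.T @ y)
fitted = X @ beta
```

Real implementations (mgcv, statsmodels) pick the penalty and basis far more carefully, but the structure of the fit is the same.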
I also tried the default setting of gamma=1/(data dimension), as well as many values in between. I also played with the tuning function, but ran out of patience.
> From an accuracy standpoint, GAMs are competitive with popular learning techniques such as Random Forest or SVM.
Would be great to get a reference on this. In [1] the authors compared MARS to RF and SVM on several datasets and it didn't look so good.
Maybe they just got good performance on the one dataset mentioned at the end, or did not optimize the parameters of the competing classifiers. I think it is telling that the SVM performed worse than a linear classifier.
One should not report results for a method they do not understand how to use. The SVM parameters are nonsensical. This would not pass basic peer review.
I also tried polynomial kernels of degree 3 with costs around 0.1, as well as many different gammas for the radial kernel. No luck. As I said, the conversion to probabilities could be the culprit.
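For what it's worth, the kind of grid I searched looks roughly like this in scikit-learn (the data and exact grid values here are placeholders for illustration, not the ones from the post, which used R):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# synthetic stand-in data, just to show the mechanics of the search
X, y = make_classification(n_samples=400, n_features=10, random_state=1)

grid = {
    "kernel": ["rbf", "poly"],
    "degree": [3],                             # only used by the poly kernel
    "C": [0.1, 1, 10],
    "gamma": [1 / X.shape[1], 0.01, 0.1, 1],   # includes the 1/dim default
}
search = GridSearchCV(SVC(), grid, cv=5).fit(X, y)
print(search.best_params_)
```

A grid like this at least rules out the "default parameters only" criticism, though it obviously doesn't guarantee the best achievable SVM.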
However, the predictive performance of the SVM is irrelevant to the main points I am trying to make. In other words, even if the SVM beat the GAM in this single test, that would not invalidate the highlighted benefits of GAM. I would argue that GAM possesses qualities that SVMs do not, and vice versa.
Feel free to suggest different SVM settings, or a better way to convert classifications into a continuous measure, and I will change the content in the comparison table. The data and code can be downloaded here: https://github.com/klarsen1/gampost.
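As one concrete option for the continuous measure: Platt-style scaled probabilities, which the libsvm front ends expose directly (probability=True in scikit-learn, probability = TRUE in e1071). A sketch on synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# placeholder data, just to show the mechanics
X, y = make_classification(n_samples=400, n_features=10, random_state=2)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=2)

clf = SVC(kernel="rbf", gamma="scale", probability=True,
          random_state=2).fit(Xtr, ytr)
p = clf.predict_proba(Xte)[:, 1]   # continuous scores, e.g. for an ROC curve

# decision_function gives raw margins, skipping probability calibration
scores = clf.decision_function(Xte)
```

An alternative worth considering is using the raw decision values instead of calibrated probabilities, since the calibration step itself could be where the conversion goes wrong.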
Two things. First, a GAM is a non-linear model and is quite flexible. The degree of non-linearity is controlled by tuning the spline degrees of freedom, and non-linear interactions can be captured by introducing tensor-product splines.
Second, the no free lunch theorems make papers like the one above a lot less telling than you might think. All I really take from them is that RF is a good modeling framework to try, but for individual problems it may be worth trying boosting, an SVM, or an NN model.