1 point by addmeaning 1 day ago | link | parent | on: THE VEGAN AGENDA AND COWSPIRACY

Why can't one downvote or flag a post?
1 point by owlninja 3 days ago | link | parent | on: THE VEGAN AGENDA AND COWSPIRACY

wat...

Really very helpful article. Thanks for posting.

> For example, in a drug therapy application, you may build a model to predict patient survival as a function of doses of different drugs; you would then use the paper's approach to find the doses that maximize predicted survival.

You mean a random survival forest?

So in this case the doses are the controllable independent variables?

I don't get how this works.

> the second step involves solving an optimization problem to find the drug therapy that maximizes the predicted survival of the given patient group subject to a constraint on the predicted toxicity.

Aren't you undoing CART and making it more of a bagging problem, which introduces the greedy algorithm problem that CART tries to solve with bagging?

I have to read more into this when I have time, but thank you for your work in this field. My thesis is also on tree-based algorithms. Always glad to see more papers on trees.


What's the use case? I can understand why one might want a Redshift-to-Postgres flow, but not the other way round.

Nice Post! Thanks for sharing!

Hi Mark,

This post stinks. Suck on an exhaust pipe.

Best, Dean


This is a potential money saver. At the company I work for, probably around $300 to $400 a month.

Interesting article, I admire your dedication to data collection! A zero-inflated Poisson model might be more suitable for your data, since there are a lot of days where you didn't find any coins. It assumes the data-generating process has two stages. In the first, there is a Bernoulli trial with probability of success p; if it fails, zero events are observed. In the second, which applies only when the first-stage trial succeeded, the number of events follows a standard Poisson distribution. I think there's an R package to fit this kind of model.
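To make the two-stage description concrete, here is a minimal Python sketch of the zero-inflated Poisson probability mass function and a crude grid-search fit. It is only an illustration: the function names are hypothetical, the coin counts are made up (they are not the article's data), and a real analysis would use a proper optimizer or an existing package rather than a grid.

```python
import math

def zip_pmf(k, p, lam):
    """P(K = k) under a zero-inflated Poisson.

    p   : probability that the first-stage Bernoulli trial succeeds
    lam : rate of the second-stage Poisson distribution
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises from a first-stage failure or from a Poisson zero.
        return (1 - p) + p * poisson
    return p * poisson

def zip_log_likelihood(counts, p, lam):
    """Log-likelihood of independent counts under the model above."""
    return sum(math.log(zip_pmf(k, p, lam)) for k in counts)

def fit_zip_grid(counts, steps=50):
    """Crude maximum-likelihood fit by grid search (illustrative only)."""
    max_lam = max(max(counts), 1)
    best = None
    for i in range(1, steps + 1):
        p = i / steps                      # p ranges over (0, 1]
        for j in range(1, steps + 1):
            lam = j * max_lam / steps      # lam ranges over (0, max_lam]
            ll = zip_log_likelihood(counts, p, lam)
            if best is None or ll > best[0]:
                best = (ll, p, lam)
    return best[1], best[2]

# Made-up daily coin counts with many zero days, for illustration only.
example_counts = [0] * 20 + [1, 2, 3, 2, 1, 4, 2, 3, 1, 2]
p_hat, lam_hat = fit_zip_grid(example_counts)
```

Setting p = 1 recovers an ordinary Poisson model, so comparing the fitted log-likelihood against the p = 1 case gives a rough sense of whether the extra zeros matter.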

spam, mods please remove this garbage

Please remove this spam link.
1 point by anitadig01 9 days ago | link | parent | on: Outlook Password Recovery Tool

Nice tool..

Yes you are.. :)

Spam.

Hacker News maybe? Not exactly specific to data scientists, but I believe there's a significant number of data scientists lurking around. Same biases, but it seems you're aware of those.

https://news.ycombinator.com/


Yeah, among other biases. "Voluntary response data are useless." I've also tried posting it on /r/datascience, but it appears not to have shown up in the sub, maybe because my Reddit account is too new. Do you have suggestions for other DS communities where I could solicit responses?

I think it's more likely than not that you're aware of this, but there's a large selection bias in collecting respondents from DataTau. Those who did not get a job in data science following DS bootcamp are far less likely to be browsing DataTau than those who did.
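A back-of-the-envelope sketch of how large this effect could be, in Python. The function name and every number below are assumptions chosen for illustration, not survey results: it simply computes the employment rate you would see among respondents when employed and unemployed graduates respond at different rates.

```python
def observed_employment_rate(true_rate, respond_if_employed, respond_if_not):
    """Share of respondents who are employed, given unequal response rates.

    true_rate           : actual fraction of graduates who found a DS job
    respond_if_employed : probability an employed graduate answers the survey
    respond_if_not      : probability an unemployed graduate answers the survey
    """
    employed_respondents = true_rate * respond_if_employed
    unemployed_respondents = (1 - true_rate) * respond_if_not
    return employed_respondents / (employed_respondents + unemployed_respondents)

# Hypothetical inputs: 60% truly employed, employed graduates five times
# as likely to see and answer the survey as unemployed ones.
biased_rate = observed_employment_rate(0.6, 0.5, 0.1)
```

With these made-up inputs the survey would report roughly 88% employment against a true 60%. When both groups respond at the same rate, the function returns the true rate.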

Who cares? It's not data related, spam!

Hey, I created this survey. The burning question with DS bootcamps seems to be: what are the actual employment prospects after such a short program? So this is more of an "outcome" survey than one on salaries.

I'd like to especially encourage those of you who have not received offers, or who had negative bootcamp experiences, to complete the survey. Your input is valuable and can help others decide on whether to pursue this avenue.

I will of course share the data after collecting enough responses.


I was really hoping this was going to be a stat analysis of Kumble's coaching performance or of when the board fires coaches...

remove this from board please

this looks like spam
2 points by vvmisic 12 days ago | link | parent | on: Optimization of tree ensembles

Hi all -- author of the paper here. In this paper I consider the problem of how to take a tree ensemble model, like a random forest, and find how the independent variables should be set to maximize the predicted value given by the ensemble. For example, in a drug therapy application, you may build a model to predict patient survival as a function of doses of different drugs; you would then use the paper's approach to find the doses that maximize predicted survival.

The methodology is based on mixed-integer optimization and includes results on the strength of the formulation, how to approximate the formulation, and how to exploit the formulation structure to obtain solution methods for solving the problem at scale. The numerics include two case studies, one using data from the Merck Molecular Activity Challenge on Kaggle a few years ago, and one using a grocery store scanner data set.
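For readers who want a feel for the problem itself, here is a toy Python sketch. It is not the paper's mixed-integer formulation: it is plain brute-force enumeration, which works only for tiny ensembles. It relies on the fact that a tree ensemble's prediction is piecewise constant and changes only at split thresholds, so it is enough to test one point per cell of the grid those thresholds define. The tree encoding, the two trees, and the feature meanings (dose_a as feature 0, dose_b as feature 1) are all hypothetical.

```python
from itertools import product

# A tree is either a leaf (a float prediction) or a tuple
# (feature_index, threshold, left_subtree, right_subtree);
# a point goes left when x[feature_index] <= threshold.

def predict_tree(tree, x):
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree

def predict_ensemble(trees, x):
    """Average of the individual tree predictions."""
    return sum(predict_tree(t, x) for t in trees) / len(trees)

def collect_thresholds(tree, found):
    """Gather every split threshold, grouped by feature index."""
    if isinstance(tree, tuple):
        feature, threshold, left, right = tree
        found.setdefault(feature, set()).add(threshold)
        collect_thresholds(left, found)
        collect_thresholds(right, found)
    return found

def candidate_values(thresholds, lower, upper):
    """One representative per interval: each threshold t stands for the
    interval ending at t (x <= t goes left); upper covers the last one."""
    cuts = sorted(t for t in thresholds if lower <= t < upper)
    return cuts + [upper]

def maximize_ensemble(trees, bounds):
    """Exhaustively search the threshold grid; exponential in features."""
    found = {}
    for tree in trees:
        collect_thresholds(tree, found)
    grids = [candidate_values(found.get(i, set()), lo, hi)
             for i, (lo, hi) in enumerate(bounds)]
    best_x = max(product(*grids), key=lambda x: predict_ensemble(trees, x))
    return best_x, predict_ensemble(trees, best_x)

# Hypothetical two-tree ensemble over (dose_a, dose_b), each in [0, 5].
toy_trees = [
    (0, 2.0, 0.3, (1, 1.0, 0.8, 0.5)),
    (1, 3.0, (0, 4.0, 0.6, 0.2), 0.1),
]
toy_bounds = [(0.0, 5.0), (0.0, 5.0)]
best_x, best_value = maximize_ensemble(toy_trees, toy_bounds)
```

On this toy ensemble the search picks dose_a = 4.0 and dose_b = 1.0, with a predicted value of about 0.7. The number of grid cells grows exponentially with the number of features, which is presumably why a more structured approach is needed for realistic forests.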

If you have any comments or thoughts, I'd be very interested to hear them. Thank you!


I'm the author of this piece. It describes a very simple experiment in which I counted how much change I picked up for a month, and my analysis of that data. It might be especially interesting if you're interested in counting rare events or in cases where normal approximations aren't appropriate.

Here's why innovation need not be in the field of tech only and can be from the field of marketing or design.

Hi all,

Thanks for your comments! As Mariselvi suggested here, we added a clean, simple visual representation of Airbnb's business model workflow. Hope it makes sense. Thanks.


Good information!!

The blog covers the basic flow of Airbnb's business model. Nice!

Enjoyed reading! It has more interesting news about Node.js V8.

Absolutely good one. All the features are described in detail.