Suggestions for Improving DataTau
7 points by adkgupta 3496 days ago | 10 comments
DataTau is focused mostly on academic learning and provides great material for learning and development. I would like it to include a separate section dedicated specifically to use-case articles.


8 points by izyda 3496 days ago | link

I think there are high quality articles that usually consist of academic articles about computer science/statistics, some announcements of useful open source projects, and interviews/articles by experts renowned in the field.

The frustrating part about DataTau for me is that there are also a huge number of articles that are essentially blog posts, often submitted by the author for self promotion and often consisting of things like "How to be a data scientist" or "Tips every data scientist should know" etc. Some case studies are useful, but many here are to the tune of "I run summary statistics and maybe a basic linear regression on this particular dataset and show pretty graphs".

To me, these articles are, at best, homages to the hype around "big data" and, at worst, are tabloid sounding garbage. They provide little value to anyone but the absolute beginner - and even then, a beginner would likely be better served picking up a book by a well known author in statistics/computer science.

That's not to discredit the value of DataTau - there is some high quality content. Perhaps a solution to the problem could be a tagging feature that lets us more easily find the content we care about.

-----

5 points by barnoux 3496 days ago | link

A search feature!

-----

3 points by fhadley 3495 days ago | link

So I've been lurking around here for a while - mostly checking in when r/MachineLearning is a bit dead or is having its biweekly computer vision party - but haven't felt the urge to comment on a topic until seeing this. This probably reveals a note of hypocrisy in my following comments, but I thought I'd be transparent.

felipeclopes was entirely right in noting that the real difference between DataTau and HN lies in the strength of the community, as well as in pointing out the positive effects on content quality that are a product of HN's community. izyda rightly took that observation a step further, pointing out the glut of (excuse my brashness) garbage articles. At the risk of beating a dead horse, I find it absolutely ridiculous that the front page currently has three articles from the same website, submitted by the same user, that are either tabloid fodder, junk science, or the kind of big data hype one would expect from a more mainstream news outlet. There's even one on XLMINER (for those who've fought the good fight, incorporating Excel documents into a pipeline by stitching together pandas+xlrd+lxml along with healthy amounts of bubble gum and chicken wire, you understand my sentiments). XLMINER.

Ranting aside, there are some excellent academic articles detailing new modeling methods. I'm equally appreciative of the less formal content focused on data science in practice (deeplearning4j and knitr being examples that come immediately to mind) rather than the bleeding edge of machine learning research.

Thus the question to me isn't "What should be improved?" - it's quite clear, to me at least, that ending the vicious cycle of low quality content and lack of participation is the answer here - but instead, "How do we improve it?" Obviously, that's a bit more complicated. I'm sure it'll prove to be a rather controversial suggestion, and it's certainly not a panacea, but I think that DataTau would benefit from using a penalty system (here's a good reddit discussion: http://www.reddit.com/r/programming/comments/1qwnuh/how_hack...) similar to the one employed by HN. Instead of relying on downvotes and time decay alone, junk articles (how to be a data scientist, big data hype, anything from datasciencecentral) could be automatically penalized at the time of submission, which will hopefully lead to their eventual disappearance from the site.
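
To make the idea concrete, here is a rough Python sketch of a submission-time penalty layered on top of a time-decayed score. Everything in it is an assumption for illustration: the base formula is the commonly cited approximation of HN's ranking (not anything confirmed about HN's or DataTau's real code), and the `Story` class, the penalty tables, and the multiplier values are all made up.

```python
from dataclasses import dataclass

# Assumed decay exponent, taken from the commonly cited approximation
# of HN's ranking; not a confirmed value for HN or DataTau.
GRAVITY = 1.8

# Hypothetical penalty tables: multipliers below 1.0 push a story down.
PENALIZED_DOMAINS = {"datasciencecentral": 0.3}
PENALIZED_PHRASES = {
    "how to be a data scientist": 0.5,
    "big data": 0.7,
}


@dataclass
class Story:
    title: str
    domain: str
    points: int
    age_hours: float


def penalty(story: Story) -> float:
    """Multiplier in (0, 1], computed once from the domain and title."""
    factor = 1.0
    domain = story.domain.lower()
    for name, multiplier in PENALIZED_DOMAINS.items():
        if name in domain:
            factor *= multiplier
    title = story.title.lower()
    for phrase, multiplier in PENALIZED_PHRASES.items():
        if phrase in title:
            factor *= multiplier
    return factor


def rank_score(story: Story) -> float:
    """Time-decayed score scaled by the submission-time penalty."""
    base = max(story.points - 1, 0) / (story.age_hours + 2) ** GRAVITY
    return base * penalty(story)


if __name__ == "__main__":
    paper = Story("A new modeling method", "arxiv.org", 10, 3.0)
    hype = Story("How to be a data scientist", "datasciencecentral.com", 10, 3.0)
    for story in sorted([hype, paper], key=rank_score, reverse=True):
        print(f"{rank_score(story):.4f}  {story.title}")
```

With equal points and age, the penalized story always ranks below the unpenalized one, so it needs far more upvotes to reach the same position.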

Just my .02, and sorry for the rant above.

-----

2 points by joe 3493 days ago | link

Well said. Community obviously is a big part of it. Implementing a penalty rating system may help.

But the real issue is really the combination of a weak community and weak expectations.

On HN, the weight of the community likely outweighs any negative (not intellectually stimulating) submissions. Interesting articles are upvoted. Link-bait is downvoted (or removed). With fewer people here at DT, we can only give so much weight to the better articles.

The bigger issue though is the weak expectations. HN has outlined guidelines about what to submit (though they are fairly general). I haven't been at HN since the beginning, but I feel it's fairly easy to get a sense of what is a good submission and what is not after following the site for a week. That is not true at all here. A new user (or even me, who has been following this site for a while) has NO IDEA what a good submission is here. Thus, they may consider anything even tangentially relevant to data science worth submitting. Then, if it is bad, we lack the community power described above to filter it out.

How to solve these issues? I'm not quite sure, but here are a couple of wacky ideas:

1. Only 10-15 articles on the front page. This will give weight to the "better" articles.

2. Approval process for new user submissions; needs to be approved by 5-10 older users.
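
A rough Python sketch of how the two ideas above might fit together. The names (`Submission`, `front_page`), the cap of 15, and the threshold of 5 approvals are all hypothetical picks from the ranges mentioned, not anything DataTau actually implements.

```python
from dataclasses import dataclass, field

FRONT_PAGE_SIZE = 15   # assumed cap, from the "10-15 articles" idea
APPROVALS_NEEDED = 5   # assumed threshold, from the "5-10 older users" idea


@dataclass
class Submission:
    title: str
    score: float
    by_new_user: bool
    # Usernames of established users who approved this submission.
    approvals: set = field(default_factory=set)


def is_visible(sub: Submission) -> bool:
    """Established users post directly; new users need enough approvals."""
    return (not sub.by_new_user) or len(sub.approvals) >= APPROVALS_NEEDED


def front_page(subs: list) -> list:
    """Top-scoring visible submissions, capped at FRONT_PAGE_SIZE."""
    visible = [s for s in subs if is_visible(s)]
    return sorted(visible, key=lambda s: s.score, reverse=True)[:FRONT_PAGE_SIZE]
```

A new user's submission stays off the front page, whatever its score, until enough established users have added themselves to `approvals`.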

There is no glue in the community right now. It is too weak and disconnected, which results in a lack of signaling to new users. We need to change that.

-----

3 points by felipeclopes 3495 days ago | link

I think the community should participate more in the posts. The real difference between DataTau and HN is the community, which helps by self-selecting the best articles.

-----

2 points by wejnem 3493 days ago | link

Honestly if people just commented more it would probably improve.

-----

2 points by toast 3496 days ago | link

I'd like it a lot if the Twitter link went directly to the article rather than to this page (like HN).

-----

1 point by roycoding 3495 days ago | link

https

-----

1 point by alex 3495 days ago | link

+1 to more discussions. It would be interesting to have a weekly mini journal club where we go over important papers together.

-----

1 point by Gobinath 3495 days ago | link

Should include tagging/topics which can be helpful for finding relevant articles.

-----



