2 points by jonan 3448 days ago

You're making the hierarchical varieties of clustering look worse than they are by constraining them to k=2.

Ideally, you'd look at the dendrograms and decide which level to cut at. So for the spiral at the bottom you might decide that k=1 is appropriate, and for the 3 blobs you might decide that k=2 and k=3 are both acceptable.
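
To make that concrete, here's a rough sketch of building the full hierarchy once and cutting it at different levels afterwards. It assumes SciPy (not necessarily what the original post used), and the three-blob data is synthetic, made up just for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic stand-in for the "3 blobs" panel: three well-separated Gaussian blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Build the whole merge tree once (Ward's criterion, Euclidean distance).
Z = linkage(X, method="ward")

# Column 2 of Z holds the merge heights. A large jump between consecutive
# heights near the top of the tree suggests a natural level to cut at.
print("heights of the last four merges:", Z[-4:, 2])

# Cut the same tree at two different levels instead of fixing k up front.
labels_k3 = fcluster(Z, t=3, criterion="maxclust")
labels_k2 = fcluster(Z, t=2, criterion="maxclust")
print("cluster sizes at k=3:", np.bincount(labels_k3)[1:])
print("cluster sizes at k=2:", np.bincount(labels_k2)[1:])
```

If you have matplotlib installed, scipy.cluster.hierarchy.dendrogram(Z) draws the tree so you can pick the cut by eye.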

Also, why is average linkage paired with 'affinity="cityblock"'? When using Ward's criterion you're using Euclidean distance, right?
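
For a like-for-like comparison you could hold the metric fixed and vary only the linkage. A small sketch, again assuming SciPy's linkage function rather than whatever API the post used, on made-up random data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Small made-up dataset, only here to exercise the calls.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))

# Average linkage works with any metric, so the metric is a separate choice.
Z_avg_l1 = linkage(X, method="average", metric="cityblock")
Z_avg_l2 = linkage(X, method="average", metric="euclidean")

# Ward's criterion is defined in terms of Euclidean distance.
Z_ward = linkage(X, method="ward", metric="euclidean")

# SciPy refuses to combine Ward with a non-Euclidean metric on raw observations.
try:
    linkage(X, method="ward", metric="cityblock")
    ward_accepts_cityblock = True
except ValueError:
    ward_accepts_cityblock = False
print("ward + cityblock accepted:", ward_accepts_cityblock)
```

Comparing Z_avg_l2 against Z_ward isolates the effect of the linkage criterion, while Z_avg_l1 against Z_avg_l2 isolates the effect of the metric.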



