DataTau
Effective Management of High Volume Numeric Data with Histograms (circonus.com)
16 points by Crusso3 96 days ago | 4 comments


1 point by ChevyBar55 91 days ago | link

I hope more helpful comments will come soon.

-----

1 point by SiempreViernes 92 days ago | link

Huh, I guess just taking the log of your data and then doing a normal linear histogram is not cool enough?

If you want true battle tested histogramming, go with root.

-----

1 point by hhartmann 78 days ago | link

That's a good point. You will end up with histograms that have very similar mathematical properties.

One reason to go with a custom implementation is user expectation. You want your bins to start and end at human-readable locations, so that the data can be interpreted more easily. Inserting log(x) into a regular histogram is not going to give you that:

    log_10(x) = 9.0 => x = 10 ** 9.0 = 1.00 e9
    log_10(x) = 9.1 => x = 10 ** 9.1 = 1.26 e9
    log_10(x) = 9.2 => x = 10 ** 9.2 = 1.58 e9

With log-linear histograms you get buckets at:

    1.0 e9
    1.1 e9
    1.2 e9

Disclaimer: I am working for Circonus.
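
To make the comparison above concrete, here is a minimal Python sketch that computes the first few bin boundaries under both schemes. The function names, the bin width of 0.1 in log space, and the choice of two significant decimal digits for the log-linear buckets are assumptions made for illustration; this is not taken from any particular library.

```python
def log_bin_edges(start_exp, width, count):
    """Edges of a linear histogram over log10(x), mapped back to x."""
    return [10 ** (start_exp + i * width) for i in range(count)]


def log_linear_edges(exp, count):
    """Edges at two significant decimal digits inside one power of ten.

    Assumes an integer exp >= 1 so that every edge is an exact integer:
    1.0 * 10**exp, 1.1 * 10**exp, 1.2 * 10**exp, ...
    """
    return [(10 + i) * 10 ** (exp - 1) for i in range(count)]


if __name__ == "__main__":
    for edge in log_bin_edges(9.0, 0.1, 3):
        print(f"log bin edge:        {edge / 1e9:.2f} e9")
    for edge in log_linear_edges(9, 3):
        print(f"log-linear bin edge: {edge / 1e9:.2f} e9")
```

The first loop prints boundaries that are evenly spaced in log space but land on awkward values, while the second prints boundaries at round decimal values.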

-----

1 point by camel_gopher 91 days ago | link

I'm not familiar with root; is this what you are referring to? https://root.cern.ch/doc/master/classTH1.html

Taking the log of the data would add an extra computational step prior to recording the value, I would think. I will need to think about that one a bit more.
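
As a rough way to reason about the per-value work, the sketch below assigns a single value to a bin under each scheme. Everything here is hypothetical: the function names are made up, the log-linear version finds the exponent with `math.log10` purely for simplicity (a real implementation may well avoid the logarithm), and floating-point edge cases near exact powers of ten are ignored.

```python
import math


def log_bin_index(x, width=0.1):
    """Index of the bin holding x in a linear histogram over log10(x)."""
    return math.floor(math.log10(x) / width)


def log_linear_bucket(x):
    """Return (mantissa, exponent) so the bucket starts at mantissa * 10**exponent.

    The mantissa is the two leading decimal digits of x (10..99).
    Assumes x >= 10 so the divisor is an exact integer power of ten.
    """
    if x < 10:
        raise ValueError("this sketch only handles x >= 10")
    exp = math.floor(math.log10(x))
    mantissa = int(x // 10 ** (exp - 1))
    return mantissa, exp - 1


if __name__ == "__main__":
    x = 1.15e9
    print(log_bin_index(x))      # bin index in log space
    print(log_linear_bucket(x))  # bucket lower bound as (mantissa, exponent)
```

In this sketch both functions do a comparable amount of arithmetic per recorded value, so it does not settle the cost question either way; it only shows what each mapping has to compute.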

-----



