Druid Segment Diskspace Calculator

Recently I have been working with Druid and trying to work out how much disk space the historical nodes need. We deploy to remote customer locations, so machine requests have to be raised well in advance.


This took me into the world of bitmaps: Concise and Roaring.

Druid uses Concise bitmaps by default and has the option of Roaring too.


So after reading a bit, I decided to build a calculator for the segment disk space needed on the Druid historical nodes, assuming Concise is used.
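To give a feel for why the bitmap choice matters for sizing, here is a rough back-of-the-envelope sketch. It is not the logic of the calculator (that lives in the repository's code comments). It only assumes two things: an uncompressed bitmap index holds one bitmap per distinct dimension value with one bit per row, and total historical disk is segment size times segment count times replicas. The function names and the example numbers are hypothetical, and Concise or Roaring compression would normally land well below the ceiling computed here.

```python
def uncompressed_bitmap_bytes(num_rows: int, cardinality: int) -> int:
    """Ceiling for one dimension's bitmap index with no compression:
    one bitmap per distinct value, one bit per row."""
    bytes_per_bitmap = (num_rows + 7) // 8  # round up to whole bytes
    return cardinality * bytes_per_bitmap


def historical_disk_bytes(segment_bytes: int, num_segments: int, replicas: int) -> int:
    """Total disk across all historical nodes when every segment
    is stored `replicas` times."""
    return segment_bytes * num_segments * replicas


if __name__ == "__main__":
    # Hypothetical inputs, purely for illustration.
    rows_per_segment = 5_000_000
    ceiling = uncompressed_bitmap_bytes(rows_per_segment, cardinality=100)
    print(f"uncompressed bitmap ceiling: {ceiling} bytes")

    total = historical_disk_bytes(500 * 1024**2, num_segments=720, replicas=2)
    print(f"historical disk needed: {total / 1024**3:.1f} GiB")
```

The gap between this ceiling and the real on-disk size is exactly what the compressed bitmap formats buy you, which is why the sizing has to assume a specific one.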

You can find the calculator at github.com/jaihind213/druid-calculator

PS: Initially I wanted to write a blog post explaining the logic behind my calculator, but was too lazy, so I wrote the code and included the rationale in the code comments. 🙂

Thank you, and I hope you find it useful. Feedback is welcome.

My Attempt to demystify Data Stores.

Recently, I gave a presentation at the Singapore PyData Meetup on “Demystifying Datastores”.


The main motivation of the presentation is to help you understand the various facets of a datastore and how these facets can help you decide which one to use.

Here is the link.