Cosmos partitioning

MSDN says that an Azure Cosmos DB partition key (PK) should have high cardinality. The reason is that, under the hood, partitioning works like a dictionary: Cosmos DB calculates a hash for each PK value and builds a logical distribution from it. This is similar to the .NET Dictionary, which keeps an array of buckets and distributes items by their hash, using the hash as an index into the array.

items[hash(key) % items.length]

After the hash-index match, it walks through the items in that bucket one by one until it finds the matching one. In Cosmos DB, this collection of items sharing one PK value is called a logical partition. Logical partitions are placed inside larger physical partitions, and each logical partition can store up to 20 GB of data. So both Cosmos DB and the .NET Dictionary can go straight to the right bucket when looking up a document by its PK.
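The dictionary analogy can be sketched in a few lines of Python. This is only a toy model, not how Cosmos DB is implemented: the class name ToyPartitionedStore, the number of physical partitions, and the use of MD5 as the hash are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


class ToyPartitionedStore:
    """Toy model: documents with the same PK value form a logical partition,
    and each logical partition is assigned to a physical partition by hash."""

    def __init__(self, physical_partitions=4):
        # Each physical partition maps a PK value to its logical partition (a list).
        self.physical = [defaultdict(list) for _ in range(physical_partitions)]

    def _physical_index(self, pk):
        # MD5 is a stand-in here; the real hash function is internal to Cosmos DB.
        digest = hashlib.md5(str(pk).encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.physical)

    def add(self, pk, doc):
        self.physical[self._physical_index(pk)][pk].append(doc)

    def find(self, pk, doc_id):
        # Step 1: the hash leads straight to the right physical partition.
        # Step 2: the PK value selects the logical partition inside it.
        logical = self.physical[self._physical_index(pk)].get(pk, [])
        # Step 3: walk through the logical partition until the id matches.
        for doc in logical:
            if doc["id"] == doc_id:
                return doc
        return None


store = ToyPartitionedStore()
store.add("NY", {"id": "1", "city": "NY"})
store.add("NY", {"id": "2", "city": "NY"})
store.add("LA", {"id": "3", "city": "LA"})
print(store.find("NY", "2"))
```

With a low-cardinality PK, most documents would land in a handful of logical partitions and step 3 would do nearly all the work, which is the intuition behind the high-cardinality advice.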

This also explains why Cosmos DB allows JS stored procedures to run only within a single logical partition: it is much more efficient to run them over a collection of items that has already been located. Cosmos DB doesn't need to find another logical partition by PK index and spend threads bringing it into the procedure. It's like a miner in a mine 😊.

On the other hand, as you can see, if we want to select a single document or a batch of documents, it is much more efficient to use the PK as the index. E.g. if I want to get all documents, or all documents matching a condition, I would prefer to run a batch of threads in parallel and gather the results from every partition. However, if you mostly use conditions like

select * from c where c.city='NY' and ...

then yes, it is more efficient to use the filtered property (city in this example) as the PK, because it lets Cosmos DB find the logical partition immediately and then walk through it checking the remaining conditions. But if we mostly search by different conditions, once by create_date, once by created_by and so on, it makes sense to keep id as the PK.
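To make the trade-off concrete, here is a rough Python sketch that counts how many logical partitions a query has to visit, depending on which property is the PK. It is a toy model under stated assumptions: the function partitions_touched and the sample documents are hypothetical, and real Cosmos DB queries also use indexes, which this sketch ignores.

```python
from collections import defaultdict


def partitions_touched(docs, pk_property, filter_property, filter_value):
    """Return (number of logical partitions visited, matching documents)."""
    # Group documents into logical partitions by the chosen PK property.
    partitions = defaultdict(list)
    for doc in docs:
        partitions[doc[pk_property]].append(doc)

    if filter_property == pk_property:
        # The filter is on the PK: only one logical partition has to be read.
        candidates = {filter_value: partitions.get(filter_value, [])}
    else:
        # The filter is on another property: every partition has to be checked.
        candidates = partitions

    matches = [
        doc
        for part in candidates.values()
        for doc in part
        if doc[filter_property] == filter_value
    ]
    return len(candidates), matches


docs = [
    {"id": "1", "city": "NY", "created_by": "ann"},
    {"id": "2", "city": "NY", "created_by": "bob"},
    {"id": "3", "city": "LA", "created_by": "ann"},
    {"id": "4", "city": "SF", "created_by": "bob"},
]

print(partitions_touched(docs, "city", "city", "NY")[0])         # PK = city, filter by city
print(partitions_touched(docs, "id", "city", "NY")[0])           # PK = id, filter by city
print(partitions_touched(docs, "city", "created_by", "ann")[0])  # PK = city, filter by created_by
```

Filtering by the PK visits a single partition, while any other filter fans out over all of them, whichever property was chosen as the PK.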

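This is true for all small collections. A physical partition stores up to 50 GB, so a collection below that size with modest provisioned throughput will normally sit in a single physical partition. With a huge amount of data, it is necessary to choose the PK so that the data spreads evenly across physical partitions and no single logical partition comes close to the 20 GB limit.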

To sum up: if your collection is smaller than 10 GB, it doesn't really matter which property you use as the PK. But think about its future size and prepare your design for scaling.

P.S. Use the id as the PK by default.

