Kuan

December 30, 2022

CouchDB: My open_revs confusion

New day, new confusion. Today's solved puzzle is the `open_revs` query parameter used in `GET /{db}/{doc}`.

The official and correct definition is as follows: `open_revs` takes an array of leaf revision strings and returns all of their corresponding documents in the response. It is correct, but again, it expects the reader to be a CouchDB expert and a seasoned replicator developer.
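To make the request shape concrete, here is a minimal sketch of how I would build such a URL. The helper name `build_open_revs_url`, the host `http://localhost:5984`, the database name, the document id and the short revision strings are all made up for illustration. The only parts taken from CouchDB itself are the `open_revs` parameter (a JSON array in the query string) and the optional `latest` flag.

```python
import json
from urllib.parse import quote, urlencode


def build_open_revs_url(base_url, db, doc_id, revs, latest=False):
    """Build a GET /{db}/{doc} URL with open_revs (hypothetical helper)."""
    # open_revs is sent as a JSON array, so serialize it before URL-encoding.
    params = {"open_revs": json.dumps(revs, separators=(",", ":"))}
    if latest:
        params["latest"] = "true"
    # Encode the path segments too, in case a name contains "/" or spaces.
    path = f"{quote(db, safe='')}/{quote(doc_id, safe='')}"
    return f"{base_url}/{path}?{urlencode(params)}"


# Illustrative values only; real revision strings are much longer.
url = build_open_revs_url(
    "http://localhost:5984", "mydb", "doc1", ["1-abc", "2-def"], latest=True
)
print(url)
```

The JSON array ends up percent-encoded in the query string, which is easy to get wrong when gluing the URL together by hand.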

In case you missed my last post, I am a beginner CouchDB replicator developer. Meaning, a newbie at writing a replicator client. This breed is rare, but we exist!

So, I learned today that `open_revs` expects me to provide the exact revision string for each leaf revision. Leaf revisions are the latest entry of each revision tree of the document. For example, when there are conflicts, say three different trees, the leaf revisions are the latest rev of each of these conflicting trees. Very simple concept, really. I had just been ignoring the term "leaf" for too long, and skipped the "terms and definitions" part of the long Replication Protocol documentation.
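The "leaf" idea is easier to see in a toy model. In the sketch below, a revision tree is just a dictionary mapping each revision to its parent. That is my own simplification, not how CouchDB stores revisions, and the short revision strings are invented. A leaf is any revision that is not the parent of another one.

```python
def leaf_revisions(parent_of):
    """Return the revisions that have no children, sorted for stable output."""
    parents = {parent for parent in parent_of.values() if parent is not None}
    return sorted(rev for rev in parent_of if rev not in parents)


# One document edited in a straight line: a single branch, a single leaf.
linear = {"1-a": None, "2-b": "1-a", "3-c": "2-b", "4-d": "3-c"}

# One document with three conflicting edits on top of the same first revision.
conflicted = {"1-a": None, "2-b": "1-a", "2-c": "1-a", "2-d": "1-a"}

print(leaf_revisions(linear))      # ['4-d']
print(leaf_revisions(conflicted))  # ['2-b', '2-c', '2-d']
```

So a document without conflicts has exactly one leaf, no matter how many times it was edited.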

I wrote a simple test that creates a document with four successive edits, resulting in a single tree with four revisions. Then I passed these revisions into `open_revs` and got back a response with four different documents. The test passed, and I thought that was as expected. My next test was the same one, with `latest=true` added to the query parameters. Then my reality was broken by the "weird" response.

Adding `latest=true` made the response return only the latest document, which is the leaf revision of the document. Where are all the others? I got them without this extra query parameter.

Well, the official documentation for `latest=true` is as follows: always return the leaf revision, regardless of the one provided in `open_revs`. It is an option to avoid a race condition when replicating. I ignored the latter part about the "race condition". I really should not have.

Here's my mistake: `open_revs` asks for leaf revisions, but I gave it revisions that all belong to the same tree. It returned all of them, and I thought that was correct. Passing in `latest=true` and getting back just the latest revision of the tree is correct too. I just got confused by the first behavior, where it returns all the revisions of the same tree without `latest=true`. This is probably a bug, but no one from CouchDB noticed it because they never expected a noob to pass in all the revisions from the same tree.
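To convince myself the two responses are consistent, I find it helpful to mimic them with a toy model. This is only a sketch of the behavior I observed, not CouchDB's implementation: the function names are hypothetical, the tree is a plain child-to-parent dictionary, and I am assuming that `latest=true` swaps each requested revision for the leaf (or leaves) descended from it.

```python
def leaf_descendants(parent_of, rev):
    """Return the leaves reachable by walking down from rev."""
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)
    stack, leaves = [rev], []
    while stack:
        current = stack.pop()
        kids = children.get(current, [])
        if kids:
            stack.extend(kids)  # keep walking down this branch
        else:
            leaves.append(current)  # nothing below it, so it is a leaf
    return sorted(leaves)


def open_revs_model(parent_of, requested, latest=False):
    """Toy model of which revisions come back for an open_revs request."""
    known = [rev for rev in requested if rev in parent_of]
    if not latest:
        # Without latest, every requested revision that exists is returned.
        return known
    result = []
    for rev in known:
        for leaf in leaf_descendants(parent_of, rev):
            if leaf not in result:  # many revisions can share one leaf
                result.append(leaf)
    return result


# My test document: four revisions in a single tree.
linear = {"1-a": None, "2-b": "1-a", "3-c": "2-b", "4-d": "3-c"}
all_revs = ["1-a", "2-b", "3-c", "4-d"]

print(open_revs_model(linear, all_revs))               # ['1-a', '2-b', '3-c', '4-d']
print(open_revs_model(linear, all_revs, latest=True))  # ['4-d']
```

In this model, all four revisions collapse into the same leaf once `latest=true` is set, which matches the single document I got back.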

My bigger mistake? I am still writing the wrong tests, and testing a method that should not be tested standalone. I first need a convincing and convenient way to create real conflicts in my test CouchDB.

About Kuan

Web developer building with Flutter, Svelte and JavaScript. Recently fell in love with functional programming.

Malaysian. Proud Sabahan. Ex game developer but still like playing games.

My newfound hobby is outdoor camping with my love.