Thursday, March 28, 2013

Using ZooKeeper

We use ZooKeeper to store our configuration. When the configuration is updated, we are notified and can update or restart the affected module. To do that, we rely on ZooKeeper's watch feature. Yan's blog states very clearly what to look out for when using ZooKeeper watches:
"

  1. Watches are one time triggers; if you get a watch event and you want to get notified of future changes, you must set another watch.
  2. Because watches are one time triggers and there is latency between getting the event and sending a new request to get a watch you cannot reliably see every change that happens to a node in ZooKeeper. Be prepared to handle the case where the znode changes multiple times between getting the event and setting the watch again. (You may not care, but at least realize it may happen.)
  3. A watch object, or function/context pair, will only be triggered once for a given notification. For example, if the same watch object is registered for an exists and a getData call for the same file and that file is then deleted, the watch object would only be invoked once with the deletion notification for the file.
"
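
To make points 1 and 2 concrete, here is a minimal in-memory sketch. It does not use the ZooKeeper client; OneShotNode and OneShotWatchDemo are hypothetical stand-ins that only mimic the one-time-trigger behavior described above, and the "latency" is simulated by handling the event after two updates have already happened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// In-memory stand-in for a single znode; NOT the ZooKeeper client API.
class OneShotNode {
    private String data;
    private Consumer<String> watch; // at most one pending watch, cleared when it fires

    OneShotNode(String initial) {
        this.data = initial;
    }

    // Read the current value and arm a one-time watch in the same call.
    String getData(Consumer<String> watcher) {
        this.watch = watcher;
        return data;
    }

    // Change the value; fire the pending watch once, then forget it.
    void setData(String value) {
        this.data = value;
        Consumer<String> w = this.watch;
        this.watch = null;
        if (w != null) {
            w.accept("changed");
        }
    }
}

class OneShotWatchDemo {
    // Returns the values the client actually observed.
    static List<String> run() {
        OneShotNode node = new OneShotNode("v1");
        List<String> seen = new ArrayList<>();
        List<String> pendingEvents = new ArrayList<>();

        seen.add(node.getData(pendingEvents::add)); // reads v1, watch armed
        node.setData("v2"); // watch fires and is consumed
        node.setData("v3"); // no watch armed, so no event for this change

        // The client gets around to the event only now: re-read and re-arm.
        if (!pendingEvents.isEmpty()) {
            pendingEvents.clear();
            seen.add(node.getData(pendingEvents::add)); // reads v3
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The client in this sketch observes v1 and v3 but never v2, which is exactly the "you cannot reliably see every change" case from point 2.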

The problem he is trying to solve is this: when a node is deleted and then recreated right away, we lose the event for the creation of the new node. The deletion triggers both the parent-level and the child-level watches, and if the node is recreated before we can register a new watch, the creation event is lost for good.

He also posted a solution for this. It's pretty straightforward (it's the troubleshooting that's hard): the parent-level watcher deals only with 'add' events, and each child watcher keeps watching the "deleted" node in case it is recreated.

If you're not familiar with Scala, here's the Java version:
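
As an illustration of the child-watcher half of that idea, here is a minimal in-memory sketch. It is not Yan's code and does not use the ZooKeeper client API: FakeStore, ChildWatcher and RewatchDemo are hypothetical names, and the store only models two things, one-shot exists-watches and delayed event delivery. The assumption it relies on is that checking for existence and arming the watch happen in one step, so the return value reveals a node that came back while no watch was set.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// In-memory model of one-shot exists-watches with delayed delivery; NOT ZooKeeper.
class FakeStore {
    private final Set<String> nodes = new HashSet<>();
    private final Map<String, Consumer<String>> watches = new HashMap<>();
    private final Deque<Runnable> undelivered = new ArrayDeque<>();

    // Report whether the path exists and arm a one-time watch in the same step.
    boolean exists(String path, Consumer<String> watcher) {
        watches.put(path, watcher);
        return nodes.contains(path);
    }

    void create(String path) {
        if (nodes.add(path)) fire(path, "created");
    }

    void delete(String path) {
        if (nodes.remove(path)) fire(path, "deleted");
    }

    // Consume the watch and queue the event, to model notification latency.
    private void fire(String path, String type) {
        Consumer<String> w = watches.remove(path);
        if (w != null) undelivered.add(() -> w.accept(type));
    }

    // Deliver everything queued so far, as the client's event thread eventually would.
    void deliver() {
        while (!undelivered.isEmpty()) undelivered.poll().run();
    }
}

class ChildWatcher implements Consumer<String> {
    private final FakeStore store;
    private final String path;
    final List<String> log = new ArrayList<>();

    ChildWatcher(FakeStore store, String path) {
        this.store = store;
        this.path = path;
    }

    void start() {
        store.exists(path, this);
    }

    @Override
    public void accept(String eventType) {
        // Keep watching the same path even after a delete; the return value
        // tells us whether the node was recreated while no watch was armed.
        boolean present = store.exists(path, this);
        log.add(eventType + "->" + (present ? "present" : "absent"));
    }
}

class RewatchDemo {
    static List<String> run() {
        FakeStore store = new FakeStore();
        store.create("/config/a");
        ChildWatcher watcher = new ChildWatcher(store, "/config/a");
        watcher.start();

        store.delete("/config/a"); // event queued, watch consumed
        store.create("/config/a"); // recreated before the client reacts: no event
        store.deliver();           // handler finds the node present again
        store.delete("/config/a");
        store.deliver();           // handler finds it absent and keeps watching
        store.create("/config/a");
        store.deliver();           // this time the creation event arrives
        return watcher.log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The first log entry is the interesting one: the only event delivered is the deletion, yet re-arming the watch on the deleted path shows the node is already back, so the recreation is not lost.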


Another thing to watch for when using ZooKeeper is that the ZooKeeper object returned by new ZooKeeper() is not immediately usable. You will get a ConnectionLossException if you try to use it before the connection is established. To avoid that, you have to watch for the connection events and wait for the state KeeperState.SyncConnected. In the process method that corresponds to event.getState().ordinal() == 3, although comparing against the enum constant directly (event.getState() == KeeperState.SyncConnected) is less fragile than relying on the ordinal.
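
One common way to do the waiting is a latch that the watcher releases once the connected state shows up. The sketch below shows only that pattern with the standard library; ConnectionGate and ConnectionGateDemo are hypothetical names, the state is passed as a plain string so the example runs without the ZooKeeper jar, and the thread in the demo stands in for the client's event thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: blocks callers until a "connected" state has been reported.
class ConnectionGate {
    private final CountDownLatch connected = new CountDownLatch(1);

    // Call this from the connection watcher. With the real client the check
    // would be event.getState() == KeeperState.SyncConnected instead of a string.
    void onState(String stateName) {
        if ("SyncConnected".equals(stateName)) {
            connected.countDown();
        }
    }

    // Returns true once connected, false on timeout or interruption.
    boolean awaitConnected(long timeout, TimeUnit unit) {
        try {
            return connected.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}

class ConnectionGateDemo {
    static boolean run() {
        ConnectionGate gate = new ConnectionGate();

        // Stand-in for the client's event thread delivering state changes.
        Thread eventThread = new Thread(() -> {
            gate.onState("Disconnected");
            gate.onState("SyncConnected");
        });
        eventThread.start();

        // Only use the client after this returns true.
        return gate.awaitConnected(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        System.out.println("connected: " + run());
    }
}
```

Using a timeout rather than an unbounded wait means a server that never becomes reachable shows up as a false return value instead of a hung startup.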