Distributed store transactional and data safety guarantees
The commit() action on a Row object from a distributed store by default behaves similarly to an in-memory store's Row object, in that the commit succeeds only if there have been no commits to the Row object since the most recent get() or update() of the Row object.
However, providers can be configured differently. For example, with BigMemory Max, if the .properties file sets useCompareAndSwap to false, the commit always succeeds, even if another monitor has committed a different value for that entry.
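The difference between the two commit behaviors can be illustrated with a small toy model. This is a sketch in Python, not EPL and not the MemoryStore or BigMemory Max API: the Store and Row classes, the per-row version counter, and the compare_and_swap flag are all assumptions made for illustration, with the flag standing in for the useCompareAndSwap property.

```python
class Store:
    """Toy stand-in for one table: maps key -> (version, value)."""

    def __init__(self, compare_and_swap=True):
        # Hypothetical flag mirroring the useCompareAndSwap property.
        self.compare_and_swap = compare_and_swap
        self.entries = {}

    def get(self, key):
        version, value = self.entries.get(key, (0, None))
        return Row(self, key, version, value)


class Row:
    """A local copy of one entry, remembering the version it was read at."""

    def __init__(self, store, key, version, value):
        self.store, self.key = store, key
        self.version, self.value = version, value

    def commit(self):
        current_version, _ = self.store.entries.get(self.key, (0, None))
        # With compare-and-swap on, reject the commit if anyone else
        # has committed since this row was read.
        if self.store.compare_and_swap and current_version != self.version:
            return False
        self.version = current_version + 1
        self.store.entries[self.key] = (self.version, self.value)
        return True


def race(compare_and_swap):
    """Two readers fetch the same row, then both try to commit."""
    store = Store(compare_and_swap)
    first, second = store.get("row1"), store.get("row1")
    first.value = "from monitor 1"
    second.value = "from monitor 2"
    return first.commit(), second.commit(), store.get("row1").value
```

In this model, race(True) rejects the second commit and keeps the first value, while race(False) accepts both commits, so the second value silently overwrites the first.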
Unlike in-memory stores, for Row objects from a distributed store, a Table.get() or Row.update() may return an older value, that is, a previously committed value, even if a more recent commit has completed. This is because a distributed store may cache data. Reads are eventually consistent: after some unspecified time, a later get() or update() of the Row object should retrieve the latest value. Typically, a commit of a Row object whose get() did not retrieve the latest value flushes any local cache of the value, so the first commit fails, but a subsequent update and commit succeed.
Again, providers can be configured differently. For the BigMemory Max driver, setting the terracottaConfiguration.consistency property to STRONG will ensure that after a commit(), a get() on any node will retrieve the latest version. This STRONG consistency mode is more expensive than EVENTUAL consistency.
An example: Monitor1 gets and modifies a row and sends an EPL event to Monitor2, which in response to the event gets and updates the row. In the table below, the event has "overtaken" the change to the row: the effects of changing the row and sending the event are observed in the reverse order (the event is seen before the change to the row).
Time: | Monitor1 (on node 1) | Monitor2 (on node 2) |
1 | Table.get("row1") = "abc" | |
1.2 | Change row to be "abcdef" | Table.get("row1") = "abc" (cached locally) |
1.3 | Row.commit("row1" as "abcdef") succeeds | |
1.301 | Send event to node 2 | |
1.302 | | Receive event from node 1 |
1.303 | | Table.get("row1") = "abc" (from local cache) |
1.4 | | Update row to be "abcghi" |
1.5 | | Row.commit("row1" as "abcghi") fails (not last value) |
1.6 | | Row.update() = "abcdef" |
1.7 | | Update row to be "abcdefghi" |
1.8 | | Row.commit("row1" as "abcdefghi") succeeds |
At 1.303, an in-memory store (when two contexts are communicating in the same process) would be guaranteed to retrieve the latest value, "abcdef", but a distributed store may cache values locally. The commit is guaranteed to fail when a stale value has been read, because the check for whether the row is up to date does not rely on cached values.
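The sequence in the table can be replayed with a toy model. This is a sketch in Python rather than EPL, and it does not use the MemoryStore API: SharedStore, Node, Row, the version counter, and the cache-flush-on-failed-commit behavior are assumptions chosen to mimic the typical behavior described above.

```python
class SharedStore:
    """Authoritative copy of the table: key -> (version, value)."""

    def __init__(self):
        self.entries = {}

    def read(self, key):
        return self.entries[key]

    def compare_and_set(self, key, expected_version, value):
        version, _ = self.entries.get(key, (0, None))
        if version != expected_version:
            return False
        self.entries[key] = (version + 1, value)
        return True


class Node:
    """One node's view of the table, with a local read cache."""

    def __init__(self, shared):
        self.shared = shared
        self.cache = {}

    def get(self, key):
        # Reads may be served from the local cache, so they can be stale.
        if key not in self.cache:
            self.cache[key] = self.shared.read(key)
        version, value = self.cache[key]
        return Row(self, key, version, value)


class Row:
    def __init__(self, node, key, version, value):
        self.node, self.key = node, key
        self.version, self.value = version, value

    def commit(self):
        # The up-to-date check goes to the shared store, never the cache.
        ok = self.node.shared.compare_and_set(self.key, self.version, self.value)
        if ok:
            self.version += 1
            self.node.cache[self.key] = (self.version, self.value)
        else:
            # A failed commit flushes the stale cache entry.
            self.node.cache.pop(self.key, None)
        return ok

    def update(self):
        # Refresh this row (and the cache) from the shared store.
        self.version, self.value = self.node.shared.read(self.key)
        self.node.cache[self.key] = (self.version, self.value)


def replay():
    shared = SharedStore()
    shared.entries["row1"] = (1, "abc")
    node1, node2 = Node(shared), Node(shared)
    log = []

    row_a = node1.get("row1")       # 1: Monitor1 reads "abc"
    node2.get("row1")               # 1.2: Monitor2 reads and caches "abc"
    row_a.value += "def"            # 1.2: Monitor1 changes the row
    log.append(row_a.commit())      # 1.3: commit succeeds
    row_b = node2.get("row1")       # 1.303: stale "abc" from local cache
    log.append(row_b.value)
    row_b.value += "ghi"            # 1.4
    log.append(row_b.commit())      # 1.5: commit fails
    row_b.update()                  # 1.6: now sees "abcdef"
    log.append(row_b.value)
    row_b.value += "ghi"            # 1.7
    log.append(row_b.commit())      # 1.8: commit succeeds
    log.append(shared.read("row1")[1])
    return log
```

The point of the design is that get() may be answered locally while commit() always compares against the shared copy, so a stale read can delay an update but cannot overwrite a newer value.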