Distributed store transactional and data safety guarantees
The commit() action on a Row object from a distributed store by default behaves similarly to an in-memory store's Row object, in that the commit succeeds only if there have been no commits to the Row object since the most recent get() or update() of the Row object.
However, providers can be configured differently. For example, if using TCStore or BigMemory Max, and the .properties specifies useCompareAndSwap as false, then the commit will always succeed, even if another monitor committed a different value for that entry.
Unlike in-memory stores, for Row objects from a distributed store, a Table.get() or Row.update() may return an older value, that is, a previously committed value, even if a more recent commit has completed. This is because a distributed store may perform caching of data. After some undefined time, the get() should be eventually consistent; a later get() or update() of the Row object should retrieve the latest value. Typically, a commit of a Row object where the get() has not retrieved the latest value will flush any local cache of the value, thus the first commit will fail, but a subsequent update and commit will succeed.
Again, providers can be configured differently. For the BigMemory Max driver, setting the terracottaConfiguration.consistency property to STRONG will ensure that after a commit(), a get() on any node will retrieve the latest version. This STRONG consistency mode is more expensive than EVENTUAL consistency.
An example: Monitor1 gets and modifies a row and sends an EPL event to Monitor2 which in response to the event gets and updates the row. In the table below, the event has overtaken the change to the row; the effects of changing the row and sending the event are observed in the reverse order (the event is seen before the change to the row).
Time | Monitor1 (on node 1) | Monitor2 (on node 2) |
1 | Table.get("row1") = "abc" | |
1.2 | Change row to be "abcdef" | Table.get("row1") = "abc" (cached locally) |
1.3 | Row.commit("row1" as "abcdef") succeeds | |
1.301 | Send event to node 2 | |
1.302 | | Receive event from node 1 |
1.303 | | Table.get("row1") = "abc" from local cache) |
1.4 | | Update row to be "abcghi" |
1.5 | | Row.commit("row1 as "abcghi") fails (not last value) |
1.6 | | Row.update() = "abcdef" |
1.7 | | Update row to be "abcdefghi" |
1.8 | | Row.commit("row1" as "abcdefghi") succeeds |
At 1.303, an in-memory cache (when two contexts are communicating in the same process) would be guaranteed to retrieve the latest value, "abcdef", but a distributed store may cache values locally. The commit is guaranteed to fail when a stale value is read, as it does not rely on cached values for checking whether the row is up to date or not.
Different providers may have other differences in behavior. In particular, they may differ in whether or not they use referential checking, that is, if one client reads a row, and meanwhile the row is modified but then modified back to the first value, the client with the old Row object may or may not succeed in performing a commit(). Some providers require the row to not have changed since the row was read (even if then changed back to the value at the point it was read), while others will only compare the contents of the row.
Also avoid relying on comparisons based on the float values NaN and -0.0. Some providers may treat NaN as not equal to any value, including a NaN, which will result in the commit() method never being able to complete. Providers may differ in whether 0.0 and -0.0 are treated as the same value or not. Consider this to be undefined behavior.