D2T: Doubly Distributed Transactions

Project Lead: Jay Lofstead (gflofst@sandia.gov)

This project investigated using two-phase commit within HPC workflows to bracket individual data sets and to support consistency and correctness checks. The motivation was the observation that a periodic MxN data movement from the compute area to the data staging area had no general mechanism for the staging area to determine when a data set was complete, and therefore ready for consumption, or for the compute area to know when it could safely delete its copies.

The initial phase demonstrated that coordination on both the client and server sides is possible with a single control channel. The revised protocol eliminated the server-side requirements, lowering the barrier to use, and greatly enhanced scalability by using an aggregation tree for gather/scatter operations.
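The gather/scatter pattern over an aggregation tree can be illustrated with a small in-process simulation. This is a minimal sketch and not the D2T implementation: the names Node, build_tree, gather_votes, scatter_decision, and run_transaction are hypothetical, the tree shape (a k-ary tree rooted at rank 0) is an assumption, and plain function calls stand in for the messages a real system would send.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One participant in a hypothetical aggregation tree."""
    rank: int
    vote_ok: bool = True            # local vote: True means "ready to commit"
    children: List["Node"] = field(default_factory=list)
    state: str = "pending"


def build_tree(votes, fanout=2):
    """Arrange participants as a k-ary tree rooted at rank 0 (assumed layout)."""
    nodes = [Node(rank=i, vote_ok=v) for i, v in enumerate(votes)]
    for i, node in enumerate(nodes):
        if i > 0:
            nodes[(i - 1) // fanout].children.append(node)
    return nodes[0], nodes


def gather_votes(node):
    """Phase 1 (gather): a subtree votes yes only if every member votes yes."""
    child_votes = [gather_votes(child) for child in node.children]
    return node.vote_ok and all(child_votes)


def scatter_decision(node, decision):
    """Phase 2 (scatter): push the root's decision down to every participant."""
    node.state = decision
    for child in node.children:
        scatter_decision(child, decision)


def run_transaction(votes, fanout=2):
    """Run one two-phase commit round and return each participant's final state."""
    root, nodes = build_tree(votes, fanout)
    decision = "committed" if gather_votes(root) else "aborted"
    scatter_decision(root, decision)
    return [n.state for n in nodes]
```

In this sketch each participant exchanges votes only with its parent and children, so the root handles at most fanout replies rather than one per participant, which is the scalability benefit an aggregation tree is meant to provide. A single "no" vote anywhere in the tree causes every participant to abort.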

Fault detection and recovery operations have been implemented and tested, and the work has been submitted for publication.

Source code is available on request and will soon be linked from this page, on GitHub, and in the Trilinos trios capability area.



Workshop Papers



Last Modified: January 15, 2018