D2T: Doubly Distributed Transactions
Project Lead: Jay Lofstead (gflofst@sandia.gov)
This project investigated the potential for using two-phase commit within HPC workflows to offer bracketing of single data sets and to accommodate consistency and correctness determination. The motivation was the observation that an MxN periodic data movement from compute to data staging had no general mechanism for the staging area to determine when a data set was complete, and therefore ready for consumption, or for the compute area to determine when it could safely delete its copies.
The initial phase demonstrated the possibility of having coordination on both the client and server sides with a single control channel. The revised protocol eliminated the server-side requirements, reducing the barrier to use, and greatly enhanced scalability through the use of an aggregation tree for gather/scatter operations.
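As a rough illustration of the idea, the sketch below simulates a two-phase commit in which votes are gathered up an aggregation tree and the decision is scattered back down it. It is a single-process toy, not the D2T implementation: the function names (`children`, `gather_votes`, `scatter_decision`, `two_phase_commit`), the k-ary tree rooted at rank 0, and the default fan-out of 2 are all assumptions made for the example.

```python
from typing import List, Optional


def children(rank: int, size: int, fanout: int = 2) -> List[int]:
    """Ranks of the children of `rank` in a k-ary tree rooted at rank 0 (assumed layout)."""
    first = rank * fanout + 1
    return [c for c in range(first, first + fanout) if c < size]


def gather_votes(rank: int, votes: List[bool], fanout: int = 2) -> bool:
    """Phase 1 (gather): combine this rank's vote with the votes of its subtree."""
    ok = votes[rank]
    for child in children(rank, len(votes), fanout):
        # Each parent forwards a single aggregated vote upward,
        # so the root hears from at most `fanout` children.
        ok = gather_votes(child, votes, fanout) and ok
    return ok


def scatter_decision(rank: int, size: int, decision: str,
                     outcome: List[Optional[str]], fanout: int = 2) -> None:
    """Phase 2 (scatter): push the root's decision down the same tree."""
    outcome[rank] = decision
    for child in children(rank, size, fanout):
        scatter_decision(child, size, decision, outcome, fanout)


def two_phase_commit(votes: List[bool], fanout: int = 2) -> List[Optional[str]]:
    """Return the decision each rank ends up with; assumes at least one participant."""
    decision = "commit" if gather_votes(0, votes, fanout) else "abort"
    outcome: List[Optional[str]] = [None] * len(votes)
    scatter_decision(0, len(votes), decision, outcome, fanout)
    return outcome


if __name__ == "__main__":
    print(two_phase_commit([True] * 7))
    print(two_phase_commit([True, True, False, True, True]))
```

In this sketch the coordinator at rank 0 exchanges messages only with its direct children, which is the property that lets a tree-based gather/scatter scale better than having every participant report to one coordinator. A single "no" vote anywhere in the tree causes every rank to abort.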
Fault detection and recovery operations have since been added and tested, and this work has been submitted for publication.
Source code is available on request. It will soon be linked from this page and made available on GitHub and in the Trilinos trios capability area.
Publications:
Conference
- Jay Lofstead, Jai Dayal, Karsten Schwan, Ron Oldfield. "D2T: Doubly Distributed Transactions for High Performance and Distributed Computing". In Proceedings of Cluster Computing. Beijing, China, September 2012. pdf BibTeX
Workshop Papers
- Jay Lofstead, Ivo Jimenez, Carlos Maltzahn. "Consistency and Fault Tolerance Considerations for the Next Iteration of the DOE Fast Forward Storage and IO Project". In IASDS @ ICPP 2014. pdf BibTeX
- Jay Lofstead, Jai Dayal, Ivo Jimenez, Carlos Maltzahn. "Efficient Transactions for Parallel Data Movement", In Proceedings of Parallel Data Storage Workshop at Supercomputing 2013, Denver, CO. pdf BibTeX
- Jai Dayal, Jianting Cao, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Fang Zheng, Hasan Abbasi, Scott Klasky, Norbert Podhorszki, Jay Lofstead. "I/O Containers: Managing the Data Analytics and Visualization Pipelines of High End Codes", In International Workshop on High Performance Data Intensive Computing at IPDPS 2013, Boston, MA. pdf BibTeX
- Jay Lofstead and Jai Dayal. "Transactional Parallel Metadata Services for Integrated Application Workflows", In Proceedings of High Performance Computing Meets Databases at Supercomputing 2012, Salt Lake City, Utah. pdf BibTeX
- Jay Lofstead and Jai Dayal. "Extending MPI to Better Support Multi-Application Interaction", In Proceedings of IMUDI Workshop at EuroMPI 2012, Vienna, Austria. pdf BibTeX
Posters
- Jay Lofstead, Jai Dayal, Ivo Jimenez, Carlos Maltzahn. "Failure Detection and Recovery for Doubly Distributed Transactions for Parallel and Distributed Computing", In Proceedings of HPDC 2014, Vancouver, BC, Canada. June 2014. pdf (poster) pdf (abstract)
- Ivo Jimenez, Carlos Maltzahn, Jai Dayal, Jay Lofstead. "Exploring Trade-offs in Transactional Parallel Data Movement", In Proceedings of Parallel Data Storage Workshop at Supercomputing 2013, Denver, CO. pdf (poster) pdf (slides)
- Jai Dayal, Jay Lofstead, Karsten Schwan, Ron Oldfield. "D2T: Doubly Distributed Transactions for High Performance and Distributed Computing", In Proceedings of HPDC 2013, New York, New York. pdf (abstract) pdf (poster) pdf (slides)
- Jai Dayal, Jay Lofstead, Karsten Schwan, Ron Oldfield. "D2T: Doubly Distributed Transactions for High Performance and Distributed Computing", In Proceedings of HPDC 2012, Delft, Netherlands. pdf (abstract) pdf (poster)
- Jai Dayal, Jay Lofstead, Karsten Schwan, Ron Oldfield. "Resilient Data Staging Through MxN Distributed Transactions", In Proceedings of Petascale Data Storage Workshop 2011 at Supercomputing 2011, Seattle, Washington. pdf
Last Modified: January 15, 2018