
Purdue University researchers propose GTX: a transactional graph data system for HTAP workloads

Researchers at Purdue University have introduced GTX to address the challenge of processing large graphs under high-throughput read-write transactions while keeping graph analytics competitive. Efficient management of dynamic graphs is crucial for applications such as fraud detection, recommendation systems, and graph neural network training. Real-world graph updates often exhibit temporal locality and hotspots, which existing transactional graph systems handle poorly. The research aims to build a transactional graph data system that efficiently manages dynamic graphs with high update arrival rates, temporal locality, and hotspots while supporting concurrent graph analytics.

Current transactional graph systems often rely on coarse-grained concurrency control that adapts poorly to temporal locality and hotspots, so their performance can degrade under concurrent workloads with frequent updates. In contrast, GTX is a latch-free, write-optimized transactional graph data system. It uses atomic operations in place of latches, stores data in a delta-based multi-version format, and implements a hybrid transaction commit protocol.
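To make the idea of delta-based multi-version storage concrete, here is a minimal single-threaded Python sketch. It is not GTX's actual layout: the names `EdgeDelta`, `VertexDeltaLog`, and `neighbors` are hypothetical, and the sketch assumes deltas are appended in commit order. Each write is recorded as a new delta instead of overwriting data in place, and a reader sees only the deltas committed at or before its snapshot timestamp.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class EdgeDelta:
    """One recorded edge operation (hypothetical layout, for illustration)."""
    dst: int                       # destination vertex id
    op: str                        # "insert" or "delete"
    commit_ts: int                 # timestamp at which the writer committed
    payload: Optional[str] = None  # optional edge property


class VertexDeltaLog:
    """Append-only edge deltas for one source vertex, oldest first."""

    def __init__(self) -> None:
        self._deltas: List[EdgeDelta] = []

    def append(self, delta: EdgeDelta) -> None:
        # Assumption: callers append in increasing commit_ts order.
        self._deltas.append(delta)

    def neighbors(self, read_ts: int) -> Dict[int, Optional[str]]:
        """Return the edges visible to a snapshot taken at read_ts."""
        visible: Dict[int, Optional[str]] = {}
        for d in self._deltas:
            if d.commit_ts > read_ts:
                continue  # committed after the snapshot: invisible
            if d.op == "insert":
                visible[d.dst] = d.payload
            else:
                visible.pop(d.dst, None)
        return visible
```

Because writers only append, a long-running analytics query that reads at an older timestamp is not disturbed by later updates; it simply skips deltas newer than its snapshot.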

GTX also includes a delta-chain index that supports efficient edge lookups and lets concurrency control operate at the delta-chain level. Unlike existing systems, it is designed to adapt to temporal locality and hotspots in graph updates while sustaining high-throughput read-write transactions and competitive graph analytics performance.
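A rough way to picture a delta-chain index is the Python sketch below. It is an illustration under stated assumptions, not GTX's data structure: the class `DeltaChainIndex`, the modulo hash, and the `owner` array are all hypothetical, and the code is single-threaded, whereas a latch-free system would claim a chain with an atomic compare-and-swap. A vertex's edge deltas are split into several chains by hashing the destination id, so a lookup scans one chain only, and two writers conflict only when they touch the same chain.

```python
from typing import List, Optional, Tuple


class DeltaChainIndex:
    """Edge deltas of one vertex, partitioned into chains by destination id."""

    def __init__(self, num_chains: int = 4) -> None:
        self.num_chains = num_chains
        # chains[i] holds (dst, op) deltas, newest first.
        self.chains: List[List[Tuple[int, str]]] = [[] for _ in range(num_chains)]
        # owner[i] is the id of the transaction writing chain i, or None.
        self.owner: List[Optional[int]] = [None] * num_chains

    def chain_of(self, dst: int) -> int:
        # Hypothetical hash function: plain modulo on the destination id.
        return dst % self.num_chains

    def try_write(self, txn_id: int, dst: int, op: str) -> bool:
        """Prepend a delta if the chain is free or already owned by txn_id."""
        c = self.chain_of(dst)
        if self.owner[c] not in (None, txn_id):
            return False  # chain-level conflict: caller aborts or retries
        self.owner[c] = txn_id
        self.chains[c].insert(0, (dst, op))
        return True

    def release(self, txn_id: int) -> None:
        """Give up every chain held by txn_id (called at commit or abort)."""
        self.owner = [None if o == txn_id else o for o in self.owner]

    def has_edge(self, dst: int) -> bool:
        """Scan only the chain dst hashes to; the newest delta decides.

        Simplification: ignores commit timestamps and visibility.
        """
        for d, op in self.chains[self.chain_of(dst)]:
            if d == dst:
                return op == "insert"
        return False
```

In this sketch, updates to a hot vertex block each other only when their destinations hash to the same chain, which is the intuition behind controlling concurrency at the delta-chain level instead of per vertex.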

GTX’s architecture combines a latch-free, adjacency-list-based graph store with a transaction manager that implements the concurrency control protocol. The store is multi-versioned: each delta records a vertex or edge operation, which allows efficient access and updates. GTX lets concurrent transactions and analytics cooperate by managing concurrency at the delta-chain level and by using a hybrid group commit protocol, which raises overall throughput. It also uses the delta-chain index for efficient edge lookups and adapts its concurrency control to the workload history. The system is prototyped as a graph library and evaluated on real and synthetic datasets. According to the researchers, the experiments show that GTX can process real-world power-law graphs with temporal locality and hotspots while sustaining millions of transactions per second and competitive graph analytics performance.
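The group commit idea can be sketched in a few lines of Python. This is a simplified, single-threaded illustration and not the hybrid protocol from the paper: `Txn`, `GroupCommitManager`, `request_commit`, and `commit_group` are hypothetical names, and deltas are plain dictionaries. Transactions that are ready to commit queue up, and the manager stamps the whole batch with one commit timestamp before publishing it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Txn:
    """A transaction with its not-yet-committed deltas (hypothetical shape)."""
    txn_id: int
    deltas: List[Dict[str, Optional[int]]]
    commit_ts: Optional[int] = None


class GroupCommitManager:
    """Commits queued transactions in batches that share one timestamp."""

    def __init__(self) -> None:
        self.global_ts = 0             # latest timestamp visible to readers
        self.pending: List[Txn] = []   # transactions waiting to commit

    def request_commit(self, txn: Txn) -> None:
        self.pending.append(txn)

    def commit_group(self) -> int:
        """Stamp every pending transaction with one new commit timestamp."""
        if not self.pending:
            return self.global_ts
        commit_ts = self.global_ts + 1
        for txn in self.pending:
            for delta in txn.deltas:
                delta["commit_ts"] = commit_ts
            txn.commit_ts = commit_ts
        self.pending.clear()
        # Publish last: readers whose snapshot is >= commit_ts now see the group.
        self.global_ts = commit_ts
        return commit_ts
```

Batching amortizes the cost of advancing the global timestamp across many transactions, which is one plausible reason a group commit scheme helps throughput under high update rates.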

In summary, the researchers address the challenge of efficiently managing dynamic graphs with high update arrival rates, temporal locality, and hotspots. GTX, a latch-free, write-optimized transactional graph data system, is reported to outperform existing systems in transaction throughput and robustness across workloads. Its ability to adapt to temporal locality and hotspots while keeping graph analytics competitive makes it a promising tool for applications that need efficient graph management and analysis.


Visit the Paper. All credit for this research goes to the researchers of this project.



Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Kharagpur. She is a technology enthusiast and has a keen interest in the areas of software and data science applications. She always reads about developments in various areas of AI and ML.



