Learning Multi-Agent Reservoir Cooperative Operations Over Multi-Relational Directed Acyclic Graph

Research output: Contribution to journal > Article > peer-review

Abstract

Operating large multi-reservoir systems is critical for effective water allocation, hydropower generation, and economic development. However, conventional robust planning-based methods scale poorly beyond single-reservoir or cascaded systems due to computational intractability. Addressing inter-reservoir relation extraction and coordination under conflicting objectives, uncertain inflows, and complex network couplings therefore remains an open challenge. To tackle this problem, we introduce the multi-agent reservoir cooperative operation (MARCO) environment, which integrates multiple objectives, heterogeneous topology, and stochastic inflows in a unified framework. An algorithm is then designed to construct a multi-relational directed acyclic graph (MR-DAG) that encodes the underlying topology through coupled objectives and entity relations. Building on this representation, we propose the multi-agent relational directed acyclic graph transformer (MAR-DAGT), a reinforcement learning algorithm that performs typed message passing for efficient feature extraction and employs acyclic decision-making to exploit causal structure for improved credit assignment. Extensive experiments on MARCO show that MAR-DAGT consistently outperforms other MARL and optimization-based baselines in terms of objective satisfaction and robustness to inflow uncertainty.

Note to Practitioners: This paper investigates the cooperative operations in multi-reservoir water allocation networks. Traditionally, this problem has been addressed through planning-based methods that offer robustness guarantees. However, these methods face significant limitations as the number of reservoir stakeholders increases, resulting in exponential growth in problem complexity. With the growing demand for high-temporal granularity scheduling and refined reservoir management strategies, the number of involved stakeholders continues to rise, further complicating the network topology. To overcome these challenges, we propose a learning-based operational method for multi-reservoir systems. The proposed approach effectively uncovers the underlying network topology and captures causal relationships among multiple objectives within the reservoir network under inflow uncertainties. Experimental evaluations conducted on synthetic reservoir cooperation scenarios demonstrate that the proposed method surpasses traditional planning-based approaches and existing learning-based solutions across several performance metrics, including civilian water supply, hydropower generation, and ecological conservation. Moreover, the sensitivity analysis conducted under out-of-distribution inflow noise empirically examines the stability of the proposed approach. Although the proposed method has not yet been validated in real-world reservoir systems, it shows promising potential to enhance the economic and ecological efficiency of large-scale multi-reservoir management practices. Future research will focus on extending the method to accommodate dynamic topologies in water allocation networks.
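The two structural ideas named in the abstract, a graph with typed relations and decisions taken in acyclic order, can be illustrated with a toy sketch. This is not the MR-DAG construction algorithm or the MAR-DAGT model. The reservoir names, the relation types `flows_to` and `shares_demand`, the scalar features, and the helper functions are all hypothetical. Plain per-relation summation stands in for the transformer-based typed message passing.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical typed edges (source, relation, target); illustrative only.
EDGES = [
    ("res_A", "flows_to", "res_B"),
    ("res_A", "flows_to", "res_C"),
    ("res_B", "flows_to", "res_D"),
    ("res_C", "flows_to", "res_D"),
    ("res_B", "shares_demand", "res_C"),
]


def decision_order(edges):
    """Return an order in which every node comes after all its predecessors."""
    preds = defaultdict(set)
    for src, _rel, dst in edges:
        preds[dst].add(src)
        preds.setdefault(src, set())  # make sure source-only nodes appear
    # Raises graphlib.CycleError if the edges do not form a DAG.
    return list(TopologicalSorter(preds).static_order())


def aggregate_by_type(edges, features):
    """Sum incoming neighbour features separately for each relation type."""
    messages = defaultdict(lambda: defaultdict(float))
    for src, rel, dst in edges:
        messages[dst][rel] += features[src]
    return {node: dict(by_rel) for node, by_rel in messages.items()}


if __name__ == "__main__":
    # Hypothetical scalar feature per reservoir (e.g. a normalised storage level).
    features = {"res_A": 1.0, "res_B": 2.0, "res_C": 3.0, "res_D": 4.0}
    print(decision_order(EDGES))
    print(aggregate_by_type(EDGES, features))
```

Keeping one aggregate per relation type means a hydraulic link and a shared-demand link are never mixed into a single message, and the topological order lets each agent condition on the actions of the agents that precede it in the graph.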

Original language: English
Pages (from-to): 846-858
Number of pages: 13
Journal: IEEE Transactions on Automation Science and Engineering
Volume: 23
DOIs
Publication status: Published - 2026
Externally published: Yes

Keywords

  • Multi-reservoir operations
  • directed acyclic graph
  • heterogeneous graph
  • multi-agent reinforcement learning
  • uncertainty
