Event-Triggered Deep Reinforcement Learning for Dynamic Task Scheduling in Multisatellite Resource Allocation

Kaixin Cui; Jiliang Song; Lei Zhang; Ying Tao; Wei Liu; Dawei Shi

doi:10.1109/TAES.2022.3231239

Kaixin Cui, Jiliang Song, Lei Zhang, Ying Tao, Wei Liu, Dawei Shi^*

^*Corresponding author for this work

School of Automation

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)

Abstract

In this work, we investigate the problem of multisatellite resource allocation for expected long-term performance optimization with a dynamic task network model, where communication tasks generated by task satellites are expected to be transmitted by resource satellites in the application layer, and the set of tasks changes with satellite orbital motions. The features of the tasks include priority, execution duration, visible time, etc. Since the feature information has a high dimension and changes with time, the scheduling problem is formulated as a dynamic combinatorial optimization problem and a receding-horizon task scheduling algorithm based on the event-triggered deep reinforcement learning is proposed. A residual-fully connected network is designed to extract the features of the complex task network model, and a deep double Q-learning iteration with the experience replay memory mechanism is employed to change the allocation strategy by evaluated rewards adaptively. An event-triggered strategy is then proposed to handle urgent tasks online. Numerical simulations show the performance improvement of the proposed algorithm. For the scenario of 50 task satellites and ten resource satellites, the proposed algorithm achieves 4.1%, 5.9%, and 11.4% higher reward scores than the static deep reinforcement learning algorithm, the data-driven parallel scheduling algorithm, and the improved genetic algorithm, respectively. The computation time of the proposed algorithm is only 34.7% and 21.3% of that of the latter two algorithms, and is similar to that of the static deep reinforcement learning algorithm.
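
The abstract names a deep double Q-learning iteration with an experience replay memory but does not state the update rule. As a rough illustration only, the sketch below shows the standard double Q-learning target computed over a toy replay buffer. The `ReplayMemory` class, the `double_q_target` function, the tabular Q-values standing in for networks, the discount `gamma`, and the transition layout are all assumptions made for this example and are not taken from the paper.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples.

    Hypothetical layout chosen for this sketch, not the paper's data structure.
    """

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self.buffer), batch_size)


def double_q_target(reward, next_state, done, q_online, q_target, gamma=0.9):
    """Standard double Q-learning target.

    The online values select the next action; the target values score it.
    """
    if done:
        return reward
    next_online = q_online[next_state]
    best_action = max(range(len(next_online)), key=next_online.__getitem__)
    return reward + gamma * q_target[next_state][best_action]


# Toy Q-tables (state -> list of action values), stand-ins for two networks.
q_online = {0: [0.2, 0.5], 1: [1.0, 0.3]}
q_target = {0: [0.1, 0.4], 1: [0.6, 0.9]}

memory = ReplayMemory(capacity=100)
memory.push((0, 1, 1.0, 1, False))  # non-terminal transition into state 1
memory.push((1, 0, 2.0, 0, True))   # terminal transition

targets = [
    double_q_target(r, s2, d, q_online, q_target)
    for (_, _, r, s2, d) in memory.buffer
]
```

For the first transition, the online table picks action 0 in state 1 and the target table scores that action at 0.6, rather than using the target table's own maximum of 0.9. Decoupling selection from evaluation in this way is what distinguishes double Q-learning from the plain Q-learning target.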

Original language	English
Pages (from-to)	3766-3777
Number of pages	12
Journal	IEEE Transactions on Aerospace and Electronic Systems
Volume	59
Issue number	4
DOIs	https://doi.org/10.1109/TAES.2022.3231239
Publication status	Published - 1 Aug 2023

Keywords

Dynamic combinatorial optimization
event-triggered deep reinforcement learning
receding-horizon optimization
residual-fully connected network
resource allocation

Cite this

@article{24b7718669df4091856c619393e93365,
title = "Event-Triggered Deep Reinforcement Learning for Dynamic Task Scheduling in Multisatellite Resource Allocation",
abstract = "In this work, we investigate the problem of multisatellite resource allocation for expected long-term performance optimization with a dynamic task network model, where communication tasks generated by task satellites are expected to be transmitted by resource satellites in the application layer, and the set of tasks changes with satellite orbital motions. The features of the tasks include priority, execution duration, visible time, etc. Since the feature information has a high dimension and changes with time, the scheduling problem is formulated as a dynamic combinatorial optimization problem and a receding-horizon task scheduling algorithm based on the event-triggered deep reinforcement learning is proposed. A residual-fully connected network is designed to extract the features of the complex task network model, and a deep double Q-learning iteration with the experience replay memory mechanism is employed to change the allocation strategy by evaluated rewards adaptively. An event-triggered strategy is then proposed to handle urgent tasks online. Numerical simulations show the performance improvement of the proposed algorithm. For the scenario of 50 task satellites and ten resource satellites, the proposed algorithm achieves 4.1%, 5.9%, and 11.4% higher reward scores than the static deep reinforcement learning algorithm, the data-driven parallel scheduling algorithm, and the improved genetic algorithm, respectively. The computation time of the proposed algorithm is only 34.7% and 21.3% of that of the latter two algorithms, and is similar to that of the static deep reinforcement learning algorithm.",

keywords = "Dynamic combinatorial optimization, event-triggered deep reinforcement learning, receding-horizon optimization, residual-fully connected network, resource allocation",
author = "Kaixin Cui and Jiliang Song and Lei Zhang and Ying Tao and Wei Liu and Dawei Shi",
note = "Publisher Copyright: {\textcopyright} 1965-2011 IEEE.",
year = "2023",
month = aug,
day = "1",
doi = "10.1109/TAES.2022.3231239",
language = "English",
volume = "59",
pages = "3766--3777",
journal = "IEEE Transactions on Aerospace and Electronic Systems",
issn = "0018-9251",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",
}

TY - JOUR
T1 - Event-Triggered Deep Reinforcement Learning for Dynamic Task Scheduling in Multisatellite Resource Allocation
AU - Cui, Kaixin
AU - Song, Jiliang
AU - Zhang, Lei
AU - Tao, Ying
AU - Liu, Wei
AU - Shi, Dawei
PY - 2023/8/1
Y1 - 2023/8/1

N2 - In this work, we investigate the problem of multisatellite resource allocation for expected long-term performance optimization with a dynamic task network model, where communication tasks generated by task satellites are expected to be transmitted by resource satellites in the application layer, and the set of tasks changes with satellite orbital motions. The features of the tasks include priority, execution duration, visible time, etc. Since the feature information has a high dimension and changes with time, the scheduling problem is formulated as a dynamic combinatorial optimization problem and a receding-horizon task scheduling algorithm based on the event-triggered deep reinforcement learning is proposed. A residual-fully connected network is designed to extract the features of the complex task network model, and a deep double Q-learning iteration with the experience replay memory mechanism is employed to change the allocation strategy by evaluated rewards adaptively. An event-triggered strategy is then proposed to handle urgent tasks online. Numerical simulations show the performance improvement of the proposed algorithm. For the scenario of 50 task satellites and ten resource satellites, the proposed algorithm achieves 4.1%, 5.9%, and 11.4% higher reward scores than the static deep reinforcement learning algorithm, the data-driven parallel scheduling algorithm, and the improved genetic algorithm, respectively. The computation time of the proposed algorithm is only 34.7% and 21.3% of that of the latter two algorithms, and is similar to that of the static deep reinforcement learning algorithm.

KW - Dynamic combinatorial optimization
KW - event-triggered deep reinforcement learning
KW - receding-horizon optimization
KW - residual-fully connected network
KW - resource allocation
UR - http://www.scopus.com/inward/record.url?scp=85146214782&partnerID=8YFLogxK
U2 - 10.1109/TAES.2022.3231239
DO - 10.1109/TAES.2022.3231239
M3 - Article
AN - SCOPUS:85146214782
SN - 0018-9251
VL - 59
SP - 3766
EP - 3777
JO - IEEE Transactions on Aerospace and Electronic Systems
JF - IEEE Transactions on Aerospace and Electronic Systems
IS - 4
ER -
