Multi-UAV Trajectory Design and Power Control Based on Deep Reinforcement Learning

Chiya Zhang, Shiyuan Liang, Chunlong He, Kezhi Wang

Research output: Contribution to journal › Article › peer-review


In this paper, we study a multi-unmanned aerial vehicle (multi-UAV), multi-user system in which the UAVs serve as aerial base stations (BSs) for ground users in the same frequency band, without knowing the users' locations or channel parameters. We aim to maximize the total throughput of all users while meeting a fairness requirement by optimizing the UAVs' trajectories and transmission power in a centralized way. This problem is non-convex and difficult to solve because the users' locations are unknown to the UAVs. We propose a deep reinforcement learning (DRL)-based solution, namely soft actor-critic (SAC), and model the problem as a Markov decision process (MDP). We carefully design a reward function that combines sparse and non-sparse rewards to balance exploitation and exploration. Simulation results show that the proposed SAC approach performs well in both training and testing.
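The abstract does not give the reward function itself, so the following is only a minimal sketch of what a reward mixing a non-sparse (dense) term with a sparse term could look like. Everything in it is an assumption made for illustration: the function names `user_rates` and `mixed_reward`, the rule that each user is served by its strongest UAV while the other UAVs interfere in the shared band, the minimum-rate fairness condition, and the `bonus` value. It is not the authors' formulation.

```python
import math


def user_rates(gains, powers, noise=1e-9, bandwidth=1.0):
    """Per-user Shannon rates in a shared frequency band (illustrative only).

    gains[u][k] is an assumed channel power gain from UAV u to user k,
    powers[u] is the transmit power of UAV u. Each user is assumed to be
    served by the UAV with the strongest received power; signals from all
    other UAVs count as interference.
    """
    n_users = len(gains[0])
    rates = []
    for k in range(n_users):
        received = [powers[u] * gains[u][k] for u in range(len(powers))]
        signal = max(received)
        interference = sum(received) - signal
        sinr = signal / (interference + noise)
        rates.append(bandwidth * math.log2(1.0 + sinr))
    return rates


def mixed_reward(rates, min_rate, bonus=10.0):
    """Dense term plus sparse term (hypothetical reward shape).

    Dense part: total throughput, available at every step.
    Sparse part: a fixed bonus granted only when every user reaches
    min_rate, standing in for an unspecified fairness requirement.
    """
    dense = sum(rates)
    sparse = bonus if all(r >= min_rate for r in rates) else 0.0
    return dense + sparse


if __name__ == "__main__":
    # Two UAVs, two users, each user close to a different UAV.
    gains = [[1.0, 0.0], [0.0, 1.0]]
    rates = user_rates(gains, powers=[1.0, 1.0], noise=1.0)
    print(rates, mixed_reward(rates, min_rate=0.5))
```

In a sketch like this, the dense term gives the agent a learning signal at every step, while the sparse bonus only appears once the fairness condition holds, which is one plausible way such a combination could affect the balance between exploration and exploitation.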

Original language: English
Pages (from-to): 192-201
Number of pages: 10
Journal: Journal of Communications and Information Networks
Issue number: 2
Publication status: Published - 1 Jun 2022

