Cloud-Edge-End Collaborative Inference in Mobile Networks: Challenges and Solutions

Xixi Zheng, Weiting Zhang, Chenfei Hu, Liehuang Zhu, Chuan Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Applying artificial intelligence (AI) to mobile networks can significantly enhance inference services, driving the rapid evolution towards smarter, more efficient networks. Next-generation mobile networks are expected to deliver higher data rates and lower latency, expectations that traditional large-model inference struggles to meet. Inference architectures and models must therefore adapt quickly to growing demands and increasingly complex tasks. To tackle these issues, the cloud-edge-end collaborative inference framework has emerged as an effective solution: it enables intelligent collaboration across cloud, edge, and end devices, making efficient use of computational resources at every tier. Building on this approach, this article proposes a collaborative inference system designed for mobile networks that addresses critical issues such as computational bottlenecks and inference latency in traditional model deployments. The system integrates a task offloading strategy with model acceleration techniques to reduce inference latency. Experimental results show that the system significantly lowers latency while maintaining high-quality output. Finally, the article discusses future development trends in cloud-edge-end collaborative inference systems.
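
The article's own offloading algorithm is not reproduced on this page. As a rough, hypothetical sketch of the idea behind latency-aware task offloading across the three tiers (every name, class, and number below is illustrative, not the authors' method), the following Python snippet picks whichever of the end device, edge, and cloud minimizes predicted end-to-end latency, modeled as compute time plus transmission time plus round-trip delay:

```python
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    flops_per_s: float   # effective compute throughput at this tier (assumed)
    uplink_bps: float    # bandwidth from the device to this tier; inf = local
    rtt_s: float         # round-trip propagation delay to this tier

def predicted_latency(task_flops: float, input_bits: float, tier: Tier) -> float:
    """Compute time plus transfer time plus round-trip delay at one tier."""
    transfer = 0.0 if tier.uplink_bps == float("inf") else input_bits / tier.uplink_bps
    return task_flops / tier.flops_per_s + transfer + tier.rtt_s

def choose_tier(task_flops: float, input_bits: float, tiers: list[Tier]) -> Tier:
    """Offload the task to the tier with the lowest predicted latency."""
    return min(tiers, key=lambda t: predicted_latency(task_flops, input_bits, t))

# Illustrative numbers only: a 10-GFLOP inference step with a 2-MB input.
tiers = [
    Tier("end",   flops_per_s=5e10, uplink_bps=float("inf"), rtt_s=0.0),
    Tier("edge",  flops_per_s=5e11, uplink_bps=1e8,          rtt_s=0.01),
    Tier("cloud", flops_per_s=5e12, uplink_bps=5e7,          rtt_s=0.05),
]
best = choose_tier(task_flops=1e10, input_bits=1.6e7, tiers=tiers)
print(f"offload to: {best.name}")
```

In this toy setting the edge wins: the end device is too slow for the 10-GFLOP step, while shipping the 2-MB input to the cloud costs more in transfer time than the cloud's faster compute saves.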

Original language: English
Journal: IEEE Network
DOIs
Publication status: Accepted/In press - 2025

Keywords

  • artificial intelligence
  • cloud-edge-end collaboration
  • collaborative inference
  • large models
  • mobile network
