Accelerating Training Convergence for Point Cloud Semantic Segmentation of Large-Scale Urban Scenes with Scene-Ensemble Prototypes

  • Jiawei Han
  • Kaiqi Liu*
  • Wei Li
  • Musen Lin
  • Wei Li
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Point cloud semantic segmentation is a vital tool in remote sensing, and existing segmentation networks achieve commendable results. However, their complex architectures and extensive training data often demand substantial time and computational resources before the model converges. This study proposes a novel method, rapid convergence with scene-ensemble prototypes (RCSP), that significantly speeds up the convergence of point cloud semantic segmentation networks and thereby saves computational resources. RCSP uses the knowledge of point clouds held by a temporal ensemble of the base segmentation network to supervise training. This knowledge is concretized as scene-ensemble prototypes and soft category prediction probabilities, which provide additional constraints beyond category labels, further narrowing the convergence direction of the segmentation network so that it reaches the optimal solution earlier. Experimental evaluations on large-scale urban scenes (Toronto-3D, SemanticKITTI, ISPRS) and urban indoor environments (S3DIS, ScanNet v2) demonstrate that RCSP achieves model convergence in approximately 30% of the iterations and 45% of the training time required by the baseline network under equivalent GPU memory constraints. Furthermore, the proposed framework delivers substantial improvements in segmentation performance over the baseline.
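The abstract does not give the formulation of RCSP, so the following is only a minimal sketch of the general idea it describes: a temporal-ensemble copy of the network supplies class prototypes and soft predictions that act as extra supervision alongside the category labels. Everything specific here is an assumption made for illustration and not taken from the paper: the temporal ensemble is modeled as an exponential moving average of weights, prototypes as per-class mean feature vectors, the soft-prediction term as a KL divergence, and all function names, the decay value, and the loss weights are hypothetical.

```python
import math

def ema_update(teacher, student, decay=0.99):
    """Assumed temporal ensemble: move each teacher weight toward the student weight."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def class_prototypes(features, labels, num_classes):
    """Assumed prototype: mean feature vector per class (None if the class is absent)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, y in zip(features, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += feat[d]
    return [[v / counts[c] for v in sums[c]] if counts[c] else None
            for c in range(num_classes)]

def softmax(logits):
    """Numerically stable softmax over one point's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def prototype_mse(protos_a, protos_b):
    """Mean squared distance between prototypes of classes present in both sets."""
    dists = [sum((a - b) ** 2 for a, b in zip(pa, pb)) / len(pa)
             for pa, pb in zip(protos_a, protos_b)
             if pa is not None and pb is not None]
    return sum(dists) / len(dists) if dists else 0.0

def total_loss(student_logits, teacher_logits, student_feats, teacher_feats,
               labels, num_classes, w_soft=1.0, w_proto=1.0):
    """Label cross-entropy plus two hypothetical teacher-derived constraint terms."""
    n = len(labels)
    ce = 0.0
    soft = 0.0
    for s, t, y in zip(student_logits, teacher_logits, labels):
        ps, pt = softmax(s), softmax(t)
        ce += -math.log(ps[y] + 1e-12)   # supervision from category labels
        soft += kl_divergence(pt, ps)    # supervision from soft teacher predictions
    proto = prototype_mse(class_prototypes(student_feats, labels, num_classes),
                          class_prototypes(teacher_feats, labels, num_classes))
    return ce / n + w_soft * soft / n + w_proto * proto
```

In this sketch, when the student already matches the temporal-ensemble network, both extra terms vanish and only the label cross-entropy remains; the extra terms contribute only while the two disagree, which is one way to read "additional constraints beyond category labels".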

Original language: English
Article number: 5700814
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 64
Publication status: Published - 2026
Externally published: Yes

Keywords

  • Point cloud semantic segmentation
  • convergence
  • large-scale urban scenes
  • prototypes
