PointGPT: Auto-regressively Generative Pre-training from Point Clouds

Guangyan Chen; Meiling Wang; Yi Yang; Kai Yu; Li Yuan; Yufeng Yue

PointGPT: Auto-regressively Generative Pre-training from Point Clouds

Guangyan Chen, Meiling Wang, Yi Yang, Kai Yu, Li Yuan^*, Yufeng Yue^*

^*Corresponding author for this work

School of Automation

Research output: Contribution to journal › Conference article › peer-review

8 Citations (Scopus)

Abstract

Large language models (LLMs) based on the generative pre-training transformer (GPT) [46] have demonstrated remarkable effectiveness across a diverse range of downstream tasks. Inspired by the advancements of the GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges associated with disorder properties, low information density, and task gaps. Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models. Our method partitions the input point cloud into multiple point patches and arranges them in an ordered sequence based on their spatial proximity. Then, an extractor-generator based transformer decoder [27], with a dual masking strategy, learns latent representations conditioned on the preceding point patches, aiming to predict the next one in an auto-regressive manner. To explore scalability and enhance performance, a larger pre-training dataset is collected. Additionally, a subsequent post-pre-training stage is introduced, incorporating a labeled hybrid dataset. Our scalable approach allows for learning high-capacity models that generalize well, achieving state-of-the-art performance on various downstream tasks. In particular, our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models. Furthermore, our method also attains new state-of-the-art accuracies on all four few-shot learning benchmarks. Codes are available at https://github.com/CGuangyan-BIT/PointGPT.

Original language	English
Journal	Advances in Neural Information Processing Systems
Volume	36
Publication status	Published - 2023
Event	37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States Duration: 10 Dec 2023 → 16 Dec 2023

Cite this

@article{3f10cc7014ba4d6992136a90be12166f,

title = "PointGPT: Auto-regressively Generative Pre-training from Point Clouds",

abstract = "Large language models (LLMs) based on the generative pre-training transformer (GPT) [46] have demonstrated remarkable effectiveness across a diverse range of downstream tasks. Inspired by the advancements of the GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges associated with disorder properties, low information density, and task gaps. Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models. Our method partitions the input point cloud into multiple point patches and arranges them in an ordered sequence based on their spatial proximity. Then, an extractor-generator based transformer decoder [27], with a dual masking strategy, learns latent representations conditioned on the preceding point patches, aiming to predict the next one in an auto-regressive manner. To explore scalability and enhance performance, a larger pre-training dataset is collected. Additionally, a subsequent post-pre-training stage is introduced, incorporating a labeled hybrid dataset. Our scalable approach allows for learning high-capacity models that generalize well, achieving state-of-the-art performance on various downstream tasks. In particular, our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models. Furthermore, our method also attains new state-of-the-art accuracies on all four few-shot learning benchmarks. Codes are available at https://github.com/CGuangyan-BIT/PointGPT.",

author = "Guangyan Chen and Meiling Wang and Yi Yang and Kai Yu and Li Yuan and Yufeng Yue",

note = "Publisher Copyright: {\textcopyright} 2023 Neural information processing systems foundation. All rights reserved.; 37th Conference on Neural Information Processing Systems, NeurIPS 2023 ; Conference date: 10-12-2023 Through 16-12-2023",

year = "2023",

language = "English",

volume = "36",

journal = "Advances in Neural Information Processing Systems",

issn = "1049-5258",

publisher = "Neural information processing systems foundation",

}

TY - JOUR

T1 - PointGPT

T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023

AU - Chen, Guangyan

AU - Wang, Meiling

AU - Yang, Yi

AU - Yu, Kai

AU - Yuan, Li

AU - Yue, Yufeng

PY - 2023

Y1 - 2023

N2 - Large language models (LLMs) based on the generative pre-training transformer (GPT) [46] have demonstrated remarkable effectiveness across a diverse range of downstream tasks. Inspired by the advancements of the GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges associated with disorder properties, low information density, and task gaps. Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models. Our method partitions the input point cloud into multiple point patches and arranges them in an ordered sequence based on their spatial proximity. Then, an extractor-generator based transformer decoder [27], with a dual masking strategy, learns latent representations conditioned on the preceding point patches, aiming to predict the next one in an auto-regressive manner. To explore scalability and enhance performance, a larger pre-training dataset is collected. Additionally, a subsequent post-pre-training stage is introduced, incorporating a labeled hybrid dataset. Our scalable approach allows for learning high-capacity models that generalize well, achieving state-of-the-art performance on various downstream tasks. In particular, our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models. Furthermore, our method also attains new state-of-the-art accuracies on all four few-shot learning benchmarks. Codes are available at https://github.com/CGuangyan-BIT/PointGPT.

AB - Large language models (LLMs) based on the generative pre-training transformer (GPT) [46] have demonstrated remarkable effectiveness across a diverse range of downstream tasks. Inspired by the advancements of the GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges associated with disorder properties, low information density, and task gaps. Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models. Our method partitions the input point cloud into multiple point patches and arranges them in an ordered sequence based on their spatial proximity. Then, an extractor-generator based transformer decoder [27], with a dual masking strategy, learns latent representations conditioned on the preceding point patches, aiming to predict the next one in an auto-regressive manner. To explore scalability and enhance performance, a larger pre-training dataset is collected. Additionally, a subsequent post-pre-training stage is introduced, incorporating a labeled hybrid dataset. Our scalable approach allows for learning high-capacity models that generalize well, achieving state-of-the-art performance on various downstream tasks. In particular, our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models. Furthermore, our method also attains new state-of-the-art accuracies on all four few-shot learning benchmarks. Codes are available at https://github.com/CGuangyan-BIT/PointGPT.

UR - http://www.scopus.com/inward/record.url?scp=85187431586&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85187431586

SN - 1049-5258

VL - 36

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

Y2 - 10 December 2023 through 16 December 2023

ER -

PointGPT: Auto-regressively Generative Pre-training from Point Clouds

Abstract

Other files and links

Fingerprint

Cite this