Context-aware Transformer Model for Crowd Localization

Yiming Gong; Kan Li

doi:10.1109/CVIDLICCEA56201.2022.9824361

School of Computer Science

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

1 citation (Scopus)

Abstract

Because crowd density varies greatly in real scenes, detection-based methods are less reliable in crowded areas. Existing methods that apply detection-based transformer models to crowd localization are subject to the same constraint. The problem is even more pronounced in dense crowds, where the scene contains many small targets. To address this issue, our model employs a context-aware module to extract information fused across different scales, thereby handling potentially rapid scale changes, and uses a transformer to build an end-to-end crowd localization model. Extensive experiments show that our model adaptively learns contextual information for crowd localization and significantly outperforms previous advanced models.

Original language	English
Title of host publication	2022 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	199-202
Number of pages	4
ISBN (electronic)	9781665459112
DOI	https://doi.org/10.1109/CVIDLICCEA56201.2022.9824361
Publication status	Published - 2022
Event	3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022 - Virtual, Changchun, China. Duration: 20 May 2022 → 22 May 2022

Publication series

Name	2022 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022

Conference

Conference	3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022
Country/Territory	China
City	Virtual, Changchun
Period	20/05/22 → 22/05/22

Cite this

Gong, Y., & Li, K. (2022). Context-aware Transformer Model for Crowd Localization. In 2022 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022 (pp. 199-202). (2022 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CVIDLICCEA56201.2022.9824361

@inproceedings{67b131013309447dbb3d41b2b7527878,

title = "Context-aware Transformer Model for Crowd Localization",

abstract = "Because crowd density varies greatly in real scenes, detection-based methods are less reliable in crowded areas. Existing methods of applying detection-based transformer models to complete crowd localization are also subject to the same constraints. Moreover, there are many small targets in the scene of dense crowds, which is even more obvious. To address this issue, our model employs context-aware module to extract information that fuses different scales, thereby addressing the potential rapid scale change, and uses transformer to build an end-to-end crowd localization model. Extensive experiments show that our model adaptively learns contextual information for crowd localization, significantly outperforming previous more advanced models.",

keywords = "Crowd counting, Crowd localization, transformer",

author = "Yiming Gong and Kan Li",

note = "Publisher Copyright: {\textcopyright} 2022 IEEE.; 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022 ; Conference date: 20-05-2022 Through 22-05-2022",

year = "2022",

doi = "10.1109/CVIDLICCEA56201.2022.9824361",

language = "English",

series = "2022 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "199--202",

booktitle = "2022 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022",

address = "United States",

}

TY - GEN

T1 - Context-aware Transformer Model for Crowd Localization

AU - Gong, Yiming

AU - Li, Kan

PY - 2022

Y1 - 2022

N2 - Because crowd density varies greatly in real scenes, detection-based methods are less reliable in crowded areas. Existing methods of applying detection-based transformer models to complete crowd localization are also subject to the same constraints. Moreover, there are many small targets in the scene of dense crowds, which is even more obvious. To address this issue, our model employs context-aware module to extract information that fuses different scales, thereby addressing the potential rapid scale change, and uses transformer to build an end-to-end crowd localization model. Extensive experiments show that our model adaptively learns contextual information for crowd localization, significantly outperforming previous more advanced models.

AB - Because crowd density varies greatly in real scenes, detection-based methods are less reliable in crowded areas. Existing methods of applying detection-based transformer models to complete crowd localization are also subject to the same constraints. Moreover, there are many small targets in the scene of dense crowds, which is even more obvious. To address this issue, our model employs context-aware module to extract information that fuses different scales, thereby addressing the potential rapid scale change, and uses transformer to build an end-to-end crowd localization model. Extensive experiments show that our model adaptively learns contextual information for crowd localization, significantly outperforming previous more advanced models.

KW - Crowd counting

KW - Crowd localization

KW - transformer

UR - http://www.scopus.com/inward/record.url?scp=85135404594&partnerID=8YFLogxK

U2 - 10.1109/CVIDLICCEA56201.2022.9824361

DO - 10.1109/CVIDLICCEA56201.2022.9824361

M3 - Conference contribution

AN - SCOPUS:85135404594

T3 - 2022 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022

SP - 199

EP - 202

BT - 2022 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022

Y2 - 20 May 2022 through 22 May 2022

ER -
