Differentiable Multi-Granularity Human Parsing

Tianfei Zhou; Yi Yang; Wenguan Wang

doi:10.1109/TPAMI.2023.3239194

Differentiable Multi-Granularity Human Parsing

Tianfei Zhou, Yi Yang, Wenguan Wang^*

^*此作品的通讯作者

科研成果: 期刊稿件 › 文章 › 同行评审

31 引用（Scopus）

摘要

In this work, we study the challenging problem of instance-aware human body part parsing. We introduce a new bottom-up regime which achieves the task through learning category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. The output is a compact, efficient and powerful framework that exploits structural information over different human granularities and eases the difficulty of person partitioning. Specifically, a dense-to-sparse projection field, which allows explicitly associating dense human semantics with sparse keypoints, is learnt and progressively improved over the network feature pyramid for robustness. Then, the difficult pixel grouping problem is cast as an easier, multi-person joint assembling task. By formulating joint association as maximum-weight bipartite matching, we develop two novel algorithms based on projected gradient descent and unbalanced optimal transport, respectively, to solve the matching problem differentiablly. These algorithms make our method end-to-end trainable and allow back-propagating the grouping error to directly supervise multi-granularity human representation learning. This is significantly distinguished from current bottom-up human parsers or pose estimators which require sophisticated post-processing or heuristic greedy algorithms. Extensive experiments on three instance-aware human parsing datasets (i.e., MHP-v2, DensePose-COCO, PASCAL-Person-Part) demonstrate that our approach outperforms most existing human parsers with much more efficient inference. Our code is available at https://github.com/tfzhou/MG-HumanParsing.

源语言	英语
页（从-至）	8296-8310
页数	15
期刊	IEEE Transactions on Pattern Analysis and Machine Intelligence
卷	45
期	7
DOI	https://doi.org/10.1109/TPAMI.2023.3239194
出版状态	已出版 - 1 7月 2023
已对外发布	是

访问文件

10.1109/TPAMI.2023.3239194

其它文件与链接

链接到 Scopus 的出版物

引用此

@article{0ef8c96f4f0b456cb2ae8c456c0f9893,

title = "Differentiable Multi-Granularity Human Parsing",

abstract = "In this work, we study the challenging problem of instance-aware human body part parsing. We introduce a new bottom-up regime which achieves the task through learning category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. The output is a compact, efficient and powerful framework that exploits structural information over different human granularities and eases the difficulty of person partitioning. Specifically, a dense-to-sparse projection field, which allows explicitly associating dense human semantics with sparse keypoints, is learnt and progressively improved over the network feature pyramid for robustness. Then, the difficult pixel grouping problem is cast as an easier, multi-person joint assembling task. By formulating joint association as maximum-weight bipartite matching, we develop two novel algorithms based on projected gradient descent and unbalanced optimal transport, respectively, to solve the matching problem differentiablly. These algorithms make our method end-to-end trainable and allow back-propagating the grouping error to directly supervise multi-granularity human representation learning. This is significantly distinguished from current bottom-up human parsers or pose estimators which require sophisticated post-processing or heuristic greedy algorithms. Extensive experiments on three instance-aware human parsing datasets (i.e., MHP-v2, DensePose-COCO, PASCAL-Person-Part) demonstrate that our approach outperforms most existing human parsers with much more efficient inference. Our code is available at https://github.com/tfzhou/MG-HumanParsing.",

keywords = "Instance-aware human semantic parsing, multi-granularity human representation, multi-person pose estimation",

author = "Tianfei Zhou and Yi Yang and Wenguan Wang",

note = "Publisher Copyright: {\textcopyright} 1979-2012 IEEE.",

year = "2023",

month = jul,

day = "1",

doi = "10.1109/TPAMI.2023.3239194",

language = "English",

volume = "45",

pages = "8296--8310",

journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",

issn = "0162-8828",

publisher = "IEEE Computer Society",

number = "7",

}

TY - JOUR

T1 - Differentiable Multi-Granularity Human Parsing

AU - Zhou, Tianfei

AU - Yang, Yi

AU - Wang, Wenguan

PY - 2023/7/1

Y1 - 2023/7/1

N2 - In this work, we study the challenging problem of instance-aware human body part parsing. We introduce a new bottom-up regime which achieves the task through learning category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. The output is a compact, efficient and powerful framework that exploits structural information over different human granularities and eases the difficulty of person partitioning. Specifically, a dense-to-sparse projection field, which allows explicitly associating dense human semantics with sparse keypoints, is learnt and progressively improved over the network feature pyramid for robustness. Then, the difficult pixel grouping problem is cast as an easier, multi-person joint assembling task. By formulating joint association as maximum-weight bipartite matching, we develop two novel algorithms based on projected gradient descent and unbalanced optimal transport, respectively, to solve the matching problem differentiablly. These algorithms make our method end-to-end trainable and allow back-propagating the grouping error to directly supervise multi-granularity human representation learning. This is significantly distinguished from current bottom-up human parsers or pose estimators which require sophisticated post-processing or heuristic greedy algorithms. Extensive experiments on three instance-aware human parsing datasets (i.e., MHP-v2, DensePose-COCO, PASCAL-Person-Part) demonstrate that our approach outperforms most existing human parsers with much more efficient inference. Our code is available at https://github.com/tfzhou/MG-HumanParsing.

AB - In this work, we study the challenging problem of instance-aware human body part parsing. We introduce a new bottom-up regime which achieves the task through learning category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. The output is a compact, efficient and powerful framework that exploits structural information over different human granularities and eases the difficulty of person partitioning. Specifically, a dense-to-sparse projection field, which allows explicitly associating dense human semantics with sparse keypoints, is learnt and progressively improved over the network feature pyramid for robustness. Then, the difficult pixel grouping problem is cast as an easier, multi-person joint assembling task. By formulating joint association as maximum-weight bipartite matching, we develop two novel algorithms based on projected gradient descent and unbalanced optimal transport, respectively, to solve the matching problem differentiablly. These algorithms make our method end-to-end trainable and allow back-propagating the grouping error to directly supervise multi-granularity human representation learning. This is significantly distinguished from current bottom-up human parsers or pose estimators which require sophisticated post-processing or heuristic greedy algorithms. Extensive experiments on three instance-aware human parsing datasets (i.e., MHP-v2, DensePose-COCO, PASCAL-Person-Part) demonstrate that our approach outperforms most existing human parsers with much more efficient inference. Our code is available at https://github.com/tfzhou/MG-HumanParsing.

KW - Instance-aware human semantic parsing

KW - multi-granularity human representation

KW - multi-person pose estimation

UR - http://www.scopus.com/inward/record.url?scp=85148466195&partnerID=8YFLogxK

U2 - 10.1109/TPAMI.2023.3239194

DO - 10.1109/TPAMI.2023.3239194

M3 - Article

C2 - 37022259

AN - SCOPUS:85148466195

SN - 0162-8828

VL - 45

SP - 8296

EP - 8310

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

IS - 7

ER -

Differentiable Multi-Granularity Human Parsing

摘要

访问文件

其它文件与链接

指纹

引用此