Name: Chai Chengliang
Discipline: Computer Science and Technology
Title: Pre-appointed Associate Professor (Special Researcher), Doctoral Supervisor
Contact number:
E-mail: ccl@bit.edu.cn
Address:
Personal Information
Chai Chengliang is a pre-appointed Associate Professor (Special Researcher) and doctoral supervisor at the School of Computer Science, Beijing Institute of Technology, and a winner of the CCF Outstanding Doctoral Thesis Award. He received his bachelor's degree from Harbin Institute of Technology in 2015 and his doctoral degree from Tsinghua University in 2020 and 2022. He has published nearly 40 CCF Class A papers in venues including SIGMOD, VLDB, ICDE, KDD, TKDE and VLDBJ. His awards include the CCF Outstanding Doctoral Thesis Award (top 10 in China), the ACM China Outstanding Doctoral Thesis Award (top 2 in China), the Forbes China 30 Under 30 list and the Baidu Scholarship (top 10 worldwide), among others. His academic service includes: contributing editor of JCST, a high-level international SCI journal; program committee member of KDD, ICDE, VLDB, AAAI, ICDCS and other top international conferences; Academic Director of the CCF Frontiers Workshop; and Executive Member of the China Database Committee. He has given 3-hour tutorials at the top international conferences SIGMOD 2021, KDD 2018 and ICDE 2019.
Research Direction 1 - Data-centric AI: In the era of artificial intelligence, algorithms, computing power and data are the three indispensable elements. Existing research focuses mainly on machine learning algorithms, but data is also critical. This direction studies how to improve model performance from the data perspective, including AI-oriented data discovery, data cleaning, data fusion, data annotation and data lineage.
Research Direction 2 - Data Lake Systems: In the era of multi-source heterogeneous big data, data lakes are widely used because they can efficiently store diverse data in its original format, and the stored data can effectively support data analysis and AI algorithms. This direction studies how to index data in a data lake, how to retrieve data efficiently to support AI, and how to build Lakehouse systems that support both data warehouses and data lakes.
Update (2023.08): One position is open for 2024 recruitment.
Research Direction
Artificial Intelligence, Data Science, Data Lakes, Database Systems
Representative Publications
* denotes corresponding author
[1] Chengliang Chai, Nan Tang, Ju Fan, Yuyu Luo. Demystifying Artificial Intelligence for Data Preparation. SIGMOD 2023 (CCF A).
[2] Chengliang Chai, Jiabin Liu, Nan Tang, Guoliang Li. Selective Data Acquisition in the Wild for Model Charging. VLDB 2022 (CCF A).
[3] Chengliang Chai, Jiayi Wang, Yuyu Luo, Zeping Niu, Guoliang Li. Data Management for Machine Learning: A Survey (CCF A).
[4] Chengliang Chai, Guoliang Li, Ju Fan, et al. CrowdChart: Crowdsourced Data Extraction from Visualization Charts. TKDE 2021 (CCF A).
[5] Chengliang Chai, Lei Cao, Jian Li, Guoliang Li, Yuyu Luo, Samuel Madden. Human-in-the-loop Outlier Detection. SIGMOD 2020 (CCF A).
[6] Chengliang Chai, Ju Fan, Guoliang Li. Incentive-Based Entity Collection Using Crowdsourcing. ICDE 2018 (CCF A).
[7] Chengliang Chai, Guoliang Li, Jian Li, et al. A Partial-order-based Framework for Cost-effective Crowdsourced Entity Resolution. VLDB Journal 2018 (CCF A).
[8] Chengliang Chai, Guoliang Li, Jian Li, et al. Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach. SIGMOD 2016 (CCF A).
[9] Chengliang Chai, Guoliang Li, Ju Fan, Yuyu Luo. Crowdsourcing Data Extraction from Visualization Chart. ICDE 2020 (CCF A).
[10] Chengliang Chai, Ju Fan, Guoliang Li, Jiannan Wang, Yudian Zheng. Crowdsourcing Database Systems: Overview and Challenges. ICDE 2019 (CCF A).
[11] Jiayi Wang, Chengliang Chai*, Nan Tang, Jiabin Liu, Guoliang Li. Coresets over Multiple Tables for Feature-rich and Data-efficient Machine Learning. VLDB 2023 (CCF A).
[12] Yue Han, Chengliang Chai*, Jiabin Liu, Guoliang Li, Chuangxian Wei, Chaoqun Zhan. Dynamic Materialized View Management using Graph Neural Network. ICDE 2023 (CCF A).
[13] Lixi Zhang, Chengliang Chai*, Xuanhe Zhou, Guoliang Li. LearnedSQLGen: Constraint-aware SQL Generation using Reinforcement Learning. SIGMOD 2022 (CCF A).
[14] Xiang Yu, Chengliang Chai*, Guoliang Li, Jiabin Liu. Cost-based or Learning-based? A Hybrid Query Optimizer for Query Plan Selection. VLDB 2022 (CCF A).
[15] Jiayi Wang, Chengliang Chai*, Jiabin Liu, Guoliang Li. FACE: A Normalizing Flow based Cardinality Estimator. VLDB 2022 (CCF A).
[16] Xuedi Qin, Chengliang Chai*, Nan Tang, Jian Li, Yuyu Luo, Guoliang Li, Yaoyu Zhu. Synthesizing Entity Resolution Datasets. ICDE 2022 (CCF A).
[17] Jiabin Liu, Chengliang Chai*, Yuyu Luo, Yin Lou, Jianhua Feng, Nan Tang. Feature Augmentation with Reinforcement Learning. ICDE 2022 (CCF A).
[18] Haowen Dong, Chengliang Chai*, Yuyu Luo, Jiabin Liu, Guoliang Li. RW-tree: A Learned Workload-aware Framework for R-tree Construction. ICDE 2022 (CCF A).
[19] Xuedi Qin, Chengliang Chai*, Yuyu Luo, Tianyu Zhao, Nan Tang, Guoliang Li, Xiang Yu, Mourad Ouzzani. Interactively Discovering and Ranking Desired Tuples by Data Exploration. VLDBJ 2021 (CCF A).
[20] Jiabin Liu, Fu Zhu, Chengliang Chai*, Yuyu Luo, Nan Tang. Automatic Data Acquisition for Deep Learning. VLDB 2021 (CCF A).
[21] Xuedi Qin, Chengliang Chai*, Yuyu Luo, Nan Tang, Guoliang Li. Ranking Desired Tuples by Database Exploration. ICDE 2021 (CCF A).
[22] Xuanhe Zhou, Chengliang Chai*, Guoliang Li, Ji Sun. DB Meets AI: A Survey. TKDE 2020 (CCF A).
[23] Yuyu Luo, Xuedi Qin, Chengliang Chai*, Nan Tang, Guoliang Li. Steerable Self-driving Data Visualization. TKDE 2020 (CCF A).
[24] Xuanhe Zhou, Chengliang Chai*, Guoliang Li, Ji Sun. Database Meets Artificial Intelligence: A Survey. TKDE 2020 (CCF A).
[25] Yuyu Luo, Chengliang Chai*, Xuedi Qin, Guoliang Li, Nan Tang. Interactive Cleaning for Progressive Visualization through Composite Questions. ICDE 2020 (CCF A).
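Awards
[1] CCF Outstanding Doctoral Thesis Award
[2] ACM China Outstanding Doctoral Thesis Award
[3] Forbes China 30 Under 30
[4] Postdoctoral Innovative Talent Support Program
[5] Baidu Scholarship
[6] State Grid Science and Technology Progress Award, First Prize
[7] Zhejiang Province Science and Technology Progress Award, Second Prize
[8] Zhejiang Lab International Young Talent Outstanding Achievement Award
[9] Tsinghua University Outstanding Postdoctoral Fellow
[10] Tsinghua University Outstanding Doctoral Graduate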