Abstract
Subsampling aims to select a subsample that efficiently captures the information in the original data that matters for statistical inference. It provides a powerful tool for big data analysis and has gained the attention of data scientists in recent years. This review summarizes some state-of-the-art subsampling methods inspired by statistical design. Three types of design, namely optimal design, orthogonal design, and space-filling design, have shown great potential in subsampling for different objectives. The relationships between experimental designs and the related subsampling approaches are discussed. Specifically, two major families of design-inspired subsampling techniques are presented. The first aims to select a subsample according to some optimal design criterion. The second tries to find a subsample that meets certain design requirements, including balancing, orthogonality, and uniformity. Simulated and real data examples are provided to compare these methods empirically.
| Original language | English |
|---|---|
| Pages (from-to) | 467-510 |
| Number of pages | 44 |
| Journal | Statistical Papers |
| Volume | 65 |
| Issue | 2 |
| DOI | |
| Publication status | Published - April 2024 |
| Title | A review on design inspired subsampling for big data |