Is Least-Squares Inaccurate in Fitting Power-Law Distributions? The Criticism is Complete Nonsense

Xiaoshi Zhong*, Muyin Wang, Hongkun Zhang

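*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding, conference contribution, peer-reviewed

3 citations (Scopus)

Abstract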

By the Gauss-Markov theorem, ordinary least-squares estimation is the best linear unbiased estimator. Over the last two decades, however, some researchers have argued that least-squares is substantially inaccurate in fitting power-law distributions, and this criticism has created a strong bias in the research community. In this paper, we conduct extensive experiments to show that such criticism is complete nonsense. Specifically, we sample discrete and continuous data of different sizes from power-law models, showing that even though the long-tailed noises are sampled from power-law models, they cannot be treated as power-law data. We define the correct way to bin continuous power-law data into data points and propose an average strategy for least-squares to fit power-law distributions. Experiments on both simulated and real-world data show that our proposed method fits power-law data perfectly. We uncover a fundamental flaw in the popular method proposed by Clauset et al. [12]: it tends to discard the majority of power-law data and fit the long-tailed noises. Experiments also show that the reverse cumulative distribution function is a poor choice for plotting power-law data in practice because it usually hides the true probability distribution of the data. We hope that our research can clear up the bias against using least-squares to fit power-law distributions. Source code can be found at https://github.com/xszhong/LSavg.
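
As a rough illustration of the kind of fit the abstract discusses, the sketch below samples continuous data from a power-law model, groups it into logarithmic bins, and runs ordinary least-squares on the log-log points. It is not the authors' averaging method (the repository linked above is the place to look for that). The function names, the bin ratio of 2, and the `min_count` cutoff for sparse tail bins are all assumptions made for this example.

```python
import math
import random


def sample_power_law(n, alpha, xmin=1.0, seed=0):
    """Draw n samples from p(x) ~ x^(-alpha) on x >= xmin (inverse-transform sampling)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so every sample is >= xmin.
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def log_binned_density(samples, xmin=1.0, ratio=2.0, min_count=20):
    """Return (bin center, empirical density) points over logarithmic bins.

    Bin k covers [xmin * ratio**k, xmin * ratio**(k + 1)). Bins holding fewer
    than min_count samples are skipped; this cutoff is an illustrative
    assumption for handling the noisy tail, not a rule from the paper.
    """
    counts = {}
    for x in samples:
        k = int(math.log(x / xmin) / math.log(ratio))
        counts[k] = counts.get(k, 0) + 1
    n = len(samples)
    points = []
    for k in sorted(counts):
        if counts[k] < min_count:
            continue
        lo = xmin * ratio ** k
        hi = lo * ratio
        density = counts[k] / (n * (hi - lo))  # normalize by sample size and bin width
        points.append((math.sqrt(lo * hi), density))  # geometric bin center
    return points


def fit_loglog(points):
    """Ordinary least-squares fit of log(density) against log(x); returns (slope, intercept)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


if __name__ == "__main__":
    samples = sample_power_law(100_000, alpha=2.5)
    slope, intercept = fit_loglog(log_binned_density(samples))
    print(f"estimated exponent: {-slope:.3f}")
```

With logarithmic bins and geometric centers, the expected density at each center is proportional to the center raised to the power of minus alpha, so the fitted slope should approximate the negative of the exponent when the retained bins are well populated.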

Original language: English
Title of host publication: WWW 2022 - Proceedings of the ACM Web Conference 2022
Publisher: Association for Computing Machinery, Inc
Pages: 2748-2758
Number of pages: 11
ISBN (electronic): 9781450390965
Publication status: Published - 25 Apr 2022
Event: 31st ACM World Wide Web Conference, WWW 2022 - Virtual, Online, France
Duration: 25 Apr 2022 to 29 Apr 2022

Publication series

Name: WWW 2022 - Proceedings of the ACM Web Conference 2022

Conference

Conference: 31st ACM World Wide Web Conference, WWW 2022
Country/Territory: France
Virtual, Online
Period: 25/04/22 to 29/04/22
