Abstract
Crowdsourcing delivers responses that are asynchronous and incomplete, making offline aggregators that assume complete response sets impractical. Prior online methods often either require per-step completeness or repeatedly reload historical responses, which is storage- and privacy-unfriendly and susceptible to forgetting. We present OLA-Incomplete, an online label-aggregation framework designed for incomplete response streams. It integrates a variational-inference aggregator with a generative replay module that preserves historical information without reloading prior responses and explicitly models unknown worker reliability. At each update step, the generator replays cumulative responses and side information for previously observed instances to mitigate catastrophic forgetting, while the aggregator infers current truths by maximizing the evidence lower bound over a mixture of replayed and newly received labels. Across three public datasets—Duck, RTE, and PostSent—OLA-Incomplete attains final accuracies of 90.74%, 92.50%, and 95.99%, respectively, delivering at least 7.79% relative improvement over the strongest baseline. The approach further exhibits strong instantaneous online accuracy and robustness across response-chunk sizes and arrival orders, underscoring its practical utility for real-world crowdsourcing workflows.
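The streaming setting described in the abstract, where responses arrive in incomplete chunks and raw history is not reloaded, can be illustrated with a deliberately simplified sketch. This is not OLA-Incomplete: it swaps the generative replay module and the variational-inference aggregator for running summaries under a one-coin worker model on binary labels. The class name `StreamingOneCoinAggregator`, its methods, the prior pseudo-counts, and the example identifiers are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


class StreamingOneCoinAggregator:
    """Toy online aggregator for binary labels (hypothetical, NOT OLA-Incomplete)."""

    def __init__(self, prior_correct=2.0, prior_wrong=1.0):
        # Per-worker pseudo-counts: [votes judged correct, votes judged wrong].
        # The default prior (an assumption) starts every worker at 2/3 accuracy.
        self.counts = defaultdict(lambda: [prior_correct, prior_wrong])
        # Cumulative log-odds that each instance's true label is 1.
        self.log_odds = defaultdict(float)

    def accuracy(self, worker):
        correct, wrong = self.counts[worker]
        return correct / (correct + wrong)

    def update(self, chunk):
        """Consume one chunk of (instance, worker, label) triples, label in {0, 1}."""
        # Step 1: fold the new votes into the per-instance log-odds,
        # weighting each vote by the worker's current estimated accuracy.
        for instance, worker, label in chunk:
            acc = self.accuracy(worker)
            weight = math.log(acc / (1.0 - acc))
            self.log_odds[instance] += weight if label == 1 else -weight
        # Step 2: credit each worker with the probability that the vote was right.
        for instance, worker, label in chunk:
            p_one = 1.0 / (1.0 + math.exp(-self.log_odds[instance]))
            p_agree = p_one if label == 1 else 1.0 - p_one
            self.counts[worker][0] += p_agree
            self.counts[worker][1] += 1.0 - p_agree

    def predict(self, instance):
        return 1 if self.log_odds[instance] >= 0.0 else 0


# Two chunks; no worker labels every instance within a single chunk.
agg = StreamingOneCoinAggregator()
agg.update([("x1", "w1", 1), ("x1", "w2", 1), ("x2", "w1", 0)])
agg.update([("x1", "w3", 0), ("x2", "w2", 0), ("x2", "w3", 1)])
print(agg.predict("x1"), agg.predict("x2"))
```

Only the per-instance log-odds and per-worker counts persist between calls to `update`, so earlier responses are never stored or reloaded. That mirrors the constraint the abstract describes, but it does not reproduce the replay mechanism or the evidence-lower-bound objective of the published method.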
| Original language | English |
|---|---|
| Article number | 76 |
| Journal | Journal of King Saud University - Computer and Information Sciences |
| Volume | 38 |
| Issue number | 2 |
| Publication status | Published - Mar 2026 |
Keywords
- Crowdsourcing
- Generative replay
- Incomplete response
- Online label aggregation
Fingerprint
Dive into the research topics of 'Online label aggregation with incomplete crowd responses'. Together they form a unique fingerprint.