TY - GEN
T1 - ForestCast
T2 - 30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025
AU - Yu, Zi
AU - Wang, Shaoxiang
AU - Li, Guozheng
AU - Zhang, Yu
AU - Liu, Chi Harold
N1 - Publisher Copyright:
©2025 Association for Computational Linguistics.
PY - 2025
Y1 - 2025
N2 - Open-ended event forecasting (OEEF) seeks to predict future events from a given context without being restricted to a predefined scope or format. It plays a crucial role in domains such as risk management and financial decision making. Although large language models show potential for OEEF, existing approaches and datasets often overlook the complex relationships among events, and current research lacks comprehensive evaluation methods. To address these limitations, we propose ForestCast, a prediction pipeline that extracts forecast-relevant events from news data, organizes them into a story tree, and predicts subsequent events along each path. The pipeline comprises four steps: (1) clustering news into event nodes, (2) constructing a news story tree, (3) mining the semantic structure of the tree, and (4) predicting the next event node and evaluating prediction quality. To support this pipeline, we construct NewsForest, a dataset of 12,406 event chains, each representing a chronologically and logically linked sequence of news events. In addition, we introduce a comprehensive evaluation framework that measures both the accuracy and the quality of prediction. Experimental results demonstrate that ForestCast improves the ability of LLMs to forecast events in news data.
AB - Open-ended event forecasting (OEEF) seeks to predict future events from a given context without being restricted to a predefined scope or format. It plays a crucial role in domains such as risk management and financial decision making. Although large language models show potential for OEEF, existing approaches and datasets often overlook the complex relationships among events, and current research lacks comprehensive evaluation methods. To address these limitations, we propose ForestCast, a prediction pipeline that extracts forecast-relevant events from news data, organizes them into a story tree, and predicts subsequent events along each path. The pipeline comprises four steps: (1) clustering news into event nodes, (2) constructing a news story tree, (3) mining the semantic structure of the tree, and (4) predicting the next event node and evaluating prediction quality. To support this pipeline, we construct NewsForest, a dataset of 12,406 event chains, each representing a chronologically and logically linked sequence of news events. In addition, we introduce a comprehensive evaluation framework that measures both the accuracy and the quality of prediction. Experimental results demonstrate that ForestCast improves the ability of LLMs to forecast events in news data.
UR - https://www.scopus.com/pages/publications/105028964932
U2 - 10.18653/v1/2025.findings-emnlp.678
DO - 10.18653/v1/2025.findings-emnlp.678
M3 - Conference contribution
AN - SCOPUS:105028964932
T3 - EMNLP 2025 - 2025 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2025
SP - 12667
EP - 12681
BT - EMNLP 2025 - 2025 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2025
A2 - Christodoulopoulos, Christos
A2 - Chakraborty, Tanmoy
A2 - Rose, Carolyn
A2 - Peng, Violet
PB - Association for Computational Linguistics (ACL)
Y2 - 4 November 2025 through 9 November 2025
ER -