TY - JOUR
T1 - A new digital paper search paradigm based on FCA
AU - Yu, Haibin
AU - Shi, Chongyang
AU - Yu, Bai
AU - Zhang, Chunxia
AU - Hearne, Ryan
N1 - Publisher Copyright: © 2018 Taiwan Academic Network Management Committee. All rights reserved.
PY - 2018
Y1 - 2018
N2 - This paper proposes a new digital paper search paradigm that controls the diversity of keyword-based search query topics based on Formal Concept Analysis (FCA). During pre-querying, papers are assigned to pre-specified, lattice-based context patterns built by a selected partial dataset, and query-independent lattice context scores are attached to papers with respect to the assigned lattice contexts. When a query is executed, the relevant lattice contexts are selected, a search is performed within the selected lattice contexts, the context scores of the papers are revised to become relevancy scores with respect to the query and the lattice context they are in, and the query outputs are ranked within each relevant lattice context. In this way, we (1) provide FCA with a path to deal with middling or larger amounts of documents, (2) minimize query output topic diversity and reduce query output size, (3) decrease the user’s time spent scanning query results, and (4) increase query output ranking accuracy. Using China National Knowledge Infrastructure (CNKI) publications as the testbed, our experiments indicate that the proposed lattice context-based search approach produces search results with up to 50% higher precision, and reduces the query output size by up to 60% more than a CNKI search.
KW - Concept lattice
KW - Formal concept analysis
KW - Paper retrieval
KW - Query context
UR - http://www.scopus.com/inward/record.url?scp=85052015429&partnerID=8YFLogxK
U2 - 10.3966/160792642018081904013
DO - 10.3966/160792642018081904013
M3 - Article
AN - SCOPUS:85052015429
SN - 1607-9264
VL - 19
SP - 1099
EP - 1110
JO - Journal of Internet Technology
JF - Journal of Internet Technology
IS - 4
ER -