A new digital paper search paradigm based on FCA

Haibin Yu, Chongyang Shi*, Bai Yu, Chunxia Zhang, Ryan Hearne

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes a new digital paper search paradigm that uses Formal Concept Analysis (FCA) to control the topic diversity of keyword-based search queries. Before any query is issued, papers are assigned to pre-specified, lattice-based context patterns built from a selected subset of the dataset, and each paper receives a query-independent context score for every lattice context it is assigned to. When a query is executed, the relevant lattice contexts are selected, the search is performed within those contexts, the papers' context scores are revised into relevancy scores with respect to both the query and the lattice context they belong to, and the query outputs are ranked within each relevant lattice context. In this way, we (1) give FCA a path to handling medium-sized or larger document collections, (2) minimize query output topic diversity and reduce query output size, (3) decrease the time users spend scanning query results, and (4) increase query output ranking accuracy. Using China National Knowledge Infrastructure (CNKI) publications as the testbed, our experiments indicate that, compared with a CNKI search, the proposed lattice context-based search approach produces search results with up to 50% higher precision and reduces the query output size by up to 60%.
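The two-phase flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy data, the function names (`formal_concepts`, `build_contexts`, `search`), the rule that a context is selected when its intent contains every query keyword, and both scoring formulas are assumptions made only for illustration. The concept enumeration is the naive closure over all keyword subsets, which is practical only for very small inputs.

```python
from itertools import combinations

# Toy formal context (hypothetical data): paper id -> set of keywords.
PAPERS = {
    "p1": {"fca", "lattice", "retrieval"},
    "p2": {"fca", "lattice"},
    "p3": {"retrieval", "ranking"},
    "p4": {"fca", "retrieval", "ranking"},
}

def extent(intent, papers):
    """Papers that carry every keyword in `intent`."""
    return frozenset(p for p, kws in papers.items() if intent <= kws)

def common_keywords(ext, papers):
    """Keywords shared by every paper in `ext` (all keywords if `ext` is empty)."""
    if not ext:
        return frozenset().union(*papers.values())
    return frozenset.intersection(*(frozenset(papers[p]) for p in ext))

def formal_concepts(papers):
    """Naive FCA: close every keyword subset into an (extent, intent) pair."""
    keywords = sorted(set().union(*papers.values()))
    concepts = set()
    for size in range(len(keywords) + 1):
        for combo in combinations(keywords, size):
            ext = extent(frozenset(combo), papers)
            concepts.add((ext, common_keywords(ext, papers)))
    return concepts

def build_contexts(papers):
    """Pre-query phase: one lattice context per non-trivial concept."""
    contexts = {}
    for ext, intent in formal_concepts(papers):
        if not ext or not intent:
            continue  # skip the top and bottom concepts
        # Assumed query-independent score: share of the paper's keywords
        # that the context intent covers.
        contexts[intent] = {p: len(intent) / len(papers[p]) for p in ext}
    return contexts

def search(query, contexts, papers):
    """Query phase: select contexts, revise scores, rank within each context."""
    query = frozenset(query)
    results = {}
    for intent, scores in contexts.items():
        if not query <= intent:
            continue  # assumed selection rule: intent must contain the query
        revised = {}
        for paper, score in scores.items():
            kws = papers[paper]
            # Assumed revision: weight by Jaccard similarity to the query.
            revised[paper] = score * len(query & kws) / len(query | kws)
        results[intent] = sorted(revised.items(), key=lambda kv: (-kv[1], kv[0]))
    return results

contexts = build_contexts(PAPERS)
for intent, ranked in search({"lattice"}, contexts, PAPERS).items():
    print(sorted(intent), ranked)
```

Because results are grouped by context, a user scans one ranked list per selected context instead of a single mixed list, which is the effect the abstract attributes to the approach.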

Original language: English
Pages (from-to): 1099-1110
Number of pages: 12
Journal: Journal of Internet Technology
Volume: 19
Issue number: 4
DOIs: https://doi.org/10.3966/160792642018081904013
Publication status: Published - 2018

Keywords

  • Concept lattice
  • Formal concept analysis
  • Paper retrieval
  • Query context

Cite this

Yu, H., Shi, C., Yu, B., Zhang, C., & Hearne, R. (2018). A new digital paper search paradigm based on FCA. Journal of Internet Technology, 19(4), 1099-1110. https://doi.org/10.3966/160792642018081904013