Identifying Technological Topic Changes in Patent Claims Using Topic Modeling

Hongshu Chen; Yi Zhang; Donghua Zhu

doi:10.1007/978-3-319-39056-7_11

Hongshu Chen^*, Yi Zhang, Donghua Zhu

^*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

4 Citations (Scopus)

Abstract

Patent claims usually embody the core technological scope and the most essential terms defining the protection of an invention, which makes them an ideal resource for patent topic identification and theme change analysis. However, manually conducting content analysis on massive numbers of technical terms is time-consuming and laborious. Even with the help of traditional text mining techniques, it is still difficult to model topic changes over time, because single keywords alone are usually too general or ambiguous to represent a concept. Moreover, term frequency, which is used to rank keywords, cannot separate polysemous words that actually describe different concepts. To address these issues, this research proposes a topic change identification approach based on latent Dirichlet allocation (LDA) to model and analyze topic changes and topic-based trends with minimal human intervention. After textual data cleaning, the underlying semantic topics hidden in large archives of patent claims are revealed automatically. Topics are defined by probability distributions over words rather than by terms and their frequencies, so that polysemy is accommodated. A case study using patents published by the United States Patent and Trademark Office (USPTO) from 2009 to 2013 with Australia as their assignee country is presented to demonstrate the validity of the proposed topic change identification approach. The experimental results show that the proposed approach can be used as an automatic tool to provide machine-identified topic changes for more efficient and effective R&D management assistance.
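As an illustration of the idea that topics are probability distributions over words, the following is a minimal sketch of latent Dirichlet allocation fitted by collapsed Gibbs sampling, followed by a per-year average of document-topic proportions. It is not the authors' implementation: the abstract does not state the inference method, the hyperparameters, or how topic changes are measured, so the sampler, the values of `alpha` and `beta`, the yearly averaging, and the four toy claim texts are all assumptions made for this example.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (hyperparameters are assumed values).

    docs is a list of token lists. Returns (vocab, topic_word, doc_topic),
    where each row of topic_word and doc_topic is a probability distribution.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    n_dk = [[0] * num_topics for _ in docs]      # tokens in doc d assigned to topic k
    n_kw = [[0] * V for _ in range(num_topics)]  # times word v is assigned to topic k
    n_k = [0] * num_topics                       # total tokens assigned to topic k
    z = []                                       # current topic of every token

    # Random initial assignment of a topic to each token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed point estimates of the topic-word and document-topic distributions.
    topic_word = [
        [(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
        for k in range(num_topics)
    ]
    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha) for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return vocab, topic_word, doc_topic


def yearly_topic_share(years, doc_topic):
    """Average the document-topic proportions of all documents in each year."""
    counts = Counter(years)
    totals = {}
    for year, theta in zip(years, doc_topic):
        acc = totals.setdefault(year, [0.0] * len(theta))
        for k, p in enumerate(theta):
            acc[k] += p
    return {year: [s / counts[year] for s in acc] for year, acc in totals.items()}


# Invented toy "claims" for illustration only; not data from the case study.
claims = [
    (2009, "wireless signal antenna transmit signal receiver"),
    (2009, "antenna receiver wireless channel signal"),
    (2013, "cell protein antibody sequence cell"),
    (2013, "antibody protein sequence binding cell"),
]
years = [year for year, _ in claims]
docs = [text.split() for _, text in claims]

vocab, topic_word, doc_topic = lda_gibbs(docs, num_topics=2)
shares = yearly_topic_share(years, doc_topic)

for k, row in enumerate(topic_word):
    top_words = sorted(zip(row, vocab), reverse=True)[:3]
    print(f"topic {k}:", [w for _, w in top_words])
for year in sorted(shares):
    print(year, [round(p, 3) for p in shares[year]])
```

Because every word has some probability under every topic, the same word can contribute to more than one topic, which is how a model of this kind can accommodate polysemy. Comparing the averaged shares between years gives one simple, assumed way to read a shift in topic prominence over time.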

Original language	English
Title of host publication	Innovation, Technology and Knowledge Management
Publisher	Springer
Pages	187-209
Number of pages	23
DOIs	https://doi.org/10.1007/978-3-319-39056-7_11
Publication status	Published - 2016

Publication series

Name	Innovation, Technology and Knowledge Management
ISSN (Print)	2197-5698
ISSN (Electronic)	2197-5701

Keywords

Patent analysis
Tech mining
Topic modeling

Cite this

Chen, H., Zhang, Y., & Zhu, D. (2016). Identifying Technological Topic Changes in Patent Claims Using Topic Modeling. In Innovation, Technology and Knowledge Management (pp. 187-209). (Innovation, Technology and Knowledge Management). Springer. https://doi.org/10.1007/978-3-319-39056-7_11

@inbook{5669e1ab66e340f494a15f8bb4f8668e,

title = "Identifying Technological Topic Changes in Patent Claims Using Topic Modeling",

keywords = "Patent analysis, Tech mining, Topic modeling",

author = "Hongshu Chen and Yi Zhang and Donghua Zhu",

note = "Publisher Copyright: {\textcopyright} 2016, Springer International Publishing Switzerland.",

year = "2016",

doi = "10.1007/978-3-319-39056-7_11",

language = "English",

series = "Innovation, Technology and Knowledge Management",

publisher = "Springer",

pages = "187--209",

booktitle = "Innovation, Technology and Knowledge Management",

address = "Germany",

}

TY - CHAP

T1 - Identifying Technological Topic Changes in Patent Claims Using Topic Modeling

AU - Chen, Hongshu

AU - Zhang, Yi

AU - Zhu, Donghua

PY - 2016

Y1 - 2016

KW - Patent analysis

KW - Tech mining

KW - Topic modeling

UR - http://www.scopus.com/inward/record.url?scp=85018789889&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-39056-7_11

DO - 10.1007/978-3-319-39056-7_11

M3 - Chapter

AN - SCOPUS:85018789889

T3 - Innovation, Technology and Knowledge Management

SP - 187

EP - 209

BT - Innovation, Technology and Knowledge Management

PB - Springer

ER -
