Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification

Ke Miao; Oloff Biermann; Zhen Miao; Simon Leung; Jianhong Wang; Keke Gai

doi:10.1109/HPBDIS49115.2020.9130598

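School of Cyberspace Security

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract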

In response to a request from a well-known international financial corporation, an integrated system prototype was architected and implemented to automatically record corporate audio conferencing, transcribe the recordings to text while identifying speakers, and compile the transcription and identification results into text-based meeting minutes, which are then sent as meeting summary email attachments and saved into a meeting management database. Three technology focuses of this integrated system are discussed in this paper: 1) Selecting a third-party audio transcription and identification API (Audio API) through prototyping, factor comparison, and consideration of the existing technology environment at the corporation. 2) Optimizing the adoption of the selected Audio API based on knowledge of Natural Language Processing (NLP) methods. 3) Supporting asynchronous scheduling and processing of concurrent meetings using parallel computing architecture methods. The completed system was evaluated and shown to meet all the requirements from the corporation and to perform well in intelligent audio language processing and multi-threaded parallel execution. With further enhancements, we foresee that this system solution has good commercial value and the potential to be adopted widely by other businesses.

Original language	English
Title of host publication	2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781728165110
DOI	https://doi.org/10.1109/HPBDIS49115.2020.9130598
Publication status	Published - May 2020
Event	2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020 - Shenzhen, China. Duration: 23 May 2020 → …

Publication series

Name	2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020

Conference

Conference	2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020
Country/Territory	China
City	Shenzhen
Period	23/05/20 → …

Access to document

10.1109/HPBDIS49115.2020.9130598

Other files and links

Link to publication in Scopus

Cite this

Miao, K., Biermann, O., Miao, Z., Leung, S., Wang, J., & Gai, K. (2020). Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification. In 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020 Article 9130598 (2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/HPBDIS49115.2020.9130598

Miao, Ke ; Biermann, Oloff ; Miao, Zhen et al. / Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification. 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020. Institute of Electrical and Electronics Engineers Inc., 2020. (2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020).

@inproceedings{1f1c83b488e14c689439a596badd9ae1,

title = "Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification",

abstract = "In response to a request from a well-known international financial corporation, an integrated system prototype was architected and implemented to automatically record corporate audio conferencing, transcribe the recordings to text while identifying speakers, and compile the transcription and identification results into text-based meeting minutes, which are then sent as meeting summary email attachments and saved into a meeting management database. Three technology focuses of this integrated system are discussed in this paper: 1) Selecting a third-party audio transcription and identification API (Audio API) through prototyping, factor comparison, and consideration of the existing technology environment at the corporation. 2) Optimizing the adoption of the selected Audio API based on knowledge of Natural Language Processing (NLP) methods. 3) Supporting asynchronous scheduling and processing of concurrent meetings using parallel computing architecture methods. The completed system was evaluated and shown to meet all the requirements from the corporation and to perform well in intelligent audio language processing and multi-threaded parallel execution. With further enhancements, we foresee that this system solution has good commercial value and the potential to be adopted widely by other businesses.",

keywords = "Integrated System, Multi-Threading, Natural Language Processing, Parallel Computing, Speaker Identification, Speech Transcription, factor-based service selection",

author = "Ke Miao and Oloff Biermann and Zhen Miao and Simon Leung and Jianhong Wang and Keke Gai",

note = "Publisher Copyright: {\textcopyright} 2020 IEEE.; 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020 ; Conference date: 23-05-2020",

year = "2020",

month = may,

doi = "10.1109/HPBDIS49115.2020.9130598",

language = "English",

series = "2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020",

address = "United States",

}

Miao, K, Biermann, O, Miao, Z, Leung, S, Wang, J & Gai, K 2020, Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification. in 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020., 9130598, 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020, Institute of Electrical and Electronics Engineers Inc., 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020, Shenzhen, China, 23/05/20. https://doi.org/10.1109/HPBDIS49115.2020.9130598

Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification. / Miao, Ke; Biermann, Oloff; Miao, Zhen et al.
2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020. Institute of Electrical and Electronics Engineers Inc., 2020. 9130598 (2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020).

TY - GEN

T1 - Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification

AU - Miao, Ke

AU - Biermann, Oloff

AU - Miao, Zhen

AU - Leung, Simon

AU - Wang, Jianhong

AU - Gai, Keke

PY - 2020/5

Y1 - 2020/5

N2 - In response to a request from a well-known international financial corporation, an integrated system prototype was architected and implemented to automatically record corporate audio conferencing, transcribe the recordings to text while identifying speakers, and compile the transcription and identification results into text-based meeting minutes, which are then sent as meeting summary email attachments and saved into a meeting management database. Three technology focuses of this integrated system are discussed in this paper: 1) Selecting a third-party audio transcription and identification API (Audio API) through prototyping, factor comparison, and consideration of the existing technology environment at the corporation. 2) Optimizing the adoption of the selected Audio API based on knowledge of Natural Language Processing (NLP) methods. 3) Supporting asynchronous scheduling and processing of concurrent meetings using parallel computing architecture methods. The completed system was evaluated and shown to meet all the requirements from the corporation and to perform well in intelligent audio language processing and multi-threaded parallel execution. With further enhancements, we foresee that this system solution has good commercial value and the potential to be adopted widely by other businesses.

AB - In response to a request from a well-known international financial corporation, an integrated system prototype was architected and implemented to automatically record corporate audio conferencing, transcribe the recordings to text while identifying speakers, and compile the transcription and identification results into text-based meeting minutes, which are then sent as meeting summary email attachments and saved into a meeting management database. Three technology focuses of this integrated system are discussed in this paper: 1) Selecting a third-party audio transcription and identification API (Audio API) through prototyping, factor comparison, and consideration of the existing technology environment at the corporation. 2) Optimizing the adoption of the selected Audio API based on knowledge of Natural Language Processing (NLP) methods. 3) Supporting asynchronous scheduling and processing of concurrent meetings using parallel computing architecture methods. The completed system was evaluated and shown to meet all the requirements from the corporation and to perform well in intelligent audio language processing and multi-threaded parallel execution. With further enhancements, we foresee that this system solution has good commercial value and the potential to be adopted widely by other businesses.

KW - Integrated System

KW - Multi-Threading

KW - Natural Language Processing

KW - Parallel Computing

KW - Speaker Identification

KW - Speech Transcription

KW - factor-based service selection

UR - http://www.scopus.com/inward/record.url?scp=85092000682&partnerID=8YFLogxK

U2 - 10.1109/HPBDIS49115.2020.9130598

DO - 10.1109/HPBDIS49115.2020.9130598

M3 - Conference contribution

AN - SCOPUS:85092000682

T3 - 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020

BT - 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020

Y2 - 23 May 2020

ER -

Miao K, Biermann O, Miao Z, Leung S, Wang J, Gai K. Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification. In 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020. Institute of Electrical and Electronics Engineers Inc. 2020. 9130598. (2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020). doi: 10.1109/HPBDIS49115.2020.9130598
