TY - GEN
T1 - Integrated Parallel System for Audio Conferencing Voice Transcription and Speaker Identification
AU - Miao, Ke
AU - Biermann, Oloff
AU - Miao, Zhen
AU - Leung, Simon
AU - Wang, Jianhong
AU - Gai, Keke
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/5
Y1 - 2020/5
N2 - In response to a request from a well-known international financial corporation, an integrated system prototype was architected and implemented to automatically record corporate audio conferences, transcribe the recordings to text while identifying speakers, and compile the transcription and identification results into text-based meeting minutes, which are then sent as meeting summary email attachments and saved into a meeting management database. Three technology focuses of this integrated system are discussed in this paper: 1) selecting a third-party audio transcription and identification API (Audio API) through prototyping, factor comparison, and consideration of the existing technology environment at the corporation; 2) optimizing the adoption of the selected Audio API based on knowledge of Natural Language Processing (NLP) methods; and 3) supporting asynchronous scheduling and processing of concurrent meetings using parallel computing architecture methods. The completed system was evaluated and shown to meet all the requirements from the corporation and to perform well in intelligent audio language processing and multi-threaded parallel execution. With further enhancements, we foresee that this system solution has good commercial value and the potential to be widely adopted by other businesses.
AB - In response to a request from a well-known international financial corporation, an integrated system prototype was architected and implemented to automatically record corporate audio conferences, transcribe the recordings to text while identifying speakers, and compile the transcription and identification results into text-based meeting minutes, which are then sent as meeting summary email attachments and saved into a meeting management database. Three technology focuses of this integrated system are discussed in this paper: 1) selecting a third-party audio transcription and identification API (Audio API) through prototyping, factor comparison, and consideration of the existing technology environment at the corporation; 2) optimizing the adoption of the selected Audio API based on knowledge of Natural Language Processing (NLP) methods; and 3) supporting asynchronous scheduling and processing of concurrent meetings using parallel computing architecture methods. The completed system was evaluated and shown to meet all the requirements from the corporation and to perform well in intelligent audio language processing and multi-threaded parallel execution. With further enhancements, we foresee that this system solution has good commercial value and the potential to be widely adopted by other businesses.
KW - Integrated System
KW - Multi-Threading
KW - Natural Language Processing
KW - Parallel Computing
KW - Speaker Identification
KW - Speech Transcription
KW - Factor-Based Service Selection
UR - http://www.scopus.com/inward/record.url?scp=85092000682&partnerID=8YFLogxK
U2 - 10.1109/HPBDIS49115.2020.9130598
DO - 10.1109/HPBDIS49115.2020.9130598
M3 - Conference contribution
AN - SCOPUS:85092000682
T3 - 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020
BT - 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 International Conference on High Performance Big Data and Intelligent Systems, HPBD and IS 2020
Y2 - 23 May 2020
ER -