Unravelling the semantic mysteries of transformers layer by layer

  • Cheng Zhang
  • Jinxin Lv
  • Jingxu Cao
  • Jiachuan Sheng*
  • Dawei Song
  • Tiancheng Zhang

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Despite the significant success of Transformer models and their successors in various natural language processing (NLP) applications, their internal workings are still not fully understood. Much of the current interpretability research has focused primarily on numerical components, often missing the complex semantic layers within these models. To fill this gap, this study explores the interpretability of the Transformer model, a cornerstone of modern NLP, by addressing the semantic complexities of its multi-layer architecture. We identify three key questions: (i) the influence of the multi-layer structure on semantic processing, (ii) the unique contributions of each layer to model performance, and (iii) methodologies for determining optimal layer counts for the encoder and decoder. To tackle these issues, we introduce the Semantic Interpreter for Transformer Hierarchy, an innovative framework that employs convex hull metrics to visualize and assess semantic quality and quantity. Our contributions include novel methods for semantic assessment, a dual analytical framework that integrates cumulative and layer-to-layer perspectives, and insights into the dynamics of encoding and decoding. This comprehensive approach aims to enhance the understanding of Transformer models, ultimately guiding their refinement for improved interpretability and effectiveness in natural language tasks.
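To give a concrete flavor of what a convex-hull metric over layer representations might look like, here is a minimal, hypothetical sketch. It is not the authors' implementation: it assumes token embeddings have already been reduced to 2D (e.g. by PCA), and it measures the area of their convex hull as a rough proxy for the "spread" of semantic content at a given layer. The embedding values and layer structure below are invented for illustration.

```python
def convex_hull_area(points):
    """Area of the 2D convex hull of `points`.

    Uses the monotone-chain algorithm to find the hull,
    then the shoelace formula for its area.
    """
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0  # degenerate: a point or segment has zero area

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); >0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    hull = lower[:-1] + upper[:-1]  # counter-clockwise hull vertices

    # Shoelace formula over the hull polygon
    area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


# Hypothetical 2D-projected token embeddings at two layers
layer_embeddings = {
    0: [(0.0, 0.0), (1.0, 0.0), (0.5, 0.2)],
    1: [(0.0, 0.0), (2.0, 0.0), (1.0, 1.5), (0.2, 1.0)],
}
for layer, pts in layer_embeddings.items():
    print(f"layer {layer}: hull area = {convex_hull_area(pts):.3f}")
```

Comparing such areas across layers (cumulatively, or layer-to-layer) is one plausible way to quantify how the representational "volume" evolves through the encoder and decoder stacks; the paper's actual metrics may differ.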

Original language: English
Pages (from-to): 1237-1251
Number of pages: 15
Journal: Computer Journal
Volume: 68
Issue number: 9
Publication status: Published - 1 Sept 2025
Externally published: Yes
