The Deep Dive into Context: Why Transformers Dominate CL

- Posted by Lê Thị Hồng Nhựt (HUIT02)

Computational Linguistics has fundamentally shifted from relying on discrete, symbolic rules to utilizing dense, statistical representations of language. The core driver of this shift is the need to capture context: the situational and surrounding textual information that determines meaning. Traditional pipelines (such as Naive Bayes classifiers, or earlier LSTM models fed static word embeddings) struggled with polysemy and ambiguity because each word had one fixed vector; "bank" received the same representation whether it referred to a riverbank or a financial institution.

The breakthrough came with the Transformer architecture, epitomized by models like BERT. These deep learning models employ an attention mechanism that computes each word's vector as a function of every other word in the input sequence. The result is a contextual embedding, which lets the model differentiate the meaning of "bank" accurately depending on how it is used. This capability to model fluid, long-range dependencies has dramatically improved performance on virtually every NLP task, from Machine Translation to Question Answering.

While these models offer unprecedented semantic richness, the challenge now lies in increasing their explainability (XAI) and ensuring that the vast, opaque context they learn is free from harmful societal bias.
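
To make the contrast between static and contextual embeddings concrete, here is a minimal sketch, assuming the Hugging Face `transformers` library and PyTorch are installed and using the publicly available `bert-base-uncased` checkpoint. It extracts the vector BERT assigns to "bank" in two different sentences and compares them: a static embedding would make the two vectors identical (cosine similarity of 1.0), while BERT's contextual vectors diverge because the attention mechanism mixes in the surrounding words.

```python
# Sketch: same word, different contextual embeddings (assumes transformers + torch are installed)
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def bank_vector(sentence: str) -> torch.Tensor:
    """Return the contextual embedding of the token 'bank' in the sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # last_hidden_state: (batch, seq_len, hidden_dim); take the first (only) sentence
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index("bank")]

v_river = bank_vector("We sat on the bank of the river.")
v_money = bank_vector("She deposited the check at the bank.")

# A static embedding table would give identical vectors here; BERT does not.
similarity = torch.cosine_similarity(v_river, v_money, dim=0)
print(f"cosine similarity between the two 'bank' vectors: {similarity.item():.3f}")
```

The printed similarity is noticeably below 1.0, which is exactly the behavior the paragraph above describes: the model's representation of "bank" depends on the sentence it appears in, not on a fixed dictionary entry.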