I’m currently struggling to understand the differences between probabilistic models like Hidden Markov Models (HMMs) and newer neural approaches such as Transformers for sequence prediction tasks. I get the basic idea: HMMs rely on explicit probabilities and assumptions like the Markov property (the next hidden state depends only on the current one), while neural models learn patterns from large datasets. But I’m still unclear about their practical differences.
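To check that my mental model is right, here is a toy sketch of how I picture an HMM scoring a tag sequence. Everything here (the tags, the transition and emission tables, the probabilities) is made up for illustration, not taken from any real corpus:

```python
# Toy HMM: score a POS-tag sequence for a short sentence.
# All probabilities below are invented for illustration.

start = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}

trans = {  # P(next tag | current tag) -- first-order Markov assumption
    "DET":  {"NOUN": 0.9, "VERB": 0.05, "DET": 0.05},
    "NOUN": {"VERB": 0.6, "NOUN": 0.3, "DET": 0.1},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}

emit = {  # P(word | tag)
    "DET":  {"the": 0.7, "a": 0.3},
    "NOUN": {"dog": 0.4, "cat": 0.4, "runs": 0.2},
    "VERB": {"runs": 0.6, "barks": 0.4},
}

def sequence_prob(words, tags):
    """Joint probability P(words, tags) under the HMM factorization."""
    p = start[tags[0]] * emit[tags[0]][words[0]]
    for i in range(1, len(words)):
        # Each tag depends ONLY on the previous tag (Markov property),
        # and each word depends ONLY on its own tag.
        p *= trans[tags[i - 1]][tags[i]] * emit[tags[i]][words[i]]
    return p

print(sequence_prob(["the", "dog", "runs"], ["DET", "NOUN", "VERB"]))
```

Is this factorization (start × transitions × emissions) the right way to think about what an HMM "knows" at each step?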
For example, I don’t fully understand when it is more appropriate to use a traditional model like an HMM instead of a neural model. Are HMMs still useful today, or have they mostly been replaced by deep learning methods? I also find it confusing how these models handle long-range dependencies in language: since an HMM’s next state depends only on the current one, it seems unable to carry context across a whole sentence the way a Transformer’s attention mechanism can.
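Here is my rough understanding of why attention sidesteps that one-step bottleneck, as a minimal NumPy sketch. The embeddings and weight matrices are random placeholders (single head, no training), so this only illustrates the connectivity pattern, not a real model:

```python
import numpy as np

# Toy single-head self-attention over a 6-token sequence.
# Weights and embeddings are random -- the point is only that every
# position can attend DIRECTLY to every other position, however far
# apart, unlike an HMM's one-step transitions.

rng = np.random.default_rng(0)
seq_len, d = 6, 8
x = rng.normal(size=(seq_len, d))            # stand-in token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(d)                # (seq_len, seq_len): all pairs
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
out = weights @ v

# Row i of `weights` shows how much token i draws from EVERY other token,
# including the very first one -- there is no fixed Markov horizon.
print(weights[5].round(3))
```

Is this the right way to understand the contrast, i.e. the HMM’s context is bounded by its state, while attention gives direct access to any earlier position?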
I’d like to better understand the strengths and limitations of each approach, especially in real-world NLP tasks.
