Explore the hierarchical structure of language, where words coalesce to form phrases and meaning emerges from structure.
Sentences aren't just strings of words; they are hierarchical structures. Words group into phrases, and phrases nest within larger constituents.
The central word determining the category. E.g., "Book" is the head of the Noun Phrase "The old book".
We have NP (Noun Phrase), VP (Verb Phrase), and PP (Prepositional Phrase).
👆 Click a node (S, NP, VP) to explore its function.
How do linguists prove a group of words is a constituent? We use diagnostic tests. Hover over the cards to see them in action.
Can the phrase be replaced by a single word (like a pronoun)?
Can the phrase move to a different position in the sentence?
Can it be linked with a similar phrase using 'and' or 'or'?
Can you visually isolate the constituent boundaries?
The evolution from rule-based algorithms to deep learning.
Like building a family tree in reverse. Starts from the root (S) and expands using grammar rules until words are matched.
Like a jigsaw puzzle. Starts with individual words ("Shift") and glues them into constituents ("Reduce").
The smart "memoization" method. Stores intermediate results in a chart to avoid re-analyzing the same phrase.
Traditional rules can be rigid. Probabilistic Context-Free Grammars (PCFGs) assign probabilities to rules, allowing the parser to choose the most likely interpretation among many ambiguities.
P(S -> NP VP) = 0.95
Recurrent Neural Networks process words sequentially, excellent for capturing patterns over time.
The modern standard. Uses Self-Attention mechanisms to analyze the entire sentence context simultaneously, handling long-range dependencies effortlessly.
Parsers extract "who did what to whom".
E.g., identifying that "Company A" is the acquirer and "Company B" is the target.
Decodes informal text. Handles hashtags, slang, and emojis by analyzing the underlying clause structure hidden in the noise.
Vital for reordering words. E.g., converting SVO (English) to SOV (Japanese) requires knowing where the constituents are.