Fillmore’s Dangerous Idea

He considered himself an “OWL”, just another Ordinary Working Linguist. To others, however, he was “a towering figure among linguists without writing a single book”. As a linguistics student, it is sadly possible to be unaware of his name, but it is impossible to work without his ideas, which have revolutionized our understanding of language and meaning in many ways. I am writing, of course, about Charles J. Fillmore, or “Chuck” as his friends and colleagues called him, a man who has earned his place in the pantheon of linguistics alongside the likes of Ferdinand de Saussure, Edward Sapir and Roman Jakobson. Still, one of Fillmore’s ideas has not yet reached its full potential in linguistics, a dangerous idea that changes everything…

Fillmore’s idea is dangerous because it dismantles most of what modern linguistics has come to believe since the Chomskyan revolution, which started at the end of the 1950s. During his acceptance speech for the ACL Lifetime Achievement Award in 2012, Fillmore vividly recalled how Noam Chomsky transformed the field of linguistics from a discipline of discovery procedures into a science of theory construction.9 The job of a linguist was no longer to describe the structures of a particular language, but to seek a small set of universal and abstract rules that reside in a speaker’s mind, which together form a speaker’s linguistic “competence”.

Mind you: Fillmore was never someone who sought to spoil Chomsky’s party. Whereas many of Chomsky’s detractors attack the Chomskyan perspective with vitriol (sometimes even making a career out of anti-Chomskyanism), Fillmore saw himself as working faithfully within the new mainstream; an OWL who simply wanted to make his contribution to the field. But he always followed the data, and he kept his mind open to insights from other approaches to language, particularly those from European linguistics, including Lucien Tesnière’s dependency structures, and the Prague and Moscow schools with their attention to phraseology.

What he found was this: the mainstream approach didn’t square with the data. One particularly glaring problem was the arbitrariness of the mainstream theory – and not the kind of arbitrariness that we know from De Saussure.4 In order to keep the set of abstract principles of a linguistic theory small, mainstream linguistics only looked at “core” structures that conveniently fit the principles, while sweeping all counterexamples under the rug of “periphery” or “performance”, so that they were considered irrelevant for a theory of linguistic competence. This practice was immensely frustrating to Fillmore and like-minded scholars who wanted to explain the whole of language rather than just parts of it; or as George Lakoff exasperatedly put it: “Chomsky’s shifting definitions of performance provide him with a rug big enough to cover the Himalayas.”24

But even if you are willing to turn a blind eye to the core-vs-periphery problem, Fillmore discovered an even bigger issue.

One of the greatest pleasures of reading Fillmore’s papers is that he never felt the need to showcase his vast knowledge about linguistics, but instead wrote with an engaging sense of wonder and playfulness. Already in his earlier papers – some of them collected in the book Form and Meaning in Language,8 which should be on the nightstand of any linguist – he lays bare the biggest problem of mainstream linguistics: meaning, and how it is expressed in language. One of my personal favorites is The grammar of hitting and breaking, in which he almost casually throws examples such as the following at the reader’s feet:

  • I broke the window (with a rock).

  • The rock broke the window.

  • The window broke.

The genius here is that it takes only three simple sentences to put a finger precisely where it hurts; and it hurts badly because the issue at hand concerns argument realization – the way in which verbs combine with their arguments to form sentences. Quite a number of people think of argument realization as the bread and butter of contemporary linguistics; the battlefield on which linguistic theories survive or perish. In order to thrive, any theory in mainstream linguistics is supposed to come up with an argument realization principle that can explain the structures in which verbs occur, while at the same time predicting which kinds of structures will not occur. For instance, the argument realization principle should somehow be able to prevent an intransitive verb such as laugh from occurring in transitive structures such as I laughed a joke.

Because some ironies are too good to be ignored, I cannot resist the temptation to choose the Projection Principle as my example from all of the argument realization principles that have been proposed in the literature. Before the Projection Principle, Chomskyan linguistics hypothesized that words had to be inserted into the empty slots of independently generated syntactic structures. This approach, however, led to massive “overgeneration”: the grammar would license any verb to occur in any verb slot, and therefore had no way of excluding I laughed a joke. The solution was to get rid of the phrase-building rules and instead put constraints on lexical items. For instance, the verb laugh can be categorized as an intransitive verb through a subcategorization frame, which then needs to be projected from the lexicon onto a syntactic structure. Laugh can now no longer occur in a transitive structure, because that would violate the verb’s subcategorization frame and the Projection Principle. Nowadays, the Projection Principle has been discarded, but most contemporary linguistic theories have adopted some kind of spiritual successor in which an abstract argument realization principle is combined with a richly specified lexicon. This approach is known as Lexicalism.
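
The lexicalist logic can be made concrete with a small sketch. The toy lexicon and checker below are my own illustration, not any published formalism: each verb carries a subcategorization frame, and a clause is licensed only if the realized arguments match that frame.

```python
# Toy illustration of a lexicalist grammar (not any specific theory):
# each verb's lexical entry carries a subcategorization frame, and a
# clause is licensed only if its realized arguments match that frame.

LEXICON = {
    "laugh": {"subcat": []},        # intransitive: no object allowed
    "devour": {"subcat": ["NP"]},   # transitive: exactly one NP object
}

def licensed(verb, objects):
    """Return True only if the verb's frame matches the realized objects."""
    return LEXICON[verb]["subcat"] == objects

print(licensed("laugh", []))      # True:  "I laughed."
print(licensed("laugh", ["NP"]))  # False: *"I laughed a joke."
```

The point of the sketch is that the constraint lives entirely in the lexicon: the grammar itself no longer overgenerates, but only because every verb has been pre-categorized.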

Now, back to Fillmore’s three examples and why the term Projection Principle is ironic. When syntactic structures are constrained by the properties of the lexicon, you soon run into problems, because virtually every verb can occur in multiple argument realization patterns, as Fillmore’s examples show for the verb break. For a lexicalist theory, this means that at least three different lexical items are required to account for the three examples. For instance, if you have assigned a transitive subcategorization frame to break, the verb is no longer permitted to form intransitive sentences such as the window broke. Lexicalists therefore need to call upon additional rules to ensure the validity of the theory’s argument realization principle. In Chomskyan linguistics, this is done quite radically by starting from the verb’s subcategorization frame and then applying transformation rules to the syntactic structure. In non-transformational, Chomsky-inspired linguistics, transformations are disguised as “derivational rules” that change the subcategorization frame of a verb, after which the default argument realization principle applies.

The irony of the term Projection Principle of course lies in the meaning of the word projection in psychology, where it denotes a self-defense mechanism by which humans attribute certain qualities to others without any evidence that such an attribution is truthful. Similarly, the Projection Principle and its successors became a defense mechanism for a linguistic theory, because you were allowed to impose any structure on a sentence to fit the theory, even if that involved non-observable elements and transformations. Just take a brief look at the many linguistic terms out there, some of them still in use today: raising, WH-movement, and island constraints; extraction, fillers, and parasitic gaps; focalization, topical fronting, and subject-verb inversion; deep structure, surface structure and phonological spell-out; heavy NP-shift, case checking, and empty traces; and the list goes on and on. Particularly in Chomskyan linguistics, whenever the linguists’ projections got out of hand and the theory became too messy and complicated, Chomsky would step in to attempt a purge. The Standard Theory became the Extended Standard Theory, which became the Revised Extended Standard Theory. Purge. The Government and Binding Theory gave way to the Principles and Parameters model. Purge. Now Chomskyan linguistics no longer has a theory, only the Minimalist “program”, and even that has reportedly been questioned by Chomsky himself.11

Fillmore didn’t fall into the projection trap, but let the data guide him and found that meaning was the key to solving such problems. He observed that all three examples of the verb break can be conceived of as different conceptualizations of the same event, each one entailing the same consequences for the window, the same use of a rock, and the same perpetrator. He realized that verbs can have a core meaning that is largely independent of the argument realization patterns that speakers choose for expressing that meaning. Such observations became the foundation of what is now known as Frame Semantics5, another one of Fillmore’s ideas that changed the face of linguistics. They ultimately also led to the question: why would you need different lexical items if the same verb sense is involved each time?13,14

We finally get to the Dangerous Idea of this essay. While Fillmore was developing Frame Semantics, he was also deeply concerned with the first problem of core versus periphery, and with the lack of tools for linguists to address such issues. What happened next seems almost like divine providence, because Fillmore possessed a skill that is usually the expertise of computer scientists: he understood the concept of representations. He learned this while studying Japanese: he noticed that the Japanese kana syllabic writing system does not present all of the information necessary for pronouncing complex words, because Japanese does not segment such words at syllable boundaries. Fillmore later said about this experience: “[…I]t is not possible to represent — in a writing system, in a parse, or in a grammar — every aspect of a language worth noticing. My study of Japanese confronted me with the realization that for any given representation system, it’s important to understand what it represents, and what is missing.”9 His understanding of representations also made him realize that the toolkit of linguistics lacked a data structure for accounting for the whole of language. So he invented constructions.

I hear some readers protesting at this point. Wait, what? Are constructions the Dangerous Idea? Construction grammar is already quite accepted in cognitive linguistics… How can you claim that the constructional approach has not yet reached its full potential? Yes, I am indeed talking about constructions, and yes, many linguists claim a constructional approach. But browsing through the linguistic literature, I fear that the word construction is often used in a business-as-usual kind of way, without exploring the full potential of Fillmore’s original vision.

Allow me to elaborate on that statement. My ultimate favorite paper by Fillmore is The Mechanisms of “Construction Grammar” (1988),6 which I reread about once every three months. In this paper, Fillmore clearly lays out his vision of construction grammar. I quote here the four properties that I find most relevant for this essay, and have added emphasis to the things you should pay attention to:

  1. By grammatical construction we mean any syntactic pattern which is assigned one or more conventional functions in a language, together with whatever is linguistically conventionalized about its contribution to the meaning or the use of structures containing it. (p. 36)

  2. Construction grammars differ from phrase-structure grammars which use complex symbols and allow the transmission of information between lower and higher structural units, in that we allow the direct representation of the required properties of subordinate constituents. (p. 35)

  3. And construction grammars differ from phrase-structure grammars in general in allowing an occurring linguistic expression to be seen as simultaneously instantiating more than one grammatical construction at the same level. (p. 35)

  4. At least some of the grammatical properties of a construction can be given as feature structure representations, that is, as sets of attribute-value pairs, and can be seen as generally satisfying the requirements of a unification-based system. (p. 38)

Since linguists like food as much as any other person, the difference between traditional linguistics and construction grammar is sometimes described as eating a cake with multiple layers of flavors. A traditional linguist has to eat the cake layer by layer (a scoop of phonology, a bite of syntax, and a topping of semantics), but a construction grammarian can cut the cake vertically and gobble it up as a whole because constructions can represent all information at once. Construction grammarians typically present this as the selling point of a construction, but computational and formal linguists might scratch their heads over that and wonder what the fuss is all about.

Let’s first understand why some linguists might feel puzzled about why constructions are important. In computational linguistics and formal grammar, it has already been possible to represent different kinds of linguistic information in a uniform way since Martin Kay invented feature structures and unification-based language processing in the late 1970s.18 Indeed, one of the appeals of a theory such as Head-Driven Phrase Structure Grammar,25 which reinterpreted such feature-value pairs as sets of constraints, is that it uses a formalism that allows you to include various kinds of information in one grammar rule, ranging from phonology and morphosyntax to semantic and pragmatic constraints. Moreover, grammar formalisms that offer more expressive power than phrase-structure grammars already allow for Fillmore’s second criterion: to directly reach subordinate constituents that are beyond the locality of mother and daughter nodes in a syntax tree. Tree-Adjoining Grammars, for example, have been doing exactly that for about half a century now.17
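
To make Kay’s core idea tangible, here is a minimal sketch of unification over feature structures, written as nested Python dicts. This is a toy illustration of the general mechanism, not the actual machinery of FUG, HPSG, or FCG: two structures unify into their combined information, and unification fails when they assign conflicting values to the same attribute.

```python
# A minimal sketch of unification over feature structures represented
# as nested dicts. Toy illustration only, not any published formalism.

def unify(fs1, fs2):
    """Return the unification of two feature structures, or None on conflict."""
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        return fs1 if fs1 == fs2 else None   # atomic values must be identical
    result = dict(fs1)
    for attr, val in fs2.items():
        if attr in result:
            merged = unify(result[attr], val)
            if merged is None:
                return None                  # conflicting values: unification fails
            result[attr] = merged
        else:
            result[attr] = val               # new information is simply added
    return result

np = {"cat": "NP", "agr": {"num": "sg"}}
subj = {"agr": {"num": "sg", "per": 3}}
print(unify(np, subj))
# {'cat': 'NP', 'agr': {'num': 'sg', 'per': 3}}
print(unify(np, {"agr": {"num": "pl"}}))
# None: number conflict
```

Because the representation is uniform, nothing stops you from putting phonological, syntactic, semantic and pragmatic attributes side by side in one such structure, which is exactly the appeal noted above.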

But here’s the thing: what if the structure of a sentence is not a syntax tree?31

In 2004, amidst an increasingly watered-down usage of the word construction, Mirjam Fried and Jan-Ola Östman felt the need to clarify the original tenets of construction grammar as envisioned by Fillmore and his colleagues.12 A construction is, in fact, a multidimensional data structure that goes far beyond the limits of a syntactic tree, a property that has not yet sunk in for many scholars. For example, Stefan Müller, one of the most well-versed linguists I have had the pleasure to disagree with, consistently uses the word phrasal instead of constructional when comparing construction grammar to the lexicalist approach23, which reveals that he assumes a dichotomy between lexical and phrasal accounts, whereas such a dichotomy is irrelevant for construction grammar. The late Ivan A. Sag went even further, redefining a construction as a local constraint between mother and daughter nodes1, which is such a radical return to phrase-structure grammars that it is considered too strong even by leading HPSG researchers…21 and they have Phrase Structure Grammar in their name!

The key to understanding a construction’s untapped potential is Fillmore’s third criterion: the fact that constructions can overlap with each other – which has more recently been echoed in the term Surface Generalization Hypothesis by Adele E. Goldberg.16 Constructions can freely combine with each other as long as there is no conflict and they can overlap while doing so. This is obvious for semi-schematic constructions such as the famous What’s X doing Y? construction,19 which overlaps with other constructions for handling their X and Y constituents; but it is equally true for parts of grammar that have always been considered as entirely schematic. For instance, alternative word orders in the examples Nina sent her mother a dozen roses and A dozen roses, Nina sent her mother involve the same lexical constructions and the same argument structure construction, but they differ with respect to which information structure constructions they interact with.15 It is unclear whether Fillmore himself would have favored using this expressive power for argument realization, but the fact is that a construction grammar can use the same lexical construction for the verb break to account for his three examples, which would be impossible in a purely phrasal approach.
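
Fillmore’s third criterion can also be sketched in toy form. In the illustration below (my own simplification, not FCG), each construction contributes a set of features, and a single utterance can instantiate several constructions at once, as long as their contributions do not conflict:

```python
# Toy sketch of Fillmore's third criterion: one utterance simultaneously
# instantiates several constructions, whose feature contributions merge
# freely as long as no two constructions assign conflicting values.

def combine(*constructions):
    """Merge the contributions of several constructions; None on conflict."""
    analysis = {}
    for cxn in constructions:
        for feature, value in cxn.items():
            if analysis.get(feature, value) != value:
                return None   # conflict: these constructions cannot co-occur
            analysis[feature] = value
    return analysis

ditransitive = {"verb": "sent", "roles": ("agent", "recipient", "theme")}
topicalization = {"topic": "theme", "word_order": "theme-first"}

# "A dozen roses, Nina sent her mother": the same argument structure
# construction as the canonical order, combined with a topicalization
# construction that only adds information-structural features.
print(combine(ditransitive, topicalization))
```

Note that the alternative word order needs no second lexical entry for the verb: the same argument structure construction simply overlaps with a different information structure construction, which is the point of the Nina examples above.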

Here we enter dangerous territory, because if you take this vision to its logical conclusion, you have to tear down many developments that have established themselves quite firmly in mainstream linguistics – and this can be quite upsetting for people who have worked on those developments for years. But in order to explore the logical conclusion of Fillmore’s dangerous idea, you need to formalize constructions, and I have not found a better explanation of why we need formalization than the following quote from Noam Chomsky’s famous Syntactic Structures:

“Precisely constructed models for linguistic structure can play an important role, both negative and positive, in the process of discovery itself. By pushing a precise but inadequate formulation to an unacceptable conclusion, we can often expose the exact source of this inadequacy and, consequently, gain a deeper understanding of the linguistic data. More positively, a formalized theory may automatically provide solutions for many problems other than those for which it was explicitly designed.”2

As a computational linguist, I cannot stress enough how sobering an experience it is to try to get your analysis to work in a formal and computational model. Even as trained scientists, our minds are inherently biased toward confirming our analysis and ignoring its inevitable gaps and shortcomings, but a computational formalism is unforgiving of those gaps and forces you to go back to the drawing board and reconsider. Just as often, however, it is the formalism itself that is inadequate for accommodating your ideas, so you always have to remain skeptical in order to avoid shoehorning an analysis into an inappropriate formalism. Building the right tools for expressing your hypotheses is a never-ending scientific discovery process in and of itself, especially for a subject as intricate as language.

I believe this is where construction grammar has been struggling most.

Fillmore himself, inspired by Martin Kay’s Functional Unification Grammar,18 had clear intuitions about the formalization of his vision, which he mentioned in his fourth criterion: he saw the combination of constructions as a unification-based process. The original, Fillmorean construction grammar – now known to many as Berkeley Construction Grammar10 – made a laudable and, for many purposes, useful effort at formalizing constructions in this way, with Paul Kay in particular being instrumental in this work. Further efforts included talks with Ivan A. Sag and his Stanford colleagues, who had been leading scientists in formal grammar through their work on HPSG.

HPSG is widely respected for its powerful formalism, so starting these talks among Bay Area linguists was only natural. However, the HPSG formalism is not really suited for construction grammars. For one thing, HPSG has largely traded unification for typed feature constraints since 1994.25 Moreover, the HPSG formalism has been developed to support complex phrase-structure grammars, and therefore makes formal choices that are difficult to match with Fillmore’s four criteria. The outcome of the talks between Berkeley and Stanford, called Sign-Based Construction Grammar (SBCG),20 therefore seems to be an example of how the constraints of a formalism have prevailed over the insights of the theory, with SBCG essentially being a construction-inspired dialect of HPSG rather than a formalization of Fillmore’s ideas.29 That is not to say that SBCG brings nothing interesting to the table, as the work of, among others, Laura Michaelis shows.

Fortunately, Fillmore’s influence reached very far indeed. In the early 2000s, AI researcher Luc Steels was developing computational experiments on how grammatical structures may emerge from scratch in a population of language users. For one of his most groundbreaking experiments,26 Steels realized that there was no appropriate formalism that would allow him to conduct these experiments, because he needed one that could handle all the stages of language emergence, from single words to natural language-like grammars, and all stages in between. Steels therefore developed a new formalism, drawing a lot of inspiration from Fillmore’s Case Grammar.7 He and his colleagues, including yours truly, soon realized that construction grammar was the linguistic approach that matched their own vision most closely, and went to great lengths to make the formalism a suitable computational platform for exploring constructional language processing – including the possibility of adhering to all of Fillmore’s criteria. That formalism was then baptized Fluid Construction Grammar (FCG).27,28

While formalizing my ideas using this platform, I personally got to experience that Fillmore’s idea is indeed dangerous. Since I have an affinity for cognitive-functional linguistics, I have always been frustrated with mainstream accounts of long-distance dependencies as found in WH-questions such as What did you see? Mainstream linguistics literally treats such dependencies as if an element (in this case: what) is “extracted” from its supposed original argument position (here: following the verb) and then repositioned. Such analyses require formal machinery such as detecting that extraction took place, percolating information along the syntactic structure in a stepwise fashion, filler-gap rules for creating new slots, and mechanisms for changing the subcategorization frame of the verb. I argued in one of my papers that constructions allow you to eliminate all of those mechanisms, a claim that I substantiated with a proof-of-concept implementation in FCG.30 The paper subsequently attracted a complaint letter.

Two other papers of mine have received complaints as well, and each time the author of the letter requested to remain anonymous. I blame myself in part for those letters: I always try to write in an engaging style to keep my readers interested, and in my quest for colorful language I sometimes inadvertently end up including brash sentences in my articles. At the same time, in science you are supposed to write a counter-article if you disagree with someone, and Stefan Müller is an example of a sincere scholar who has done exactly that.22 The anonymous author(s) of those letters, however, declined to do so after being offered the opportunity by the journals, which, luckily, rose to the occasion and defended my work. I am now convinced that much of this hostility comes from the fact that once you are confronted with what a construction grammar can do, in a formally or computationally precise way, you at least have to entertain the possibility that many developments in mainstream linguistics might have been built on sand.

I am convinced that Fillmore’s vision of constructions will eventually be fully understood in the linguistics community, and that many discoveries will be made once the full potential of construction grammar is tapped. Important work is already being carried out in his legacy, as Fillmore himself acknowledged, including that of “Mirjam Fried, Seiko Fujii, Adele Goldberg, Jean-Pierre Koenig, Knud Lambrecht, Yoshi Matsumoto, Laura Michaelis, Kyoko Ohara, Toshio Ohori, Jan-Ola Östman, Eve Sweetser, and several others.” Bill Croft’s take on constructions3 is some of the most exciting work I have ever read. As for Fluid Construction Grammar, I never had the honor and pleasure of meeting Chuck Fillmore in person, but I like to think that he would have approved of our work and would have been thrilled to see constructions at work.

Acknowledgements

Even though this essay presents my personal view on “constructions”, it benefited from insightful feedback from Bill Croft, Mirjam Fried, and Adele Goldberg, who have shared their perspective and their understanding of constructions, and whose work has also been seminal in forming my ideas about language. Many thanks also to Stefan Müller, who offered an open review on Twitter; the discussion, which also includes Laura Michaelis, can be followed on Stefan’s Twitter account.

Further Reading

  1. Hans C. Boas and Ivan A. Sag, editors. Sign-Based Construction Grammar. The University of Chicago Press, Stanford, 2012.
  2. Noam Chomsky. Syntactic Structures. Mouton, The Hague, 1957.
  3. William Croft. Radical Construction Grammar: Syntactic Theory in Typological Perspective. Oxford UP, Oxford, 2001.
  4. Ferdinand de Saussure. Cours de Linguistique Générale. Payot, Paris, sixth, revised, 2005 edition, 1916. Édition critique préparée par Tullio de Mauro, first published in 1967.
  5. Charles J. Fillmore. Frame semantics and the nature of language. In S. Harnad, H. Steklis, and J. Lancaster, editors, Origins and Evolution of Language and Speech. New York Academy of Sciences, New York, 1976.
  6. Charles J. Fillmore. The mechanisms of “Construction Grammar”. In Proceedings of the Fourteenth Annual Meeting of the Berkeley Linguistics Society, pages 35–55, Berkeley CA, 1988. Berkeley Linguistics Society.
  7. Charles J. Fillmore. The case for case. In Charles J. Fillmore, editor, Form and Meaning in Language, Volume 1: Papers on Semantic Roles, pages 23–122. CSLI Publications, Stanford, 2003. (First printed in 1968).
  8. Charles J. Fillmore. Form and Meaning in Language, Volume 1: Papers on Semantic Roles. CSLI Publications, Stanford, 2003.
  9. Charles J. Fillmore. Encounters with language. Computational Linguistics, 38(4):701–718, 2012. URL https://doi.org/10.1162/COLI_a_00129.
  10. Charles J. Fillmore. Berkeley Construction Grammar. In The Oxford Handbook of Construction Grammar, pages 112–132. Oxford University Press, Oxford, 2013.
  11. Robert Freidin. A brief history of generative grammar. In Gillian Russell and Delia Graff Fara, editors, The Routledge Companion to Philosophy of Language. Routledge, Abingdon, 2012. doi: 10.4324/9780203206966.ch78.
  12. Mirjam Fried and Jan-Ola Östman. Construction grammar: A thumbnail sketch. In Mirjam Fried and Jan-Ola Östman, editors, Construction Grammar in a Cross-Language Perspective, pages 11–86. John Benjamins, Amsterdam, 2004.
  13. Adele E. Goldberg. A unified account of the semantics of the English ditransitive. In Proceedings of the Fifteenth Annual Meeting of the Berkeley Linguistics Society, pages 79–90, Berkeley, CA, 1989. Berkeley Linguistics Society.
  14. Adele E. Goldberg. A Construction Grammar Approach to Argument Structure. Chicago UP, Chicago, 1995.
  15. Adele E. Goldberg. Constructions At Work: The Nature of Generalization in Language. Oxford University Press, Oxford, 2006.
  16. Adele E. Goldberg, Devin M. Casenhiser, and Nitya Sethuraman. Learning argument structure generalizations. Cognitive Linguistics, 15(3):289–316, 2004.
  17. Aravind Joshi. How much context-sensitivity is necessary for characterizing structural descriptions. In David R. Dowty, Lauri Karttunen, and Arnold M. Zwicky, editors, Natural Language Processing: Theoretical, Computational, and Psychological Perspectives, pages 206–250. Cambridge University Press, New York, 1985.
  18. Martin Kay. Functional grammar. In Proceedings of the Fifth Annual Meeting of the Berkeley Linguistics Society, pages 142–158. Berkeley Linguistics Society, 1979.
  19. Paul Kay and Charles J. Fillmore. Grammatical constructions and linguistic generalizations: The what’s x doing y? construction. Language, 75:1–33, 1999.
  20. Laura A. Michaelis. Sign-Based Construction Grammar. In Thomas Hoffman and Graeme Trousdale, editors, The Oxford Handbook of Construction Grammar, pages 133–152. Oxford University Press, Oxford, 2013.
  21. Stefan Müller. Grammatical Theory: From Transformational Grammar to Constraint-Based Approaches. Volume 1 in Textbooks in Language Sciences. Language Science Press, Berlin, 2016.
  22. Stefan Müller. Head-Driven Phrase Structure Grammar, Sign-Based Construction Grammar, and Fluid Construction Grammar: Commonalities and differences. Constructions and Frames, 9(1):139–174, 2017. doi: 10.1075/cf.9.1.05mul.
  23. Stefan Müller and Stephen Mark Wechsler. Lexical approaches to argument structure. Theoretical Linguistics, 40(1–2):1–76, 2014.
  24. Jan Nuyts. Aspects of a Cognitive-Pragmatic Theory of Language. On Cognition, Functionalism, and Grammar. John Benjamins, Amsterdam, 1992.
  25. Carl Pollard and Ivan A. Sag. Head-Driven Phrase Structure Grammar. University of Chicago Press / CSLI Publications, Chicago/Stanford, 1994.
  26. Luc Steels. Simulating the evolution of a grammar for case. In Proceedings of the Evolution of Language Conference, Harvard, April 2002.
  27. Luc Steels. Constructivist development of grounded construction grammars. In Walter Daelemans, editor, Proceedings 42nd Annual Meeting of the Association for Computational Linguistics, pages 9–19, Barcelona, 2004.
  28. Luc Steels, editor. Design Patterns in Fluid Construction Grammar. John Benjamins, Amsterdam, 2011.
  29. Remi van Trijp. A comparison between Fluid Construction Grammar and Sign-Based Construction Grammar. Constructions and Frames, 5:88–116, 2013.
  30. Remi van Trijp. Long-distance dependencies without filler-gaps: A cognitive-functional alternative in Fluid Construction Grammar. Language and Cognition, 6(02):242–270, 2014.
  31. Remi van Trijp. Chopping down the syntax tree: what constructions can do instead. Belgian Journal of Linguistics, 30(1):15–38, 2016.
