### Classic papers defining compositionality:

- Timothy van Gelder and Robert Port (1993), Beyond Symbolic: Prolegomena to a Kama-Sutra of Compositionality. https://pdfs.semanticscholar.org/a8fc/c4e9a5b89b6b3e96c78957667534cb2cfe16.pdf
- Peter Pagin and Dag Westerståhl (2010), Compositionality I: Definitions and Variants, Philosophy Compass 5/3 (2010): 250–264, 10.1111/j.1747-9991.2009.00228.x http://home.hib.no/prosjekter/easllc2012/docs/westerstahl/2010Compass1.pdf

- Peter Pagin and Dag Westerståhl (2010), Compositionality II: Arguments and Problems, Philosophy Compass 5/3 (2010): 265–282, 10.1111/j.1747-9991.2009.00229.x http://home.hib.no/prosjekter/easllc2012/docs/westerstahl/2010Compass2.pdf
- Theo M. V. Janssen and Barbara H. Partee (1997), Compositionality. In: *Handbook of Logic and Language*. North-Holland, 417–473, https://eprints.illc.uva.nl/id/document/2671
- Christoph von der Malsburg (1995), Binding in models of perception and brain function, Current Opinion in Neurobiology, Volume 5, Issue 4, August 1995, Pages 520–526, https://www.sciencedirect.com/science/article/pii/095943889580014X#aep-bibliography-id5

https://arxiv.org/abs/1904.00157

Of our work, the most relevant is:

R. Thomas McCoy, Tal Linzen, Ewan Dunbar & Paul Smolensky (2019). RNNs implicitly implement tensor product representations. International Conference on Learning Representations (ICLR).

This is the paper that converted me to connectionism back in 1997 or so: https://doi.org/10.1207/s15516709cog1403_2

And this is my own contribution to the debate on compositionality/systematicity in neural networks: http://stefanfrank.info/pubs/SemSyst.pdf

A relevant ICML paper this year on learning compositional representations:

https://arxiv.org/abs/1811.12359

What I’m thinking about these days:

https://arxiv.org/abs/1707.08139

https://arxiv.org/abs/1902.07181

I’m very interested in how recombinable pieces can be learned from full, unanalyzed utterances, like the work Jacob describes in the second of the three papers he mentions. Another, older example of this sort of task is the following:

https://nlp.stanford.edu/pubs/monroe2016color.pdf

I’d be happy to receive recommendations for other, more recent papers in a similar vein.

Composition at the lexical level, or the level of pairs of concepts: This is old stuff, but still challenging.

Kamp and Partee, Prototype theory and compositionality: http://semantics.uchicago.edu/kennedy/classes/s06/readings/kamp-partee95.pdf

James Hampton, The combination of prototype concepts: https://www.academia.edu/2816472/The_combination_of_prototype_concepts

Some of my work on improving hierarchical generalization in language modeling that includes explicit composition operations:

https://arxiv.org/abs/1602.07776 (Recurrent Neural Network Grammars)

https://arxiv.org/pdf/1611.05774.pdf (analysis of what they learn, showing that a composition operation is essential)

https://arxiv.org/abs/1904.03746 (learning them without supervision)

Some of my work on learning recombinable units:

https://arxiv.org/abs/1811.09353 (unsupervised word discovery and grounding)

Fascinating anti-compositional perspective from Ramscar:

https://psyarxiv.com/e3hps/

I think the ideas in Ramscar’s paper need to be taken very seriously, even though I read some of his collaborative work with Dye (sometime last year?) and remember not being terribly impressed. I certainly think discriminative learning is important, but none of the people advocating this seem to be asking deeply enough about what exactly is being learned.

I was not aware of this new work. Concerning their discriminative learning stuff, it seems to me that the model they propose has the power of a single-layer linear perceptron, so I don’t know how they can seriously think they can go far with that. On the other hand, I do think it’s useful to think of alternatives to old ideas about compositionality.
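To make the worry about single-layer linear models concrete, here is a minimal illustrative sketch. It is not the model from the paper under discussion: the XOR-style data, the function names, and the training settings are all assumptions chosen for illustration. It fits a linear unit (two weights plus a bias) by full-batch gradient descent on squared error, on a task where the target depends on the combination of two cues rather than on either cue alone.

```python
# XOR-style task (an assumed toy example): the outcome is 1 when exactly
# one of the two cues is present, so neither cue is informative on its own.
DATA = [
    ((0.0, 0.0), 0.0),
    ((0.0, 1.0), 1.0),
    ((1.0, 0.0), 1.0),
    ((1.0, 1.0), 0.0),
]


def train_linear(data, lr=0.5, epochs=2000):
    """Fit y ~ w1*x1 + w2*x2 + b by full-batch gradient descent on squared error."""
    w1 = w2 = b = 0.0
    n = len(data)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in data:
            err = (w1 * x1 + w2 * x2 + b) - y
            g1 += err * x1
            g2 += err * x2
            gb += err
        # Average the gradient over the four examples before stepping.
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b


def predict(params, x):
    w1, w2, b = params
    return w1 * x[0] + w2 * x[1] + b


params = train_linear(DATA)
predictions = [predict(params, x) for x, _ in DATA]
```

On this data the least-squares solution has both weights at zero and the bias at 0.5, so every prediction settles at 0.5 and the model cannot separate the two outcome classes. Whether this limitation actually applies to the discriminative learning models in question depends on how their cues are encoded, which this sketch does not address.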

Here are three recent papers from our group:

SCAN challenge for compositional seq2seq learning (e.g., “jump around right twice and walk thrice”), demonstrating that neural networks have serious issues with systematic generalization.

https://arxiv.org/abs/1711.00350
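For readers who have not seen SCAN, the following is a minimal sketch of the kind of command-to-action mapping the benchmark uses. It is not the authors' code: the rules are a simplified reading of the SCAN command grammar, the function names are hypothetical, and the action token names (`I_JUMP`, `I_TURN_RIGHT`, and so on) are assumed to match the released dataset. The point is that the meaning of a full command is computed from the meanings of its parts.

```python
PRIMITIVES = {"walk": "I_WALK", "look": "I_LOOK", "run": "I_RUN", "jump": "I_JUMP"}
TURNS = {"left": "I_TURN_LEFT", "right": "I_TURN_RIGHT"}
REPEATS = {"twice": 2, "thrice": 3}


def interpret_phrase(words):
    """Interpret a phrase without a conjunction, e.g. ['jump', 'around', 'right', 'twice']."""
    repeat = 1
    if words[-1] in REPEATS:
        repeat = REPEATS[words[-1]]
        words = words[:-1]

    verb = words[0]
    # "turn" contributes no action of its own; it only rotates.
    action = [] if verb == "turn" else [PRIMITIVES[verb]]

    if len(words) == 1:                                # "jump"
        actions = action
    elif len(words) == 2:                              # "jump left"
        actions = [TURNS[words[1]]] + action
    elif len(words) == 3 and words[1] == "opposite":   # "jump opposite left"
        turn = TURNS[words[2]]
        actions = [turn, turn] + action
    elif len(words) == 3 and words[1] == "around":     # "jump around left"
        turn = TURNS[words[2]]
        actions = ([turn] + action) * 4
    else:
        raise ValueError("unrecognized phrase: " + " ".join(words))

    return actions * repeat


def interpret(command):
    """Map a command string to its list of action tokens."""
    words = command.split()
    for conj in ("and", "after"):
        if conj in words:
            i = words.index(conj)
            first = interpret_phrase(words[:i])
            second = interpret_phrase(words[i + 1:])
            # "x after y" executes y first, then x.
            return first + second if conj == "and" else second + first
    return interpret_phrase(words)
```

Under these rules, `interpret("jump around right twice and walk thrice")` yields 19 actions: the pair `I_TURN_RIGHT I_JUMP` repeated eight times, followed by three `I_WALK`. A learner that has seen "jump" only in isolation must still produce this sequence, and that is the systematic generalization the benchmark probes.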

In contrast, people can generalize compositionally in novel domains:

https://arxiv.org/abs/1901.04587

Neural networks can acquire compositional skills through meta seq2seq learning:

https://arxiv.org/abs/1906.05381

I think we tend to have different things (bearing some family resemblance) in mind when we talk about compositionality. It would be very nice if we all came to the workshop ready to define exactly what we think compositionality means, and why we think it’s a crucial issue, not only for our respective fields, but for a better understanding of humans or improvement of machines in general.