Full paper: “Words are Malleable: Computing Semantic Shifts in Political and Media Discourse” has been accepted at CIKM2017

Our full paper has been accepted at CIKM2017:

Hosein Azarbonyad, Mostafa Dehghani, Kaspar Beelen, Alexandra Arkut, Maarten Marx, and Jaap Kamps, “Words are Malleable: Computing Semantic Shifts in Political and Media Discourse”

 

Here is a short summary of the paper:

Words are always ‘under construction’: their meaning is unstable and malleable. Semantic fluctuations can result from a concept’s ‘essentially contested’ nature. The answer to questions such as “What does democracy mean?” or “What values are democratic?” changes according to the ideological perspective or viewpoint of the person uttering the term. Equally important is the influence of historic events. The understanding of ‘terrorism’, for example, has changed significantly as a result of the 9/11 attacks. So far, only a few studies have attempted to compute the malleability of meaning and monitor semantic shifts. Most (if not all) of these approaches have focused on uncovering change over time. However, there are other valuable dimensions along which the meanings of words can shift, such as social or political variability.

As an example, Figure 1 shows semantic shifts in the meaning of words along two dimensions: time and political context, i.e. membership of a parliamentary party in the British House of Commons. The speeches given by the members of each party are used to construct the party's semantic space. The same idea can be extended to social parties or groups of like-minded people in social forums such as Facebook. The first example in the figure (the word “moral”) shows that a semantic shift can occur both over time and across different contexts. However, as the second example shows, even when the meaning of a word (such as “democracy”) stays stable over time, it can still differ between groups. Social context is therefore another valuable dimension that can explain semantic shifts. In this paper, we explore the semantic stability of words by computing how contextual factors, such as social background and time, shape (or at least reflect) shifts in meaning.

Figure 1: Visualization of semantic shifts in the meaning of the words “democracy” and “moral” over time and across the Conservative and Labour parties in the UK parliament. (a) The meaning given by Labour to “moral” shifts from a “philosophical” concept to a “liberal” concept over time. At the same time, from the Conservatives’ viewpoint the meaning of this word shifts from a “spiritual” concept to a “religious” concept. Moreover, the two parties give very different meanings to this word. (b) The meaning of “democracy” is stable over time for both parties. However, Conservatives refer to democracy mostly as a “unity” concept, while Labour associates it with “freedom” and “social justice”.

We first use a distributional semantic approach to generate embedding spaces from categorized corpora, where a category can be a certain context (such as the speeches given by a political party). In the example given in Figure 1, there are two categories: the Conservative and Labour parties. We then propose different approaches to compare the vector representations of specific words between spaces. The challenging part of this task, and the main contribution of this paper, is to develop techniques that compare vectors across spaces with different dimensionality structures. We compare the representations of concepts in the two embedding spaces and establish whether or not their meaning has changed.

We consider three ways for comparing meaning across vector spaces:

  1. We create a linear mapping between two embedding spaces, project words from one embedding space to the other, and measure whether the projected word lands close to the same word in the other space.
  2. For each viewpoint, we construct a graph in which the nodes are words and the edges are the similarities between them. Then, using graph-based similarity measures, we compute how similar the neighbors of a word are in the two embedding spaces.
  3. We define a measure that combines these two measures.
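
The three comparisons above can be sketched on toy data as follows. This is only an illustration under stated assumptions, not the implementation from the paper: the synthetic spaces, the function names, the least-squares fit on all words, the use of top-k neighbor overlap (Jaccard) as a simple stand-in for the graph-based similarity measure, and the plain average used to combine the two scores are all choices made here for the example.

```python
import numpy as np


def unit(m):
    """Scale each row of m to unit length."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)


def fit_linear_map(src, tgt):
    """Least-squares W minimizing ||src @ W - tgt|| (assumed fitting choice)."""
    w, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return w


def mapping_stability(src, tgt, w):
    """Cosine between each projected source vector and its target vector."""
    return np.sum(unit(src @ w) * unit(tgt), axis=1)


def neighbors(space, i, k):
    """Indices of the k nearest words to word i by cosine similarity."""
    normed = unit(space)
    sims = normed @ normed[i]
    sims[i] = -np.inf  # exclude the word itself
    return set(np.argsort(-sims)[:k].tolist())


def neighbor_stability(a, b, i, k=5):
    """Jaccard overlap of word i's neighbor sets in the two spaces
    (a simple stand-in for a graph-based measure)."""
    na, nb = neighbors(a, i, k), neighbors(b, i, k)
    return len(na & nb) / len(na | nb)


# Toy data: space_b is a rotated copy of space_a, so meanings agree
# everywhere except for word 0, whose vector is replaced in space_b.
rng = np.random.default_rng(0)
n_words, dim = 200, 20
space_a = rng.normal(size=(n_words, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
space_b = space_a @ rotation
space_b[0] = rng.normal(size=dim)  # hypothetical "shifted" word

w = fit_linear_map(space_a, space_b)
map_scores = mapping_stability(space_a, space_b, w)
nbr_scores = np.array(
    [neighbor_stability(space_a, space_b, i) for i in range(n_words)]
)
# Assumed combination: an unweighted average of the two scores.
combined = 0.5 * (map_scores + nbr_scores)

print("shifted word:", map_scores[0], nbr_scores[0], combined[0])
print("other words (mean):", map_scores[1:].mean(), nbr_scores[1:].mean())
```

In this toy setup the replaced word scores much lower than the other words on both measures. Note that the mapping score is a cosine in [-1, 1] while the neighbor score lies in [0, 1], so a real combination would probably need to normalize or weight the two.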

In this work, our main research problem is to study how semantic shifts in words happen not only along the time dimension but also along the social dimension, to quantify the rate of these shifts, and to explore the applications that can benefit from information about them. Our main contributions are:

  • We show that semantic shifts not only occur over time, but also across different viewpoints in a short period of time.
  • We improve the linear mapping approach for detecting semantic shifts and propose a graph-based method to measure the rate of semantic shifts in the meanings of words.
  • We employ word stability measures in contrastive viewpoint summarization and document classification, and extensively evaluate our proposed approach on these tasks.
  • Our analysis shows that the laws of semantic change (Conformity and Innovation), which were shown to hold for shifts over time, also hold for semantic shifts across viewpoints. These laws state that frequent words are less likely to shift meaning, while words with many senses are more likely to do so. Moreover, we introduce a new law of semantic change (Concreteness), which implies that abstract words have higher rates of semantic shift than concrete words.
  • We make the evaluation datasets for detecting semantic shifts and for contrastive viewpoint summarization publicly available.