Erik Pukinskis

Simulating language emergence

Note: I don't know anything about this topic. I don't really believe what's below is insightful; I am just getting my thoughts out so I have a starting point to read from.

I'd really like to create an extremely basic simulation of language emergence, or failing that, language use. And when I say extremely basic, I mean extremely basic. The test for success would be that one agent's predictive ability is improved by some communication from another agent.


Sidebar

I am suggesting improved prediction as the key result of effective language use because I think prediction (internal simulation) is fundamental to thought and to language use (see Foundations of human-computer semantic transfer), though I don't have much support for that thesis yet. In fact, this experiment is partially meant to be a test of that thesis.

Perhaps this is circular reasoning though. I am suggesting that internal simulation is central to language use by showing that it is necessary in a model which is evaluated based on predictive ability. It certainly seems weak.

Perhaps the measure of linguistic success should be information transfer. Information transfer, though, implies improving the ability of the listener to perceive affordances in their environment, which I don't think is very far flung from improving predictive ability. This certainly requires some more thought.

It should also be noted that simple cries of distress improve predictive ability, and it's not certain whether such cries really entail language. Still, it's a start.


The question becomes: what sort of agent is capable of passing our test? First, it has to be able to encode and emit information, and second, it has to be able to make use of it. Let's ignore encoding for the moment and consider making use of information.

An agent which trains a network model with its experiences and tests the model by predicting future events, while simultaneously training the model with data from a communication channel, may be able to extract information from that channel. The success of this process depends on three conditions:

  1. The encoding of information doesn't destroy the information
  2. The encoding isn't so noisy the network can't extract any information
  3. The information somehow "fits" or is compatible with this process
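
To make the test concrete, here is a minimal sketch of a listener that predicts events with and without a communication channel. It is only an illustration under made-up assumptions: a simple frequency table stands in for the network model, and the four-state world, the name `run_trial`, and the encoding (the speaker sends the one bit the listener can't see) are all hypothetical choices, not a worked-out design.

```python
import random
from collections import Counter, defaultdict

def run_trial(use_channel, noise=0.0, n_train=2000, n_test=2000, seed=0):
    """Return the listener's prediction accuracy on held-out events."""
    rng = random.Random(seed)
    # context -> how often each event was seen in that context
    counts = defaultdict(Counter)

    def sample():
        state = rng.randrange(4)      # hidden world state, 0..3
        own_view = state % 2          # listener perceives only the low bit
        signal = state // 2           # speaker encodes the high bit
        if rng.random() < noise:      # a noisy channel flips the signal
            signal = 1 - signal
        context = (own_view, signal) if use_channel else (own_view,)
        return context, state         # the event to predict is the state

    # Training: the listener tallies which events follow which contexts.
    for _ in range(n_train):
        context, event = sample()
        counts[context][event] += 1

    # Testing: the listener guesses the most frequent event for the context.
    correct = 0
    for _ in range(n_test):
        context, event = sample()
        seen = counts[context]
        guess = seen.most_common(1)[0][0] if seen else 0
        correct += (guess == event)
    return correct / n_test

alone = run_trial(use_channel=False)                 # own perception only
with_signal = run_trial(use_channel=True)            # clean channel
with_noise = run_trial(use_channel=True, noise=0.5)  # useless channel
print(alone, with_signal, with_noise)
```

In this toy world the listener alone can only guess between two states, so it is right about half the time. With a clean channel it predicts every event, and with a channel that flips the signal half the time it falls back to about chance. That matches the first two conditions in the list; the third is baked in by hand here, since the signal was chosen to be exactly what the listener is missing.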

The third requirement is weak, and probably needs to be reworded/substantiated, but it'll do for now. What kind of encoding agent might meet these requirements? There are perhaps two sub-tasks, if I may be so bold as to separate them: choosing what to say, and choosing how to say it. If my hunch that language is steeped in simulation is correct, then an agent might simulate what sort of information *it* would want in the current situation, and encode that.

Assumptions

The assumptions I make above are probably more like bank robbery than a construction mortgage, so I must at the very least try to enumerate them:

  1. I am assuming language-using agents do internal simulation of potential events
  2. I am assuming language understanding uses a network model
  3. I am assuming language understanding can be functionally separated from language production
  4. I am assuming effective language use can be measured by increased predictive ability
  5. I am assuming that the processes of choosing what to say and how to say it are separable
  6. I am completely fabricating my account of both encoding and decoding.

Further thoughts

The second step in this process is to try to figure out how more language-like communication emerges. What are the things that differentiate basic communication ("AAAH!") from language ("a grue is kicking me")?

Note that context sensitivity (pronoun dereferencing, etc.) isn't really specific to language-like communication. A scream in the ocean means something completely different than a scream during a comedy show. This sort of thing is left for the listener to work out, and isn't dependent on the language (though features of the language might bias interpretation).

So the next question is, how do these features relate to our experience of the world? If we do perceive only affordances in our environment, or even if we just say we perceive features, then compositionality is something very natural to us. If language features map one-to-one to perceptual features, then we can use the perceptual system to understand language. Even if there is no mapping, we could still process language features the same way we process perceptual features.

Ambiguity resolution is also something we do in perception. Optical illusions demonstrate this.

So, if this is the case... if language is highly dependent on having perceptual machinery like this, then why not try to build a computer that does similar things when it perceives?

What does that really mean?


 
This page was last updated May 19, 2004 at 4:28pm.