Tuesday, December 10, 2013

A leap in artificial intelligence? Continued.

About a month ago I wrote the first part of my thoughts on how to make the much-wanted leap in artificial intelligence.
Today I feel the need to release the next part.


Ever read a book? You probably did.
Ever spoken the text out loud? Perhaps when you were a kid.
Ever silently spoken it in your head? I did and do (even when writing this). And you?

Why?


Do you remember the time when your mother or father told you a story before going to sleep?
You probably don't remember what happened in your mind at those moments (our memory is not supposed to reach that far back in time). But imagine what might have happened.
I think I turned it into a film, losing part of what was told while my internal story followed its own scenario, and merging both stories again a few moments later.
Ever lost attention during a presentation? What happened?

Why?


The "man in the box" (a jack-in-the-box) jumping out, or a living statue that suddenly moves, induces strong reactions.

Why?


Look at this question: r thr n tp's lft n th txt y wrt?

Why?


Before you attempt to answer these four "why" questions, a quick tour of our current AI techniques.
Most (if not all) machine learning techniques are based on a simple idea:
- take a set of similar things (text, images, ...), labelled or not
- do some calculations with them (statistics, neural network training, ...)
- apply the results to another set of similar things
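
The three steps above can be sketched in a few lines. This is only a minimal illustration, assuming a nearest-centroid classifier as the "calculation"; the function names (train, predict) and the numbers are made up for the example and do not refer to any particular library.

```python
from statistics import mean

def train(samples):
    """Step 2: compute one centroid (mean vector) per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }

def predict(centroids, features):
    """Step 3: assign the label of the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Step 1: a small labelled set (hypothetical numbers: weight in kg, height in cm).
training_set = [
    ([4.0, 25.0], "cat"),
    ([5.0, 23.0], "cat"),
    ([30.0, 60.0], "dog"),
    ([28.0, 55.0], "dog"),
]

centroids = train(training_set)

# Apply the result to another set of similar things.
new_things = [[4.5, 24.0], [29.0, 58.0]]
labels = [predict(centroids, thing) for thing in new_things]
print(labels)
```

Note that every sample here is a static snapshot: nothing in the data says what came before or after it.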


My previous post was centered around the idea of processing different kinds of things in parallel and associating them:
- written and spoken word
- image of an object and the word for it
etc.

But do we learn from static things?

Our environment is anything but static. Although there are static objects in it, we are not static ourselves, which makes the whole dynamic.

When we started learning as babies, quietly observing our world from our cradle, what did we see? Things that don't move and things that do. Some things move when you touch them. Everything moves when you're lifted out.

I think our capability of very quickly identifying static things comes from our experience in identifying things in a dynamic environment. Not the other way around.
Our mind is trained to recognize things in a stream of information. And when provided with static things (e.g. the pictures in our childhood books), we make them dynamic.
When looking at a painting of a still life, do you see it in 2D, or do you make it 3D and sort of feel the objects?

Now return to the whys. Apply the need for a stream of information to them: expecting things to move or not.
Does it seem to make sense?

The typos example might be the most difficult. But look at a piece of text as a flow of words (we don't read the individual characters): anticipating the next words, filling in the missing ones, replacing the wrong ones, not seeing the misspellings. (By the way: did you pronounce the question?)
A real-life experience I had several years ago is worth mentioning. For a knowledge management project we did several interviews. They were all recorded. During the transcription we discovered that one person had never finished a sentence during the whole 90-minute interview. We didn't notice that until the transcription. We must have made up the ends of his sentences while he was talking.

Back to our artificial intelligence.

Imagine another kind of machine learning. One that learns from a flow of information, looking at the deltas between two moments in time (video, sound, but also text).
Instead of learning cat (and human) faces from independent still images, it would learn what cats are from sequences of images: their 3D appearance, their degrees of freedom, how they move, etc. Probably even reaction patterns. Once this is acquired, it is easy to "create" a front view: the face of the cat.
Objects that do not change position can be viewed with a moving camera (or more than one camera).
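
To make "the delta between two moments in time" concrete for images, here is a minimal sketch of plain frame differencing on two tiny grey-scale frames. It is only an illustration of the idea, not a learning method; the function names (frame_delta, moving_pixels), the threshold value and the frames are assumptions made up for the example.

```python
def frame_delta(previous, current):
    """Per-pixel absolute difference between two equally sized grey-scale frames."""
    return [
        [abs(c - p) for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]

def moving_pixels(delta, threshold=10):
    """Coordinates (row, column) whose change exceeds the (arbitrary) threshold."""
    return [
        (r, c)
        for r, row in enumerate(delta)
        for c, value in enumerate(row)
        if value > threshold
    ]

# Two hypothetical 3x4 frames: a bright "object" moves one pixel to the right.
frame_t0 = [
    [0, 0, 0, 0],
    [0, 200, 0, 0],
    [0, 0, 0, 0],
]
frame_t1 = [
    [0, 0, 0, 0],
    [0, 0, 200, 0],
    [0, 0, 0, 0],
]

delta = frame_delta(frame_t0, frame_t1)
changed = moving_pixels(delta)
print(changed)  # the pixel the object left and the pixel it arrived at
```

Everything that stays the same drops out of the delta; only what moved is left to learn from.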
For images the base techniques exist, nothing new. We only have to ...

use them in our current artificial intelligence toolbox. This toolbox might be (almost) good enough, but we should apply it to learning different things (and tune a few things).
Summarized, this might give:


Artificial intelligence that learns from associated parallel streams of information.


Does your mind start wandering, thinking perhaps about the "how to"?

Fine.

Sit back and relax.




2013-12-11 Edit: typos