How is it that computers can be so capable across countless deterministic tasks yet struggle with things a child could easily do? If a child is given a collection of photographs that all contain the same dog, they would have no problem pointing this out. A camera feeding an interpreting program, however, is limited to the information represented by the pixels. Algorithms can be written to find large-scale features across a grid of pixels, such as a “blob” somewhere in the image, but determining whether that blob is a dog is nearly impossible.
Even if countless images of dogs are shown to a computer and saved to its memory, it would be incapable of determining whether a new image contained a dog unless the new image was extraordinarily similar to an old one. Any transformation of the image, such as rotation, translation, or the dog itself being in a different position, makes the image unrecognizable to most image-interpreting programs.
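To make that brittleness concrete, here is a minimal Python sketch. It assumes a tiny binary grid as a stand-in for a photograph, and the helper names `shift_right` and `overlap` are made up for this illustration. A "memory" of stored images recognizes an exact copy, but the same shape moved two pixels to the right no longer matches at all.

```python
def shift_right(image, n):
    """Translate the image n pixels to the right, padding with background (0)."""
    return [[0] * n + row[:len(row) - n] for row in image]

def overlap(a, b):
    """Share of inked pixels the two images have in common (intersection over union)."""
    pairs = [(pa, pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    both = sum(1 for pa, pb in pairs if pa and pb)
    either = sum(1 for pa, pb in pairs if pa or pb)
    return both / either if either else 1.0

# Hypothetical stored "photo": a 2x2 blob on an empty background.
stored = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
memory = [stored]

# The same blob, translated two pixels to the right.
moved = shift_right(stored, 2)

print(stored in memory)        # True: an exact copy is recognized
print(moved in memory)         # False: the translated copy is not
print(overlap(stored, moved))  # 0.0: pixel for pixel, nothing lines up
```

A person sees the same blob in both grids. A pixel-by-pixel comparison sees two unrelated images.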
An example of this problem of deterministically interpreting images is seen in the use of CAPTCHAs. CAPTCHAs are images containing text that has been distorted or squished together to make the letters inconsistent. They are often used on popular websites whose hosts would like to keep programs from sending them spam comments. The CAPTCHA’s purpose is to have a person read the unusual text, type it, and send it, thus proving they are not a program. People are extremely adept at reading all kinds of text, nearly regardless of how distorted it is, and therefore generally pass the test easily. Most sophisticated computer vision programs determine what is in an image by finding edges or blobs of pixels, so inconsistent text is almost impossible for them to read: there is not enough information for the computer to use to understand the message.
However, a process called a digital neural network (more commonly called an artificial neural network) can be programmed to make a computer “learn” information. A digital neural network (which I’ll abbreviate as DNN) works by analogy to how people’s brains function when thought of as neurons connected by axons. Information is fed to the DNN as numbers and stored in memory, where it will later be used for reference. When this process is repeated, stronger connections are made to memories that are similar to one another. For the CAPTCHA test, a human programmer would label CAPTCHA text one letter at a time and feed many examples to the program, building its memory. Then an interpreting program using a DNN would be able to analyze new CAPTCHA images by separating letters into “blobs” and determining which letter each blob is by comparing it to its memory of previous letters. This method would work for CAPTCHA text as long as the font did not change. The learning process would have to be repeated if the CAPTCHA text was significantly different. DNNs are not specific to image interpretation; they can be used to “teach” computers to do all kinds of human-like functions, possibly including passing a Turing test.
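As a rough illustration of the idea, here is a minimal Python sketch of a single-layer network trained with a Hebbian-style rule, in which each labeled example strengthens the connections between the pixels it activates and the output unit for its letter. This is a simplification chosen for brevity and is not how a production CAPTCHA solver or a modern neural network is built. The 3x5 glyphs and the names `TRAINING`, `to_inputs`, `train`, and `classify` are all assumptions made up for this example.

```python
# Hypothetical 3x5 letter "blobs": '#' is an inked pixel, '.' is background.
TRAINING = {
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
}

def to_inputs(glyph):
    """Flatten a glyph into +1 (ink) / -1 (background) input values."""
    return [1 if ch == "#" else -1 for row in glyph for ch in row]

def train(examples):
    """Hebbian-style learning: every labeled example strengthens the
    connections between its pixels and the output unit for its letter."""
    weights = {}
    for letter, glyph in examples:
        inputs = to_inputs(glyph)
        unit = weights.setdefault(letter, [0] * len(inputs))
        for i, value in enumerate(inputs):
            unit[i] += value
    return weights

def classify(weights, glyph):
    """Return the letter whose output unit responds most strongly."""
    inputs = to_inputs(glyph)

    def response(letter):
        return sum(w * x for w, x in zip(weights[letter], inputs))

    return max(weights, key=response)

weights = train(list(TRAINING.items()))

# An "L" with one extra inked pixel: not an exact copy of any stored example.
distorted_l = ["##.", "#..", "#..", "#..", "###"]
print(classify(weights, distorted_l))  # L
```

Unlike an exact comparison against stored images, the slightly distorted letter is still matched to the closest learned pattern. The sketch only copes with small distortions of the glyphs it was trained on, which mirrors the limitation noted above: a significantly different font would require training again.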
Other Resources:
Turing Test Information: Turing Test
CAPTCHA Information: CAPTCHA
DNN Information: Digital / Artificial Neural Network
Article Written by: Alex Simes