CSE 2009: This Is Your Brain on (Input Word X)
June 15, 2009
Imagine brain-imaging technology that could tell, in real time, whether a student is really learning a new concept, or perhaps a new word in a foreign language. When the student doesn't understand what's being taught, moreover, the technology would be able to pinpoint the cause of the confusion.
For such technology to become a reality, of course, researchers would have to know quite a bit about how the human brain stores information. What happens in the brain when a person concentrates on a given word? Does the brain store different types of words differently? Does neural activity for a concrete noun like "chair" differ from that for an abstract concept like "love"?
As it turns out, researchers are well on their way to achieving such an understanding, thanks in part to the work of Tom Mitchell, a computer scientist at Carnegie Mellon University. Mitchell, whose background is in artificial intelligence and machine learning, is already well known in the computer science and engineering community (and beyond) for his work in "thought identification": As he and colleague Marcel Just discussed on "60 Minutes," they've developed an algorithm that's able to determine what word a person is thinking about at a given moment, using brain images obtained via functional magnetic resonance imaging. Mitchell's latest research, which he described in an invited talk at this year's SIAM Conference on CSE, goes one step further.
Functional MRI creates an image based on blood flow in the brain---in essence, a snapshot of brain activity. Such imaging technologies, Mitchell says, have been game-changers for fields ranging from neuroscience to psychology to computational linguistics. For his group, fMRI has offered the opportunity to test theories from all those fields, and some of their results have the potential to propel further research in them.
A theory from computational linguistics, for example, suggests that the brain determines and stores the meaning of a given word based on the words or groups of words with which it frequently occurs in typical text. Mitchell and his colleagues have created a computational model based on this theory, with help from a somewhat unlikely source: Google.
As the most popular search engine on the Web, Google has trillions of pages in its index and is thus uniquely positioned to gather statistical data about Web content. One of its many collections consists of statistical data for a trillion-word assortment of passages of Web text. Initially released by Google to aid in machine-translation projects, the collection includes such statistics as the number of times a given word appears with another word, or with a particular phrase.
Mitchell and his colleagues made that collection the basis for their computational model, which they originally designed to accept concrete nouns. They began by developing a list of 25 verbs---verbs like eat, touch, hear, smell, and so on, based on hypotheses from neuroscience suggesting that neural representations of words might be grounded in the sensory–motor regions of the brain. Using the Google data to determine the verbs that occur most often with a particular input noun, the model assigns a weight to each of the 25 verbs, producing a feature vector that essentially characterizes the input word.
To predict brain activity for a word---"banana," say---Mitchell's model begins by looking up the word in the Google data. Because "banana" is often found near the verb "eat," the model assigns a high weight to "eat," contributing to the prediction of a high level of activity in the area of the brain that deals with taste. Conversely, finding "banana" very infrequently with the verb "hear"---bananas not being known for making noise---the model assigns a much lower weight to "hear," contributing to the prediction of a low or nonexistent level of activity in the area of the brain that deals with hearing.
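The lookup step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts below are invented rather than taken from the Google collection, only four of the verbs are shown, and the names `VERBS`, `COOCCURRENCE`, and `feature_vector` are hypothetical. Scaling the weights to unit length is likewise an assumed normalization, not something described in the talk.

```python
# Toy co-occurrence counts, invented for illustration (not the Google data).
VERBS = ["eat", "touch", "hear", "smell"]

COOCCURRENCE = {
    "banana": {"eat": 900, "touch": 60, "hear": 5, "smell": 35},
    "bell": {"eat": 2, "touch": 48, "hear": 700, "smell": 0},
}


def feature_vector(noun, counts=COOCCURRENCE, verbs=VERBS):
    """Return one weight per verb, scaled so the vector has unit length.

    Unit-length scaling is an assumption made for this sketch.
    """
    raw = [float(counts[noun].get(verb, 0)) for verb in verbs]
    norm = sum(x * x for x in raw) ** 0.5
    if norm == 0.0:
        return raw  # noun never co-occurs with any listed verb
    return [x / norm for x in raw]


banana = feature_vector("banana")
bell = feature_vector("bell")

# Pair each verb with its weight for easier reading.
banana_weights = dict(zip(VERBS, banana))
```

With these toy counts, "eat" receives by far the largest weight for "banana" and "hear" the smallest, mirroring the example in the text.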
As the process continues, the model breaks those predictions down to localize activity in more specific areas of the brain: individual voxels, essentially three-dimensional pixels. Using parameters learned from training data, the model predicts activity at a particular voxel as a weighted sum over the 25 verbs: Each verb's weight for the input word is multiplied by a learned parameter that designates how much that verb contributes to activity at that voxel.
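Written as a formula, the voxel-level prediction described above might look as follows. The symbols are assumptions introduced here for illustration: $f_i(w)$ is the weight the model assigns to verb $i$ for input word $w$, $c_{v,i}$ is the learned parameter linking verb $i$ to voxel $v$, and $\hat{y}_v(w)$ is the predicted activity at that voxel.

```latex
\hat{y}_v(w) \;=\; \sum_{i=1}^{25} c_{v,i}\, f_i(w)
```

Each voxel has its own set of 25 parameters, so the number of parameters grows with both the number of verbs and the number of voxels.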
Modeling on a Laptop
Last year, when a paper by Mitchell, CMU psychology professor Marcel Just, and other collaborators appeared in Science, the group had trained the model with just the original 25 verbs. It can now use up to 485 verbs and on the order of 10 million parameters, but it can still run on a basic laptop and predict brain activity for a given word within a few seconds. That impressive performance is a result of the group's efforts to keep things simple. In particular, the model uses only a linear dependence between semantic features (that is, the weights assigned to individual verbs) and particular locations in the brain.
"If you think about the computational power that went into [this] whole thing, most of it was on [Google's] side," Mitchell says. As for the model itself, he says that if he wants to test its accuracy with a new list of verbs, "I can retrain it on my laptop in a few hours."
The rapid retraining ties in with another way in which the model has been successful. By leaving out brain activity scans for a few words when training the model, Mitchell says, the researchers can see how it performs when given words it's never seen. In general, Mitchell considers the results promising: Even for words outside the model's training set, the predicted brain activity is comparable to observed fMRI data.
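The hold-out idea can be sketched as follows. Everything in this example is an assumption made for illustration: the data are synthetic random numbers rather than fMRI scans, the sizes are arbitrary, ordinary least squares stands in for whatever fitting procedure the group actually uses, and the cosine-similarity matching test is just one plausible way to score predictions for unseen words.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary toy sizes: words, verb features per word, voxels per image.
n_words, n_verbs, n_voxels = 60, 25, 200

# Synthetic stand-ins for verb-weight vectors and observed brain images.
features = rng.normal(size=(n_words, n_verbs))
true_weights = rng.normal(size=(n_verbs, n_voxels))
noise = 0.1 * rng.normal(size=(n_words, n_voxels))
images = features @ true_weights + noise

# Leave two words out of training entirely.
held_out = np.array([0, 1])
train = np.setdiff1d(np.arange(n_words), held_out)

# Fit a linear map from features to all voxels at once (least squares).
weights, *_ = np.linalg.lstsq(features[train], images[train], rcond=None)

# Predict images for the two words the model has never seen.
predicted = features[held_out] @ weights


def cosine(a, b):
    """Cosine similarity between two 1-D arrays."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Does each prediction resemble its own observed image more than the other's?
matched = cosine(predicted[0], images[0]) + cosine(predicted[1], images[1])
swapped = cosine(predicted[0], images[1]) + cosine(predicted[1], images[0])
correct = matched > swapped
```

Because the synthetic images really are a linear function of the features plus a little noise, the match succeeds easily here; real scans are far noisier, which is why performance on held-out words is a meaningful test.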
Interestingly, the model's predictions are typically most accurate in the left hemisphere of the brain. This ties in with existing theoretical results in neuroscience: In the semantic representation of words, the left hemisphere is believed to play a greater role than the right.
As to fears that work of this type could provide access to people's private thoughts, Mitchell points out that a major requirement for his model is a cooperative subject. For an accurate image, he says, a participant must strongly focus on a given word. He does not discount the relevant ethical issues, however.
"As with any potentially powerful technology, there are ways to use it that we'd like to see, and ways we'd not like to see," he says. "I'm strongly in favor of having [a discussion about that] now, when the technology is still young, and there's no immediate threat."
Je ne comprends pas
Mitchell and his group are now looking into potential applications of the research, beyond the commonly considered uses in lie detection and interrogation. In one project, working with researchers at the University of Pittsburgh Medical Center, they are investigating use of the technology to help individuals who are able to think but unable to communicate physically. Mitchell envisions a model similar to the one described here, but with fMRI replaced by electroencephalography (EEG). Because EEG records from electrodes placed on the scalp rather than requiring the patient to lie in a scanner, it is easier for both patients and medical personnel to use.
"Our fMRI machine weighs over 10 tons and costs millions of dollars. To use it you need several people to run the machine and you need to lie completely still inside the machine," he explains. "In contrast, EEG is pretty inexpensive and you can wear it as you move about . . . [It's] portable, inexpensive and might be used in this way if we can do with EEG the same thing we do with fMRI signals."
He also sees potential applications in the fields of education and training---perhaps in a "smart" educational system like the one mentioned earlier. He gives the example of learning a new language.
"I speak flawed French," he laughs. "So if we really could use these methods to see what you're thinking, then when I'm listening to a French tape or trying to read a passage, you can see that I didn't get this word, I didn't hear it or understand it in my brain." While a person may not know exactly what it is that's confusing him, Mitchell says, the model could identify the precise word or phrase that's causing problems.
Mitchell doesn't plan to stop with concrete nouns. He and his group have already used the model to predict activity for abstract words as well. "We collected some fMRI data on words like 'democracy' and 'anxiety' and 'love,' and we found that we could make the model work even on abstract words if we expanded the [original set of] verbs," he says.
So watch for the headline: Computer Scientist Determines Real Meaning of "Love."
Michelle Sipics is a contributing editor at SIAM News.