The Turing test: Human or machine?

Kim Matsunaga

Imagine being told you would be playing online chess games against World Chess Champion Vladimir Kramnik and Deep Fritz, a leading chess program. Do you think you could tell which was which? Perhaps both games would end too quickly to make any such determination, but the question being posed is: can a human be fooled by a machine?

This is the concept behind the Turing test, developed by English logician and mathematician Alan Turing in 1950, to test for intelligent behavior of a computer algorithm. In the test, a human judge, engaging in wide-ranging conversation, attempts to distinguish whether he or she is interacting with another human or with a computer imitating human responses.

Now, imagine that HAL, the intelligent computer in Stanley Kubrick’s 1968 film, 2001: A Space Odyssey, replaced you as judge. Do you think HAL could accurately detect whether its opponent was Kramnik or Deep Fritz? In other words, can a machine be programmed to distinguish the subtleties between natural human behavior and a sophisticated computer mimicking human behavior?

The same ideas can be used to develop and measure how good a social-scientific theory of human behavior is. Caltech’s Jasmina Arifovic, visiting associate professor, and the late Richard McKelvey, Wasserman Professor of Political Science (see far right column), have pointed out that the development of social-science theories can be likened to the task of building a computer to mimic human behavior, or equivalently, to building a computer that will pass the Turing test in the range of behavior covered by the theory. Thus, social science can be deemed to be successful when it is no longer possible for a computer judge to tell the difference between behavior generated by humans and that generated by the theory (i.e., by a machine).

Based on the above ideas, Caltech researchers this summer plan to run a two-sided computer tournament, the Turing Tournament, to try to simultaneously develop strong models of human behavior, and good ways of telling the difference between human and machine behavior. Arifovic and postdoctoral scholar Svetlana Pevnitskaya will apply this initially to the question of developing theories for how subjects play a repeated, two-person matrix-form game.

In the tournament, Caltech will solicit computer programs that can mimic human behavior, called emulators. Also solicited will be computer algorithms, called sniffers, designed to detect whether the observed behavior is generated by humans or by machine. After all entries are received, repeated rounds of a simple, matrix-form game will be played by humans and by emulators. The data generated from these rounds will then be presented to the sniffers, whose task will be to determine whether the data are human- or machine-generated. The winning sniffer will do the best job of distinguishing between the human and machine data, and the winning emulator will do the best job of fooling the best sniffer. Monetary prizes for the best emulators and sniffers will encourage the submission of entries representing the best current thinking on these questions.
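For readers who think in code, the two-sided structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tournament's actual rules or software: the two-action game, the tit-for-tat emulator, the noisy stand-in for a human subject, the deviation-based sniffer, and both scoring functions are hypothetical names invented here. Payoffs are left out because this toy sniffer looks only at the sequence of actions.

```python
import random

def tit_for_tat(history):
    """Hypothetical emulator entry: start with action 0, then copy the
    opponent's previous action. history holds (own, opponent) pairs."""
    return history[-1][1] if history else 0

def make_noisy_player(rng, flip_prob=0.2):
    """Stand-in for a human subject (an assumption, not real data):
    tit-for-tat whose choice is flipped with probability flip_prob."""
    def player(history):
        action = tit_for_tat(history)
        return 1 - action if rng.random() < flip_prob else action
    return player

def play_repeated_game(player_a, player_b, rounds=20):
    """Play a repeated two-action game; return A's view as (a, b) pairs."""
    history = []
    for _ in range(rounds):
        a = player_a(history)
        # Player B sees the same history from its own side.
        b = player_b([(y, x) for (x, y) in history])
        history.append((a, b))
    return history

def deviation_sniffer(history):
    """Hypothetical sniffer entry: label player A 'human' if A ever
    departs from strict tit-for-tat, otherwise 'machine'."""
    for t, (a, _) in enumerate(history):
        expected = history[t - 1][1] if t > 0 else 0
        if a != expected:
            return "human"
    return "machine"

def sniffer_accuracy(sniffer, labeled_data):
    """Fraction of (history, label) pairs the sniffer classifies correctly."""
    correct = sum(sniffer(h) == label for h, label in labeled_data)
    return correct / len(labeled_data)

def emulator_fooling_rate(sniffer, emulator_histories):
    """Fraction of an emulator's histories that the sniffer labels 'human'."""
    fooled = sum(sniffer(h) == "human" for h in emulator_histories)
    return fooled / len(emulator_histories)

if __name__ == "__main__":
    rng = random.Random(0)
    opponent = make_noisy_player(rng)
    machine = [play_repeated_game(tit_for_tat, opponent) for _ in range(10)]
    human = [play_repeated_game(make_noisy_player(rng), opponent) for _ in range(10)]
    data = [(h, "machine") for h in machine] + [(h, "human") for h in human]
    print("sniffer accuracy:", sniffer_accuracy(deviation_sniffer, data))
    print("emulator fooling rate:", emulator_fooling_rate(deviation_sniffer, machine))
```

In this toy setup the rigid emulator never fools the sniffer, which hints at why a competitive emulator would probably need to reproduce the irregularities of human play as well as its broad pattern.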

The Turing Tournament raises fundamental, unsolved issues in game theory, computer science, econometrics/statistics, and experimental economics. Applications of this methodology include monitoring “program trading” in financial markets, modeling behavior in public-goods problems, evaluating machine-translation programs, and building decision-making robots to take the place of humans in economics experiments. Some of these topics will be the focus of the Turing Tournament in future years.

One particularly fertile area is the question of program trading—automatic computerized execution of securities trades, usually in large volumes—which tends to create very unstable situations. The Securities and Exchange Commission (SEC) has dealt with the problem by introducing market mechanisms such as “circuit breakers” to temporarily slow down or stop trading when prices become too volatile. However, these remedies introduce their own inefficiencies into the market. The Turing Tournament methodology could be used instead to provide a way to detect instability caused by program trading, possibly leading to more effective computer-based means by which the SEC can regulate it.

Another fruitful area of study is experimental economics. Here, with good models of human behavior in a voting setting, decision-making robots could be used in place of humans in experiments on candidate competition, to model voters’ responses to candidate behavior. This would allow experiments on candidate behavior in large elections without having to pay thousands of subjects to play the part of the voters. Instead, the only subjects needed would be the candidates.

Turing Tournament organizers envision the event running for five years, beginning this summer. This year’s tournament will focus on repeated games, and applications for subsequent years will be identified as the program evolves.

Kim Matsunaga is a staff writer in the Division of the Humanities and Social Sciences.