Can Winograd Schemas Replace the Turing Test for Defining Human-Level AI?
Like Turing, we believe that getting the behaviour right is the primary concern in developing an artificially intelligent system. We further agree that English comprehension in the broadest sense is an excellent indicator of intelligent behaviour. Where we have a slight disagreement with Turing is whether a free-form conversation in English is the right vehicle. Our WS [Winograd schemas] challenge does not allow a subject to hide behind a smokescreen of verbal tricks, playfulness, or canned responses. Assuming a subject is willing to take a WS test at all, much will be learned quite unambiguously about the subject in a few minutes. What we have proposed here is certainly less demanding than an intelligent conversation about sonnets (say), as imagined by Turing; it does, however, offer a test challenge that is less subject to abuse.