Turing Test Proposals

Are we making a joint effort at this? Are we making an ETARC submission, or are we just sending in our own ideas individually?

Because if we’re going to make a collective test paper, we need to start collating ideas, and voting on things.

And we need more ideas. Original ones.

I sent in 16 questions. :wink:

As the ETARC CSD, one of us could send in a joint-effort test. Some have already sent in their own. It would be cool if ETARC's test were chosen.

The above list of questions was a start in that direction.

I’m more interested in seeing if the potential AI is like Skynet…

Apart from one, the questions I submitted were supposed to work in couplets: one question most humans would easily pass, paired with another, very similar one that was more morally ambiguous.

In submitting the questions to a computer, you've removed half of them.

But a crucial part of a successful Turing test is not just trying to identify the computer. Half of the task is to identify the humans. Once you're fairly sure who the humans are, whoever is left is probably the computer.

@Polyphemus I understood the dual-question format. Mitsuku was able to answer the ones I left out reasonably well.

How many questions should there be in the test? Is there a rule or format for that?

Suggestions have been between 5 and 10, but some have sent in more.

There is a long-running bet on a computer passing the Turing test: http://www.kurzweilai.net/a-wager-on-the-turing-test-the-rules. Under that protocol, the examination takes two hours, with no limit on the number of questions in that period.

WT have given no indication of what they consider a reasonable number of questions.

@Polyphemus For the mentally challenged (i.e. me): if we were actually testing an AI with a Turing test, would it essentially come down to Ockham's Razor?

Ockham's Razor is essentially the principle of the "line of least resistance": "Entities should not be multiplied unnecessarily" - i.e. the answer that requires the fewest assumptions is probably the correct one.

But in this case, we’re dealing with some very subtle distinctions. Facts and logic are things computers are very good at.

Emotions and human dilemmas, however, are not necessarily logical or factual. And I think that’s the way to go at this point. Consider how an average group of people would feel about a situation. Not clever people, or well-read people. Just ordinary, quirky, idiosyncratic people.

I agree. By Ockham's Razor I kind of meant what I was seeing Mitsuku do. She mostly didn't answer the second part of a two-part question, or side-stepped or ignored "emotion" questions. Compared with a human's answers, the way she handled "emotion" questions and two-part or dual questions would make it possible to tell the AI and the human apart.

I agree, an ETARC test sounds cool. I won't send in a personal one (too tired, 2 AM here), and I think we can use some from the list.

I’ll think up a few more questions.

You can run them by Mitsuku at http://www.mitsuku.com/

I have been chatting with Mitsuku. I find that if I "converse" under the assumption that I'm talking to a real person, and don't try to "trip up" the AI, it becomes much easier to spot. It handles single questions quite well, but small talk soon fails.

If the AI we are testing is Loop16 (Emily), that line gets blurred. She was pretty convincing.

@Polyphemus We are running out of time.

I think it's too late. Correct me if I'm wrong, but it said 20:00, and I think it's past 20:00 EST? Wish we could have done it, though; it was a cool idea.

20:30, according to the Reddit post.