How do 20 questions AI algorithms work?

Algorithm Problem Overview

Simple online games of 20 questions powered by an eerily accurate AI.

How do they guess so well?

Algorithm Solutions

Solution 1 - Algorithm

You can think of it as the Binary Search Algorithm. In each iteration, we ask a question, which should eliminate roughly half of the possible word choices. If there are total of N words, then we can expect to get an answer after log2(N) questions.

With 20 question, we should optimally be able to find a word among 2^20 = 1 million words.

One easy way to eliminate outliers (wrong answers) would be to probably use something like RANSAC. This would mean, instead of taking into account all questions which have been answered, you randomly pick a smaller subset, which is enough to give you a single answer. Now you repeat that a few times with different random subset of questions, till you see that most of the time, you are getting the same result. you then know you have the right answer.

Of course this is just one way of many ways of solving this problem.

Solution 2 - Algorithm

I recommend reading about the game here: http://en.wikipedia.org/wiki/Twenty_Questions

In particular the Computers section:

> The game suggests that the information > (as measured by Shannon's entropy > statistic) required to identify an > arbitrary object is about 20 bits. The > game is often used as an example when > teaching people about information > theory. Mathematically, if each > question is structured to eliminate > half the objects, 20 questions will > allow the questioner to distinguish > between 2²⁰ or 1,048,576 subjects. > Accordingly, the most effective > strategy for Twenty Questions is to > ask questions that will split the > field of remaining possibilities > roughly in half each time. The process > is analogous to a binary search > algorithm in computer science.

Solution 3 - Algorithm

A decision tree supports this kind of application directly. Decision trees are commonly used in artificial intelligence.

A decision tree is a binary tree that asks "the best" question at each branch to distinguish between the collections represented by its left and right children. The best question is determined by some learning algorithm that the creators of the 20 questions application use to build the tree. Then, as other posters point out, a tree 20 levels deep gives you a million things.

A simple way to define "the best" question at each point is to look for a property that most evenly divides the collection into half. That way when you get a yes/no answer to that question, you get rid of about half of the collection at each step. This way you can approximate binary search.

Wikipedia gives a more complete example:

<http://en.wikipedia.org/wiki/Decision_tree_learning>

And some general background:

<http://en.wikipedia.org/wiki/Decision_tree>

Solution 4 - Algorithm

It bills itself as "the neural net on the internet", and therein lies the key. It likely stores the question/answer probabilities in a spare matrix. Using those probabilities, it's able to use a decision tree algorithm to deduce which question to ask that would best narrow down the next question. Once it narrows the number of possible answers to a few dozen, or if it's reached 20 questions already, then it starts reading off the most likely.

The really intriguing aspect of 20q.net is that unlike most decision tree and neural network algorithms I'm aware of, 20q supports a sparse matrix and incremental updates.

Edit: Turns out the answer's been on the net this whole time. Robin Burgener, the inventor, described his algorithm in detail in his 2005 patent filing.

Solution 5 - Algorithm

It is using a learning algorithm.

k-NN is a good example of one of these.

Wikipedia: k-Nearest Neighbor Algorithm

Content Type	Original Author	Original Content on Stackoverflow
Question	Daddy Warbox	View Question on Stackoverflow
Solution 1 - Algorithm	Yogi	View Answer on Stackoverflow
Solution 2 - Algorithm	cgp	View Answer on Stackoverflow
Solution 3 - Algorithm	Nathan Shively-Sanders	View Answer on Stackoverflow
Solution 4 - Algorithm	Cerin	View Answer on Stackoverflow
Solution 5 - Algorithm	Shaun Mason	View Answer on Stackoverflow