Mathematics of Sudoku

The general problem of solving Sudoku puzzles on n2 x n2 boards of n x n blocks is known to be NP-complete. This gives some indication of why Sudoku is difficult to solve, although on boards of finite size the problem is finite and can be solved by a deterministic finite automaton that knows the entire game tree.

Solving Sudoku puzzles (as well as any other NP-hard problem) can be expressed as a graph colouring problem. The aim of the puzzle in its standard form is to construct a proper 9-colouring of a particular graph, given a partial 9-colouring. The graph in question has 81 vertices, one vertex for each cell of the grid. The vertices can be labelled with the ordered pairs , where x and y are integers between 1 and 9.

The puzzle is then completed by assigning an integer between 1 and 9 to each vertex, in such a way that vertices that are joined by an edge do not have the same integer assigned to them.

A valid Sudoku solution grid is also a Latin square. There are significantly fewer valid Sudoku solution grids than Latin squares because Sudoku imposes the additional regional constraint. Nonetheless, the number of valid Sudoku solution grids for the standard 9×9 grid was calculated by Bertram Felgenhauer in 2005 to be 6,670,903,752,021,072,936,960 (sequence A107739 in OEIS). This number is equal to 9! × 722 × 27 × 27,704,267,971, the last factor of which is prime. The result was derived through logic and brute force computation. The derivation of this result was considerably simplified by analysis provided by Frazer Jarvis and the figure has been confirmed independently by Ed Russell. Russell and Jarvis also showed that when symmetries were taken into account, there were 5,472,730,538 solutions (sequence A109741 in OEIS). The number of valid Sudoku solution grids for the 16×16 derivation is not known.

The maximum number of givens that can be provided while still not rendering the solution unique is four short of a full grid; if two instances of two numbers each are missing and the cells they are to occupy form the corners of an orthogonal rectangle, and exactly two of these cells are within one region, there are two ways the numbers can be assigned. Since this applies to Latin squares in general, most variants of Sudoku have the same maximum. The inverse problem—the fewest givens that render a solution unique—is unsolved, although the lowest number yet found for the standard variation without a symmetry constraint is 17, a number of which have been found by Japanese puzzle enthusiasts, and 18 with the givens in rotationally symmetric cells.

It is possible to set starting grids with more than one solution and to set grids with no solution, but such are not considered proper Sudoku puzzles; as in most other pure-logic puzzles, a unique solution is expected.

Building a Sudoku puzzle by hand can be performed efficiently by pre-determining the locations of the givens and assigning them values only as needed to make deductive progress. Such an undefined given can be assumed to not hold any particular value as long as it is given a different value before construction is completed; the solver will be able to make the same deductions stemming from such assumptions, as at that point the given is very much defined as something else. This technique gives the constructor greater control over the flow of puzzle solving, leading the solver along the same path the compiler used in building the puzzle. (This technique is adaptable to composing puzzles other than Sudoku as well.) Great caution is required, however, as failing to recognize where a number can be logically deduced at any point in construction-regardless of how tortuous that logic may be-can result in an unsolvable puzzle when defining a future given contradicts what has already been built. Building a Sudoku with symmetrical givens is a simple matter of placing the undefined givens in a symmetrical pattern to begin with.

It is commonly believed that Dell Number Place puzzles are computer-generated; they typically have over 30 givens placed in an apparently random scatter, some of which can possibly be deduced from other givens. They also have no authoring credits - that is, the name of the constructor is not printed with any puzzle. Wei-Hwa Huang claims that he was commissioned by Dell to write a Number Place puzzle generator in the winter of 2000; prior to that, he was told, the puzzles were hand-made. The puzzle generator was written with Visual C++, and although it had options to generate a more Japanese-style puzzle, with symmetry constraints and fewer numbers, Dell opted not to use those features, at least not until their recent publication of Sudoku-only magazines.

Nikoli Sudoku are hand-constructed, with the author being credited; the givens are always found in a symmetrical pattern. Dell Number Place Challenger puzzles also list authors . The Sudoku puzzles printed in most UK newspapers are apparently computer-generated but employ symmetrical givens; The Guardian licenses and publishes Nikoli-constructed Sudoku puzzles, though it does not include credits. The Guardian famously claimed that because they were hand-constructed, their puzzles would contain "imperceptible witticisms" that would be very unlikely in computer-generated Sudoku. The challenge to Sudoku programmers is teaching a program how to build clever puzzles, such that they may be indistinguishable from those constructed by humans; Wayne Gould required six years of tweaking his popular program before he believed he achieved that level.