Set Game Puzzler

This problem posed by Roger Hui on comp.lang.apl resulted in several solutions, most of them in the J language. The following articles are my contribution to the thread. Part of the solution involves the COMBIN function, a useful utility that generates all possible combinations of N objects taken M at a time.

Note: If you aren't an experienced APL programmer, you should look at the Set Game Tutorial first. It'll probably make more sense than these raw postings.

Set Enterprises, Inc., creators of the Set game, has a delightful Web site that includes a daily on-line Set game, a shareware version of the game, links to related sites, and fascinating information about the game and its history. Check it out!

From: Roger Hui <Roger.Hui@Sympatico.CA>
Newsgroups: comp.lang.apl
Date: Mon, 8 Apr 1996 05:56:49 GMT
Subject: Identifying Sets

The Game: Set is a game marketed by Set Enterprises Inc. In each round, 12 cards are dealt from a deck of 81 distinct cards. Each card is imprinted with one or more repetitions of a symbol with the following four features:

repetition: 1, 2, or 3
shape: diamond, oval, or squiggle
color: red, green, or blue

A "set" consists of three cards wherein each feature is either the same in all three cards or different in all three cards. For example, the following three cards form a set:

2 repetitions of a red shaded diamond
2 repetitions of a red solid squiggle
2 repetitions of a red outline oval

(Repetitions are the same; colors are the same; textures are different; shapes are different.)

The first player to identify a set wins the round.

The Problem: Write a set of programs to simulate the playing of a round: deal 12 cards and identify all the sets contained therein.

From: Jim Weigang <jimw@math.umass.edu>
Newsgroups: comp.lang.apl
Date: Wed, 10 Apr 1996 20:55:46 GMT
Subject: Identifying Sets

Roger Hui wrote on April 8:

The Game: Set is a game marketed by Set Enterprises Inc. In each round, 12 cards are dealt from a deck of 81 distinct cards. Each card is imprinted with one or more repetitions of a symbol with the following four features:

repetition: 1, 2, or 3
shape: diamond, oval, or squiggle
color: red, green, or blue

A "set" consists of three cards wherein each feature is either the same in all three cards or different in all three cards.

A card can be represented as a 4-by-3 bit matrix having a row for each feature and a column for each characteristic. Each row has exactly one bit turned on, indicating the characteristic for that feature. For display purposes, it's convenient to list the rows side by side:

```        1 0 0   0 1 0   0 0 1   0 1 0         - bits
1       oval    blue    outline       - labels
```

For three cards to be a valid "set", all the characteristics for a feature must be either identical or distinct. Here are the bits for three cards:

```        1 0 0   0 0 1   1 0 0   0 1 0         - card 1
0 0 1   0 0 1   1 0 0   0 0 1         - card 2
0 1 0   0 0 1   0 0 1   0 1 0         - card 3
OK      OK      No      No
```

The column sums for a 3-by-3 matrix shown above must be either all 1s, or 0s and 3 for the matrix to be OK. The maximum value of the sums suffices to identify a valid matrix: if it's 1 or 3, the matrix is OK. If all four matrices (features) are OK, the three cards are a set.

The coding is straightforward but involves multidimensional arrays, which can be intimidating if you don't know how to track the dimensions.

```     {del} Z{<-}HAND
[1]    @Deals 12 cards
[2]    Z{<-}{transpose}0 1 2{jot}.=3 3 3 3{represent}{neg}1+12?81
{del}
```

The result of HAND has dimensions [card;feature;characteristic]. (That is, it has a plane for each of the 12 cards, a row for each of the 4 features, and a column for each of the 3 characteristics.)

```     {del} Z{<-}SETS H;A
[1]    @Returns all the sets in hand {omega}
[2]    A{<-}H[3 COMBIN 1{take}{rho}H;;]
[3]    Z{<-}(^/({max}/+/[2]A){epsilon}1 3){slashbar}A
{del}
```

Variable A in SETS is all possible combinations of three cards "triples"). It has dimensions [combination;triple;feature;characteristic] and shape 220 3 4 3. Keeping these dimensions in mind allows you to understand the reductions and compression on line [3]. Using names instead of dimension indices, the statement could be written as:

```  ( ^/[feature] ({max}/[char.] +/[triple] Z) {epsilon} 1 3 ) /[comb.] Z
```

It may help to mentally execute the expression in the outer parentheses for the OK/No matrices above, in which the rows are [triple], the columns are [characteristic] and the four "hypercolumns" are [feature].

The result of SETS has the same dimensions as A, but the first element of the shape will be different.

```     {del} Z{<-}FMT A;C
[1]    @Formats cards for display
[2]    C{<-}'1' '2' '3' 'diamond' 'oval' 'squiggle' 'red' 'green' {+
[3]    Z{<-}({neg}1{drop}{rho}A){rho}(,A)/({rho},A){rho}C
{del}
```

Originally, I had the dimensions arranged differently, but when it came time to format the result, I rearranged them to simplify the formatting.

```     {del} R{<-}M COMBIN N;D;E;F;G;P
[1]    @All possible combinations of {alpha} items drawn from {iota}{+
+}{omega}
[2]    E{<-}({iota}P{<-}N-R{<-}M-1)-#IO
[3]    D{<-}R+{iota}P
[4]    R{<-}(P,1){rho}D
[5]    P{<-}P{rho}1
[6]   L1:{->}(#IO>1{take}D{<-}D-1)/0
[7]    P{<-}+\P
[8]    G{<-}+\{neg}1{drop}0,F{<-}{reverse}P
[9]    E{<-}F/E-G
[10]   R{<-}(F/D),R[E+{iota}{rho}E;]
[11]   E{<-}G
[12]   {->}L1
{del}
```

COMBIN is from Gary Bergquist's Zark APL Tutor Newsletter, 1995 Quarter 2. (It was included as part of the problem description for the Permutations puzzler, which was discussed in comp.lang.apl starting on June 5, 1995.) Although this may not be the fastest algorithm, the point is moot: If you want SETS to run fast, you should precompute 3 COMBIN 12 just once, store the result in a variable, and use that variable on line [2] of SETS.

Some examples:

```      #RL{<-}16807
{rho}H{<-}HAND
12 4 3
({enclose}[3]H) (FMT H)
0 0 1  0 1 0  0 0 1  0 1 0   3 oval     blue  outline
1 0 0  0 0 1  1 0 0  0 1 0   1 squiggle red   outline
0 1 0  0 1 0  0 0 1  1 0 0   2 oval     blue  solid
0 1 0  0 1 0  1 0 0  1 0 0   2 oval     red   solid
0 0 1  1 0 0  0 0 1  1 0 0   3 diamond  blue  solid
0 0 1  0 0 1  1 0 0  1 0 0   3 squiggle red   solid
1 0 0  0 0 1  0 0 1  1 0 0   1 squiggle blue  solid
1 0 0  0 0 1  0 1 0  0 0 1   1 squiggle green shaded
1 0 0  1 0 0  0 1 0  0 1 0   1 diamond  green outline
0 1 0  0 1 0  0 0 1  0 0 1   2 oval     blue  shaded
0 1 0  1 0 0  0 0 1  0 1 0   2 diamond  blue  outline
1 0 0  0 1 0  1 0 0  0 0 1   1 oval     red   shaded

FMT SETS H
1 squiggle red   outline
1 squiggle blue  solid

2 oval     blue  solid
3 diamond  blue  solid
1 squiggle blue  solid

3 squiggle red   solid
1 diamond  green outline

1 squiggle blue  solid
1 diamond  green outline

#RL{<-}12345
FMT SETS HAND
3 squiggle red   outline
2 diamond  green solid

3 squiggle red   solid
2 diamond  green solid
1 oval     blue  solid
```

Nice programming puzzle, but I'm not sure I'd want to play this game!

Jim

Postscript: After actually playing the game, I found that it's not nearly as agonizing as I thought it would be. My 7-year old is especially enthusiastic about the game.

From: Jim Weigang <jimw@chilton.com>
Newsgroups: comp.lang.apl
Date: Tue, 16 Apr 1996 03:32:49 GMT
Subject: Re: Identifying Sets

On April 11, Roger Hui obtained the following timings on his P90:

```        J: sets deal 0          0.0061 secs
J: Sett 12?81           0.0115
APL: SETS2 HAND         0.0109
```

(Where SETS2 is SETS modified to use a precomputed 3 COMBIN 12.) The Boolean card array used by the APL program slows down the indexing and +/[2] in SETS2 because of the extra work required to unpack and repack individual bits. A significant speedup can be obtained simply by converting the card array to integer type, which can be done by inserting "0+" to the right of {transpose} in the HAND function. Putting this in a HAND2 function and using my 386SX/16 to estimate the timing for Roger's P90 (56x faster!) yields:

```        APL: SETS2 HAND2        0.0075 secs
```

I also compared the speed of Bergquist's COMBIN function with nested and nonnested APL versions of Roger's comb function. It turns out that COMBIN is a very fast algorithm. One significant difference is that it avoids adding 1 repeatedly to the large result matrix. Instead, it stores the correct values in the result right away by initializing the short new-column vector (k in comb or D in COMBIN) to X+{iota}N and decrementing it on each loop iteration. A less appealing difference is that it shortcuts one iteration by building the first column of the result outside the loop. (But it introduces a couple of edge-condition bugs in doing so.) Here's a corrected and slightly faster variation of COMBIN:

```     {del} Z{<-}X COMB2 Y;C;J;K;N
[1]    @Returns all possible combinations of {alpha} items drawn from {+
+}{iota}{omega}
[2]    @ The result is origin-sensitive
[3]    N{<-}1+Y-X
[4]    K{<-}{divide}N  @ give domain error if X=Y+1
[5]    K{<-}(X-1)+{iota}N
[6]    Z{<-}(N,{signum}X){rho}K
[7]    R{<-}N{rho}1
[8]   L1:{->}(#IO>1{take}K{<-}K-1)/0
[9]    C{<-}{reverse}R{<-}+\R
[10]   J{<-}({iota}{rho}J)-J{<-}C/+\0,1{drop}C
[11]   Z{<-}(C/K),Z[J;]
[12]   {->}L1
{del}
```

The nested version of line [8] runs slower but is much less obtuse: J{<-}{enlist}(-C){take}{each}{enclose}{iota}1{take}{rho}Z.

Roger wrote:

Eugene's solution is more suitable for implementation in other APL dialects because it does not depend, as my solution does, on the availability of matrix e. matrix.

Although APL doesn't have e., several systems do have fastfns for character matrix lookup (e.g., MATIOTA or ROWFIND), and these can be used in place of e. if the card features are coded as characters instead of integers. Here's an APL version of Roger's program that uses this technique:

```{del}:  HANDS3{<-}'123'[1+{transpose}3 3 3 3{represent}{neg}1+{iota}81]

{del}:  COMB312{<-}3 COMB2 12

{del}:  UNIV3{<-}'123'[1{commabar}2{commabar}3{commabar}PERM 3]

{del} Z{<-}SETS3;A;H;M;T
[1]    @Test of Roger's algorithm in APL
[2]    H{<-}HANDS3[12?81;]     @ deal
[3]    A{<-}H[COMB312;]        @ [comb.;triple;feature]
[4]    T{<-}1 3 2{transpose}A  @ [comb.;feature;triple]
[5]    M{<-}880 3{rho}T        @ flatten to matrix [comb.,feature;triple]
[6]    M{<-}0{/=}UNIV3 MATIOTA M  @ like M e. UNIV3 in J
[7]    Z{<-}(^/220 4{rho}M){slashbar}A
{del}
```

This version uses the same precomputed arrays as Roger's program. While it runs faster than the J version, this is probably because the character arrays are a quarter the size of their integer counterparts.

```        APL: SETS3              0.0040 secs
```

Using general expressions in place of the constants on lines [5] and [7] makes the function uglier but does not significantly alter the execution time.

Jim

From: Jim Weigang <jimw@chilton.com>
Newsgroups: comp.lang.apl
Date: Sun, 21 Apr 1996 15:23:39 GMT
Subject: Re: Identifying Sets

Roger Hui wrote on April 19:

Regarding additional timings on the "set" programs provided by Jim Weigang on Tuesday, April 16. I entered the APL function SETS3 on my machine, using ROWFIND instead of MATIOTA (I can not find MATIOTA) and timed it.

ROWFIND in +III and MATIOTA are essentially identical. (The latest version of MATIOTA can be found in the APLASCII workspace as {delta_}MATIOTA.) After the first execution of ROWFIND there is no discernable difference in speed between the two functions, so Roger's SET3X timings are what the SET3 timings should be on his machine. My accurate estimate of the SETS2 HAND2 timing was sheer luck, the result of hardware and software differences cancelling out, something that did not happen with the inaccurate SET3 estimate. For what it's worth, here are a few more timings made back at home on my 486/66:

```                       +II/386      +III/Win

SETS2 HAND      0.0306        0.0235   secs
SETS2 HAND2     0.0200        0.0157
SETS3           0.0136        0.0135

MATIOTA         0.0066        0.0070
```

This benchmark once again highlights the difference between the older +II/386 and newer +III/Windows systems. Although both interpreters are based on nearly identical source code, +III is compiled using a Microsoft C compiler, which apparently produces better machine code than the MetaWare High-C compiler used for +II. The MATIOTA timing is for the lookup involved in SETS3. Since the MATIOTA machine code is identical on +II and +III, the slight performance difference is presumably overhead imposed by Windows.

Roger continued:

I also applied the idea to the J program, using characters instead of integers, to derive "setc", and timed _that_.

```J:   sets deal 0     0.0061   integer vector
J:   setc dealc 0    0.0064   character vector
```

I'm surprised that the character version takes longer to run even though the arrays are smaller.

Jim