## Implementing Rubik's Cube: An Exercise in Hybrid Programming

by Jim Weigang

### Abstract

This paper discusses the implementation of Rubik's Cube in a hybrid programming environment using APL and assembly language. First, an implementation is developed using multidimensional APL programming techniques. This version is appealing because the full range of possible twists can be implemented with a short program. Next, an effort is made to speed up the program, and a simpler table-driven algorithm is devised. The table is not entered by hand, but is generated by the original APL program. A prototype of the table-driven algorithm is written in APL, then translated to assembly language. The resulting program runs up to 72 times faster than the multidimensional APL program. Combined with an APL main routine, the result is an easily-constructed, mostly-APL application that runs fast enough for practical use.

### The Cube

Rubik's Cube  consists of a 3-by-3-by-3 cubic arrangement of smaller cubes (or "cubies"). The sides (or "faces") of the cubies are colored, with a different color on each outward-pointing face. Each 3-by-3 matrix (or "slab") of cubies can be independently rotated (or "twisted") on an axis perpendicular to the slab and passing through the center of the slab.

To discuss twisting the cube, it is necessary to adopt a coordinate system and a notation for the various twists. The conventions used in this paper are taken from a previous paper , and are summarized below:

(still need to scan that drawing...)

Ninety-degree twists in the direction of the arrows are denoted by the indicated upper-case letter; twists in the opposite direction are denoted by the corresponding lower-caset (or underscored) letter.

### External Design

In a computer implementation, the cube must be displayed two-dimensionally. One simple way of doing this is by unwrapping the cube as follows:

```       +------+
|      |
| top  |
|      |
+------+------+------+------+
|      |      |      |      |
| left |front |right | back |
|      |      |      |      |
+------+------+------+------+
|      |
|bottom|
|      |
+------+
```

For black-and-white display, a coding scheme must be adopted to represent the colors. In this paper the colors blue, green, orange, red, white, and yellow are each denoted by the first letter of their name.

One can imagine a program that displays a cube in this fashion, then accepts a sequence of twist commands from the user, performs the twists, redisplays the cube, and repeats. For instance, the program might be used like this:

```      PLAY      -- user starts the program

GGG
GGG         -- program displays the
GGG            initial cube
OOO BBB WWW RRR
OOO BBB WWW RRR
OOO BBB WWW RRR
YYY
YYY
YYY

COMMAND?  Fb    -- user enters a
twist command
OOO
GGG         -- program twists cube,
WWW            then redisplays
YOG BBB YWG RRR
YOG BBB YWG RRR
YOG BBB YWG RRR
OOO
YYY
WWW

COMMAND?
```

There are many different ways of structuring PLAY, and the "teletype" approach used above is only one simple possiblity. But however PLAY is written, certain subroutines will be useful. For instance, it will be convenient to have a function which takes a cube and a list of twist codes, performs the twists, and returns the twisted cube:

```   twisted_cube {<-} cube AFTER twist_codes
```

Also useful will be a function that produces the display representation of a cube:

```   character_matrix {<-} SHOW cube
```

And finally (or rather, first of all), a function is needed to generate an initial, "solved" cube:

```   solved_cube {<-} NEWCUBE
```

The syntax of these functions leads to neat, easily-typed expressions when variables are used to hold twist sequences, as in:

```      SNAP{<-}'fMFm'
LIFT{<-}'CDcd'
SHOW NEWCUBE AFTER SNAP,LIFT,SNAP
GGG
GGG
GWG
OOO BBB WWW RRR
OOW RBY BWO BRR
ORO BWB WYW RBR
YYY
YYO
YGY
```

### Internal Data Representation

The display form of the cube need not be the same as the form used internally to represent the cube. The display form is chosen for visual appeal; the internal representation should be chosen to make the programs easy to implement.

Think of the physical cube itself--it is clearly a 3-by-3-by-3 array of cubies. Each of the cubies has three or fewer colored faces, each of which points in a different direction. An individual cubie can thus be represented by a three-element vector giving the colors of the faces pointing in the X, Y, and Z directions. Some fill value can be used to represent uncolored (inward-pointing) faces.

The cube as a whole might be represented by a 3-by-3-by-3 nested array of 3-element vectors, or more simply by a four-dimensional non-nested array with shape 3 3 3 3 and dimensions [Z;Y;X;face_direction]. (The ordering of the dimensions is arbitrary; this order simplifies SHOW.) Let the levels of the face_direction dimension be ordered X, Y, Z.

[Four-dimensional arrays may intimidate some readers, but shouldn't. Viewing two- or three-element vectors as geometrical objects ("pointing vectors") makes it difficult to think about four- or five- element vectors. Instead, a vector is viewed as a list of elements. Similarly, viewing arrays as spatial arrangements makes it hard to think about four- or five-dimensional arrays. An N-dimensional array should be viewed instead as a list of elements classified N ways (by N "grouping variables"). In the example at hand, the four-dimensional array representing the cube is a list of face colors classified four ways:

```               Face     |  Face
Z  Y  X  Direction  |  Color
---------------------+-----------
1  1  1     X       |  Orange
1  1  1     Y       |  Red
1  1  1     Z       |  Green
1  1  2     X       |  (hidden)
1  1  2     Y       |  Red
1  1  2     Z       |  Green
1  1  3     X       |  White
1  1  3     Y       |  Red
1  1  3     Z       |  Green
1  2  1     X       |  Orange
.  .  .     .       |    .
.  .  .     .       |    .
.  .  .     .       |    .
3  3  3     Z       |  Yellow
```

For any array A, the grouping variables are given by the expression 1+{transpose}({rho}A){represent}{neg}1+{iota}{times}/{rho}A.]

An individual side of the cube can be referenced or set using indexing. Let C be a shape 3 3 3 3 array with dimensions [Z;Y;X;face_direction] representing a cube. In the grouping-variable model, indexing C with scalars selects rows meeting some criterion and removes the columns corresponding to the indexed dimensions from the result. For example, C[;3;;2] selects elements of C having Y=3 and face_direction=2 (Y). The result is a shape 3 3 array with dimensions [Z;X] holding the front side of the cube. The back side is given by C[;1;;2]. The left and right sides, with dimensions [Z;Y], are given by C[;;1;1] and C[;;3;1]; the top and bottom sides, with dimensions [Y;X], by C[1;;;3] and C[3;;;3].

The NEWCUBE function can now be defined as:

```     {del}Z{<-}NEWCUBE
   @Returns a "solved" cube
   Z{<-}3 3 3 3{rho}' ' @ [X;Y;Z;face_direction]
   Z[;;1;1]{<-}'O'
   Z[;;3;1]{<-}'W'   @ set each side
   Z[;1;;2]{<-}'R'
   Z[;3;;2]{<-}'B'
   Z[1;;;3]{<-}'G'
   Z[3;;;3]{<-}'Y'
{del}
```

The SHOW function need do little more than index the sides and join them together, but care must be taken to orient each side correctly. For instance, the right side of the cube as given by C[;;3;1] is oriented with the origin in the upper left corner. For display, each row must be reversed to put the origin in the upper right corner.

```     {del}Z{<-}SHOW C;B
   @Returns display form of cube
   B{<-}3 3{rho}' ' @ 3-by-3 matrix of blanks
   Z{<-}B,' ',C[1;;;3],' ',B,' ',B
   Z{<-}Z,C[;;1;1],' ',C[;3;;2],' ',({reverse}C[;;3;1]),' ',({reverse}C[;1;;2])
   Z{<-}Z, B,' ',({reversebar}C[3;;;3]),' ',B,' ',B
{del}
```

### Twisting the Cube

Each twist involves rotating a slab of cubies 90 degrees. If the slab is represented by a 3-by-3 nested matrix, the rotation can be accomplished with {reverse}{transpose} or {transpose}{reversebar} (clockwise rotation), and with {transpose}{reverse} or {reversebar}{transpose} (counterclockwise rotation). A third form can be used, {transpose}{reverse}[I], where I=1 selects clockwise and I=2 counterclockwise rotation. This last form makes it clear that a matrix is rotated 90 degrees by reversing along one of the dimensions, then transposing to exchange the dimensions.

In the non-nested representation, a slab has an extra dimension for the face_direction axis. For example, the right slab C[;;3;] is represented by a 3-by-3-by-3 array with dimensions [Z;Y;face_direction]. Such an array can be reversed along the Z or Y axes using {reverse}[I] in the same manner as a matrix. The Z and Y axes in a three-dimensional slab S are exchanged with 2 1 3{transpose}S.

[Readers unfamiliar with dyadic transpose should know that it reorders the dimensions of an array. The elements of the left argument correspond positionally with the dimensions of the right argument (think of the left argument as being written above the dimensions). The values in the left argument tell where to put the corresponding dimension in the result.

```        .----------------------------------.
.---|-------------------------------.  |
| .-|-----------------------------. |  |
| | |                             | |  |
V V V                             2 1  3
[Y;Z;face] {<-} 2 1 3 {transpose} [Z;Y;face]
1 2 3```

In the grouping-variable model of an array, dyadic transpose rearranges the order of the columns, then sorts the rows to keep the grouping variables in odometer order.]

Rotating a slab 90 degrees changes the direction in which some of the cubie faces point. An R- or r-twist turns faces pointing in the Y direction to point in the Z direction, and vice versa. The Y and Z face directions in the right slab of the cube are exchanged with C[;;3;]{<-}C[;;3;1 3 2]. Generally, after any twist, it is necessary to exchange the two face directions not being rotated around.

Putting these operations together, cube C is R-twisted with:

```   C[;;3;]{<-} 2 1 3{transpose} {reverse} C[;;3;1 3 2]
```

Because of the syntactic (as opposed to functional) nature of bracket-semicolon indexing, three statements are required to generate all possible twists, one for each axis:

```   X: C[;;J;]{<-} 2 1 3{transpose} {reverse}[K] C[;;J;1 3 2]
Y: C[;J;;]{<-} 2 1 3{transpose} {reverse}[K] C[;J;;3 2 1]
Z: C[J;;;]{<-} 2 1 3{transpose} {reverse}[K] C[J;;;2 1 3]
```

Parameter J is either 1, 2, or 3, and selects the layer along the axis to rotate. Parameter K is either 1 or 2, and selects clockwise or counterclockwise rotation.

The AFTER function can now be written as follows:

```     CODES{<-}'LlCcRrbBhHfFUuMmDd'

{del}Z{<-}C AFTER S;N;I;J;K
   @Applies twist sequence S to cube C
   S{<-}CODES{iota},S          @ convert to integers
   S{<-}(S{<=}18)/S            @ ignore invalid codes
   N{<-}1+3 3 2{represent}S-1  @ axis{commabar}layer{commabar}direction
   I{<-}0 & Z{<-}C
  L0:{->}(({rho}S)<I{<-}I+1)/0 @ For each twist,
   J{<-}N[2;I] & K{<-}N[3;I]
   {->}(L1,L2,L3)[N[1;I]]
  L1:Z[;;J;]{<-}2 1 3{transpose}{reverse}[K]Z[;;J; 1 3 2]
  {->}L0                      @   X-rotation
 L2:Z[;J;;]{<-}2 1 3{transpose}{reverse}[K]Z[;J;; 3 2 1]
  {->}L0                      @   Y-rotation
 L3:Z[J;;;]{<-}2 1 3{transpose}{reverse}[K]Z[J;;; 2 1 3]
  {->}L0                      @   Z-rotation
{del}
```

Global variable CODES is the ravel of the shape 3 3 2 array with dimensions [axis;layer;direction] holding the character code for each possible twist. The array indices of a character code (computed on line  of AFTER) give the axis, layer, and direction parameters of the corresponding twist.

### Putting It All Together

The PLAY function is now simple to write:

```     {del}PLAY;S
   @Twists cube as directed by user
   {execute}(0=#NC'CUBE')/'CUBE{<-}NEWCUBE'
   #{<-}SHOW CUBE          @ display cube
  L1:{quotequad}{<-}S{<-}'COMMAND?  '         @ prompt user
   {execute}(','=1{take}S)/'S{<-}{execute}S'  @ , prefixes exprs
   {execute}('*'{epsilon}S)/'CUBE{<-}NEWCUBE' @ * resets
   {->}(0={rho}S)/0        @ null line exits
   CUBE{<-}CUBE AFTER S    @ twist the cube
  #{<-}SHOW CUBE          @ redisplay
  {->}L1                  @ repeat
{del}
```

[Lines  and  implement a couple of handy features. A command that begins with a comma causes is evaluated as an APL expression, allowing the use of twist sequences stored in variables (e.g., ",SNAP,LIFT,SNAP"). If a twist sequence includes an asterisk, the cube is restored to "solved" order before applying the twists.]

Readers with access to an APL system are encouraged to type in the programs and try them out. If you are using a microcomputer, you may notice that for long twist sequences, PLAY leaves something to be desired in the way of execution speed. Using STSC's APL*PLUS  PC system on an IBM PC (a 4.77 MHz 8088!), it takes about 15 seconds to perform 100 twists. (Admittedly, that's a lot of twists, but you might use that many to scramble the cube.)

### Efficiency

Using a monitoring facility that reports the amount of time spent on each line of a function, it was determined that AFTER and SHOW were the CPU bottlenecks. When using PLAY to perform 1, 18, and 100 twists per input, AFTER and SHOW together consume 39, 77, and 95% of the CPU time in the loop. (These percentages would be even greater if the time to display the results were excluded.)

Can AFTER and SHOW be rewritten to run faster? SHOW could be sped up (eliminated, in fact) by making the internal form equal to the display form. AFTER can be written to deal with the display form, since the data rearrangement going from D to E in

```      D{<-}SHOW NEWCUBE
E{<-}SHOW NEWCUBE AFTER 'R'
```

can also be carried out directly on the display form D with

```      T{<-},D
T[TO]{<-}T[FROM]
E{<-}({rho}D){rho}T
```

where FROM is the indices of cubie faces that move, and TO is the locations the faces move to. This should be a more efficient way of twisting the cube, since (excluding the ravel and reshape) it processes only those faces that move and uses very simple operations.

The FROM and TO vectors could be entered by hand, but this would be a painful and error-prone process. Fortunately, there is an easier way. With a little cleverness, the multidimensional version of AFTER can be used to automatically generate the required vectors. The trick is to use a dummy cube of distinct values so that individual faces can be tracked.

```   C{<-}3 3 3 3{rho}(#AV{/=}' ')/#AV @ dummy cube
B{<-},SHOW C                      @ initial display
A{<-},SHOW C AFTER 'R'            @ display after twist
FROM{<-}(A{/=}B)/{iota}{rho}B     @ indices of moved faces
TO{<-}A{iota}B[FROM]              @ where faces end up
```

Twisting an edge slab moves 20 cubie faces, while twisting a center slab moves only 12 faces. In forming an array of FROM and TO vectors, it would be convenient for all the vectors to have the same length. This can be done by repeating the last element of each vector as fill. Using this technique, the FROM and TO vectors for all 18 possible twists can be stored in a shape 18 2 20 array with dimensions [twist;from_to;face]. The following function generates such an array:

```     {del}GENAFTRT;V;C;B;K;A;F;Z
   @Generates face relocation table
   B{<-},SHOW C{<-}3 3 3 3{rho}(#AV{/=}' ')/#AV
   Z{<-}18 2 20{rho}K{<-}0
  L1:{->}(18<K{<-}K+1)/L2         @ For each twist,
   A{<-},SHOW C AFTER CODES[K]
   F{<-}(A{/=}B)/{iota}{rho}B           @ FROM-vector
   F{<-}F,(20-{rho}F){rho}{neg}1{take}F @ pad to 20 elems
   Z[K;1;]{<-}F
   Z[K;2;]{<-}A{iota}B[F]               @ TO-vector
  {->}L1
 L2:AFTRT{<-}Z                   @ set global result
{del}
```

Adopting the display form as the internal form (which eliminates SHOW), NEWCUBE and AFTER can be rewritten as follows:

```     NEWCUBE2{<-}SHOW NEWCUBE

{del}Z{<-}C AFTER2 S;N;K;FROM;TO
   @Applies twists S to cube C
   N{<-}CODES{iota},S & N{<-}(N{<=}18)/N
   K{<-}0 & Z{<-},C
  L1:{->}(({rho}N)<K{<-}K+1)/L2 @ For each twist,
   FROM{<-}AFTRT[N[K];1;]
   TO{<-}AFTRT[N[K];2;]
   Z[TO]{<-}Z[FROM]             @    move faces
   {->}L1
{del}
```

Global variable AFTRT is generated by the GENAFTRT function.

When benchmarked on an IBM PC, AFTER2 is faster than AFTER, but not by much (1.4 to 1.9 times faster). Why isn't the speedup greater? The reason is that in the loops of both programs the average array size is only 7.2 elements. When processing small arrays, APL spends more time interpreting the program and dynamically allocating memory than it spends actually processing elements. On most APL systems, the overhead for each primitive operation is on the order of 500 machine instructions. Because of this, when processing small arrays, execution time is determined more by the number of primitive operations, not by the complexity of the operations. AFTER uses 17 operations per iteration; AFTER2 uses 12. The operation ratio of 1.4 is close to the observed speedup ratio.

### Compiling AFTER

The overhead of interpretation and memory management can be eliminated by writing AFTER2 in a compiled language. Compiled programs are translated once to machine instructions and thereafter run without the need for interpretation. Memory is allocated using explicit instructions which can be optimized by the programmer. On the APL*PLUS PC system, the most accessible compiled language is assembly language.

Writing an assembly language program is greatly facilitated by writing an APL prototype beforehand. A properly constructed prototype can weed out many bugs and is easy to translate to assembly language. Finding bugs in the prototype stage is important because when an assembly language program runs amok, it may crash the system; APL programs never crash the system. An assembler prototype should use only operations which correspond to machine instructions, and should run with #IO set to zero. Native machine instructions include assignment, conditional and unconditional branching, integer arithmetic (+ - {signum} {divide}), and indexing. Arrays of any rank are indexed as vectors (i.e., with a single subscript). For more details, see reference . The program below implements the same algorithm as AFTER2, but using only machine-level operations.

```     {del}Z{<-}C AFTER3P S;#IO;RT;TMP;K;N;T;BI;I;BJ
   @Assembler prototype; twists on cube
   @ C as directed by twist codes S
   #IO{<-}0                            @ run in index origin zero
   S{<-},S & RT{<-},AFTRT              @ treat as vectors
   Z{<-},C & TMP{<-}20{rho}' '         @ allocate memory
   K{<-}{neg}1 & N{<-}{rho}S & {->}L6  @ For each twist,
   @
   @ Find FROM base for this twist
  L1:T{<-}#AV{iota}S[K] & BI{<-}AFTDI[T]
  {->}(BI<0)/L6                       @ skip invalid codes
  @
  @ Copy faces that move to TMP
  I{<-}20 & {->}L3
 L2:T{<-}RT[BI+I]-1 & TMP[I]{<-}Z[T]
 L3:I{<-}I-1 & {->}(I{>=}0)/L2
  @
  @ Copy TMP to result
  BJ{<-}BI+20                         @ base of TO vector
  I{<-}20 & {->}L5
 L4:T{<-}RT[BJ+I]-1 & Z[T]{<-}TMP[I]
 L5:I{<-}I-1 & {->}(I{>=}0)/L4
  @
  @ Go on to the next twist code
 L6:K{<-}K+1 & {->}(K<N)/L1
  Z{<-}({rho}C){rho}Z                 @ result is really a matrix
{del}
```

Global variable AFTDI is a constant obtained by:

```   AFTDI{<-}256{rho}{neg}1
AFTDI[#AV{iota}CODES]{<-}40{times}({iota}18)-#IO
```

Indexing through AFTDI takes the place of dyadic iota ({iota}) in AFTER2. The AFTDI element corresponding to 'L' (AFTDI[#AV{iota}'L']) gives the starting index in AFTRT of the FROM vector for an L- transformation.

The program above was translated to 8088 assembly language for the APL*PLUS PC system. Below is a listing of the APL cover function and object code (which is a numeric representation of the machine language program). The source code can be found in reference .

```     {del}Z{<-}C AFTER3 S;T
   @Assembler version
   T{<-}(#STPTR'Z C S AFTRT AFTDI')#CALL AFTER{delta}OBJ
{del}

AFTER{delta}OBJ
8939 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
{neg}30418 4110 11776 5769 18 {neg}30418 5166
11776 7817 22 {neg}29650 3102 {neg}29184 9735
3723 8 {neg}29914 2582 {neg}29952 9944 1697
{neg}13056 11912 1676 14 {neg}29906 5662 {neg}29184
{neg}29921 {neg}3081 {neg}31836 3055 {neg}30418 6206
11776 7822 12 {neg}29906 5150 {neg}29184 11783
1676 20 {neg}29906 4638 {neg}29184 11783 1676 18
{neg}29906 4126 {neg}29184 11783 1676 16 {neg}24538 7
572 882 30955 15504 29696 {neg}18166 {neg}1
{neg}29914 2070 {neg}5376 {neg}18170 {neg}3 {neg}70 {neg}5121
11867 1678 16 {neg}5749 {neg}9421 {neg}30170 2654
{neg}29138 5126 768 9947 28555 {neg}31990 253
15996 {neg}4861 {neg}29138 4614 11776 7822 14
5051 {neg}32000 3781 {neg}29914 118 17733 17546
11787 {neg}30840 26 32075 {neg}17425 19 {neg}29914
118 15150 6198 29440 17683 11845 {neg}30838
26 17544 19211 {neg}6019 15169 31946 {neg}13408
205 8510 1301 1727

({rho}AFTER{delta}OBJ),+/AFTER{delta}OBJ
142 {neg}322747     @ length and checksum
```

This program should work on IBM PCs and most compatible computers, but will, of course, run only on the 8088-based APL*PLUS PC system. (This disadvantage could be rectified somewhat by writing the program in a higher-level compiled language.) After typing in the object code (being sure to verify the length and checksum), save the workspace before attempting to run the program. A typo in the object code, incorrect variable, or incompatibility may cause the system to crash.

Execution times for the three versions of AFTER are summarized below (times are in CPU seconds on an IBM PC running APL*PLUS PC version 4.1 with 280K workspace available).

```  Version |     Number of Twists     Twists
of |                            Per
AFTER |     1      18      100   Second
----------+---------------------------------
|
array APL |  .195   2.275   12.305       8
|
table APL |  .135   1.375    7.360      14
|
table ASM |  .059    .079     .169     900
|
```

The last column is the marginal number of twists per second given by (N-1){divide}(TN-T1), where T1 is the time to perform 1 twist and TN the time to perform N twists. When performing sequences of 1, 18, and 100 twists, the AFTER3 function is 3.3, 29, and 72 times faster than AFTER. Counting all the time in the loop (including time to display the result), a version of PLAY that uses AFTER3 is 1.3, 3, and 13 times faster than one that uses AFTER.

### Hybrid Programming

Let's recap the steps taken to create the hybrid application. First, it was implemented in APL, taking full advantage of the language's powerful features. To speed up the program, an execution time monitoring facility was used to find the slowest routines, and those programs were optimized. More efficient algorithms were developed, prototyped in APL, and finally implemented in a compiled language.

Hybrid programming would be impossible without features built into the APL system that allow machine language programs to be executed. A mechanism such as #CALL on the APL*PLUS PC system is needed to be able to execute user-specified machine code, passing arguments and getting back results. The internal format of variables must be documented. There must be a way to locate where an array starts in memory, create new arrays, and signal errors. To find CPU bottlenecks, a monitoring facility is required; it is best if this is built into the APL system. The effort required to implement and document these features is labor well invested.

Largely because of security considerations, these features are often omitted from mainframe APL systems (or provided in a different form; see note ).

Hybrid programming environments make programmers more productive, expand the range of practically solvable problems, and improve the chances that application developers will choose APL as an implementation language. Hybrid environments allow programmers to take advantage of the capabilities of more than one language. APL is useful for writing programs fast; assembler is useful for writing programs that run fast. By exploiting the strengths of more than one language, a programmer can minimize the effort required to create a fast, flexible application. The final result can retain the advantages of the interpreted APL environment while enjoying the execution speed of compiled programs.

Hybrid programming environments can minimize the programmer effort required to create an acceptably fast application. The noncritical portions of the application can be written in a language that makes efficient use of the programmer's time, and extra time can be invested in writing only the CPU bottlenecks in a lower level language.

A hybrid application can be more flexible and easier to extend than a comparable application written entirely in a low-level language. This is especially important because so many applications start off as ill-defined problems and become concrete only when a working prototype is available.

Further developments in hybrid environments may enable APL users to access libraries of software written in other languages, saving enormous amounts of programming time. At present, writing an APL application often involves rewriting tools readily available in other languages. Hybrid programming may insure the future viability of APL by making it possible to enjoy the luxuries of APL without having to sacrifice the superior capabilities of machine-oriented languages.

Finally, hybrid environments can make high-level languages practical. Without the ability to write subroutines in a low-level language, using a high-level language may mean sacrificing performance and fundamental machine capabilities. Hybrid environments may insure the future viability of APL by making it possible to enjoy the luxuries of APL without having to sacrifice the superior capabilities of machine-oriented languages.

### Notes and References

 "Rubik's Cube" is a trademark of Ideal Toy Corporation.

 Peelle, H. A. "Representing Rubik's Cube In APL", Proc. APL 84 Conference, Helsinki, Finland, (June 1984), 255-262.

 "APL*PLUS" is a trademark of STSC, Inc.

 Weigang, J. "Hybrid Programming and Rubik's Cube", Computer Language, 3:1 (January 1986), 27-36.

 On some APL systems "auxiliary processors" (AP's) serve in place of #CALL. However, as typically implemented, AP's have many disadvantages which impede the construction of hybrid applications, including:

• No dynamic startup or shutdown. AP's must be started on entry to APL and the memory they use cannot be reclaimed without leaving APL. Before you enter APL, you must think of every AP you might want to use.

• Communication with an AP is through a shared variable buffer whose size is fixed on entry to APL. Before you enter APL, you must decide the size of the largest array you might need to pass to an AP.

• Establishing a share is a time- consuming operation; it is undesirable to have to share a variable each time a function is called.

• A limited number of variables may be shared simultaneously.

• Using shared variables to implement polyadic functions is clumsy, requiring the AP to be in different states to accept the different arguments.

 Weigang, J. "An Introduction To STSC's APL Compiler", Proc. APL 85 Conference, Seattle, Washington, (May 1985), 231-238.

 Wiedmann, C. "Efficiency In the APL Environment", Proc. APL 85 Conference, Seattle, Washington, (May 1985), 77-85.

### Problems

1. Write a function that returns N random twist codes. For example, RANDOM 5 returns a five-element vector, and

```       C AFTER RANDOM 5
```

applies five random twists to a cube.

2. Write a function that computes the inverse of a twist sequence. For any twist sequence S and cube C,

```       C AFTER S,INVERSE S
```

should be equivalent to C.

3. Write a screen-oriented version of PLAY. Retain all the twist codes entered by the user and provide a facility for stepping back and forth through previous moves.

4. Implement an alternate way of displaying the cube that reveals the cubie structure:

```               G
G G  G G G  G G  G
RO O OB B BW W WR R
RO O OB B BW W WR R
RO O OB B BW W WR R
Y Y  Y Y Y  Y Y  Y
Y
```

Such a display makes it easier to identify particular cubies (they now have distinct names such as GOR, GO, or G) and see how they are oriented. The AFTER3 function can be used with this format by changing variable AFTRT to reflect the new face positions.

5. As it is written, the AFTER3 function first copies all faces that move to a temporary array, then copies them to the result. Figure out how to improve the program so that it picks up the first face, picks up what is at the destination of the first face, puts the first face where it belongs, and continues, using only two registers for temporary storage. Write an APL prototype and time it against AFTER3P to get an idea of how much faster the assembler version would run.

6. Write a program SOLVE that, given an arbitrary scrambled cube, returns a twist sequence that restores the cube to solved form. Does there seem to be a limit to

```       {rho}SOLVE NEWCUBE AFTER RANDOM N
```

for large values of N?

The author would like to express his gratitude to Kit Bigler, Murray Eisenberg, Phil Evans, and H. A. Peelle for assistance in preparing this article.