Multipliers
by Nyles Heise



History of WireWorld Binary Multipliers

Brian Trial's

As far as I know, the first binary multiplier was designed by Brian Trial. I appeared on MathPuzzle on August 6, 2002. It is an 8-bit by 8-bit multiplier that takes about 18000 ticks to produce the product. Although quite large, it easily fits into an MCell file.

A breakthrough in WireWorld design!

Nick Gardner's

An improvement to Brian's design was also posted on MathPuzzle, only a few days later on August 18, 2002. (Nick must have had a few nights with very little sleep.) His design is also an 8-bit by 8-bit version with 16 specific flip-flops to hold the product. A full adder is included with each flip-flop. The cycle time of the logic is 15 ticks. So it takes about 15 n2 ticks to perform an n-bit by n-bit multiply. The 8-bit multiplier takes a little over 1000 ticks.

Quite amazing to watch!

Karl Scherer's

Karl's design has a single full adder and no explicit flip-flops. Instead there are loops of WireWorld cells that allow electrons to pass through random logic, signals being separated by 6 cells in each loop. There are 5 loops - one for the clock, one for each multiplier, one for the product, and one for a reset to a gating latch. Karl gives a 6-bit by 6-bit version that fits into the very small area of 49x49 cells. He did his design work in Zillions of Games. With n passes through the full adder, a product of 2*n bits, and 6-tick logic, the operation take about 12 n2 ticks. About a 20% improvement in time over Nick's design.

A quantum improvement in multipliers!

With Karl's permission, I'm showing his multiplier here, but in MCell format, to keep consistency with the rest of the pictures. His design was done in Zillions, and is presented at his web site using that format.

Karl Scherer's 6-bit, 6-tick Binary Multiplier


Mine

My designs are offshoots of Karl's. There is a single full adder and no explicit flip-flops for the data or clock. Two latches do exist, one for the carry signal and one for gating.

The main differences between my design and Karl's are the following

  1. 4-tick logic instead of 6
  2. Eliminating the reset loop
  3. Decreasing the length of one of the multiply loops by a factor of 2
  4. Minimizing the full adder logic
  5. Using a fill empty ring counter for the clock when n is large

Using 4-tick logic means the operation takes about 8 n2 cycles, about 1/3 faster than Karl's 6-tick design. The logic optimization also makes for a very compact machine.

Tiny and Fast!

Heise's 4-tick Multipliers

Description of the Logic

There are 6 components in each of the multipliers

  1. Clock, bottom right
  2. Multiplicand A, middle right. High order bit enters the multiplier first. First bit is actually outside the loop.
  3. Multiplier B, top right. Low order bit first.
  4. Product C, left. In expanded versions the loop is extended to better show the final result. Low order bit is very bottom right. Two bits per row.
  5. Full Adder, right of product loop
  6. Gate, right of full adder - a latch that allows B to FA when corresponding bit of A is a 1

Amazingly, the full adder generates a new partial product bit every 4 ticks. There are two inputs to the FA, PRODi and MULTBi. The FA also needs the carry bit, CARRYi-1, with is held in an RS-Latch within FA. The output of FA is the next partial product bit, SUMi. It must also update the CARRY latch. The equations are

SUMi = (PRODi xor MULTBi) xor CARRYi-1
CARRYset = (~(PRODi xor MULTBi) ^ ((PRODi xor MULTBi) + MULTBi) = PRODi ^ MULTBi
CARRYreset = ~((PRODi xor MULTBi ) + MULTBi) = (~ PRODi) ^ (~MULTBi)

These equations may appear clumsy, however, they obtain the desired result without the need for a cross element. The full adder was designed to minimized the area of the entire multiplier, not the area of the full adder. After many tries, these equations gave an implementation that allows the carry latch to be most advantagously placed, so as to allow the final sum bit to exit to the east.

A full adder is given at the end of this page that was designed with the criterion of minimum area.

Five versions of the multiplier are given: 5-bit, 6-bit, 8-bit, 16-bit, 32-bit, and 64-bit. Each one multiplies two n-bit numbers (A and B) giving a 2*n-bit product (C).

The 5-bit multiplier has a 40-cell product loop. I could not come up with a design whose product loop was 32, the next acceptable shorter loop length. This means a 5-bit by 5-bit multiply can be done in about 8 * 52 = ticks. (The actual number is 232, which includes the time for the first bit to propagate through the FA). A 4-bit by 4-bit multiply would take about the same time.

Larger multipliers are obtained by merely increasing the loop lengths. I did a 6-bit to compare to Karl's, an 8-bit to compare to Brian's and Nick's. I originally set out to get a 64-bit version, so just did all powers of two from 8 through 64.

The 32-bit version is the one that is posted on MathPuzzle.

There is one difference beginning at the 32-bit version. A fill-empty ring counter is used for the clock instead of a simple loop. The ring counter is about half the length of a simple loop, however, some logic is required to make it fill-empty and to decode the clock. In the 16-bit version and smaller the extra logic is greater than the savings of halving the length.

WireWorld Layouts

5-bit Multiplier


The MCell Source Code. You'll need Mirek's Cellabration to run it.



6-bit Multiplier


The MCell Source Code.



8-bit Multiplier


The MCell Source Code.



16-bit Multiplier


The MCell Source Code.



32-bit Multipliers



The MCell Source Code for output mostly ones and input ones..



64-bit Multiplier


The MCell Source Code.



Length, Area, and Time Parameters

nLoop Lengths AreaOp Time
Clock MultA MultB Product
536163640 22x38232
644204448 22x38316
860286064 23x41532
1612460124128 23x622036
32128124252256 23x928116
64256252508512 23x15732564

The length of the product loop is 8*n. The length of the MultB loop is 4 less than the product loop. The length of the MultA loop is half of 4 less than the MultB loop. The period of the clock is the same as the length of the MultB loop. So for small n its loop length is the same as MultB, but for large n, when a fill-empty ring is used, it's half of the product loop. The total time for the multiply operation is given by

OpTime = 8 * n2 - 4 * n + 52 This is actually less than 8 * n2 for n greater than 13.

A Minimum-Area, 4-tick Full Adder

The full adders in the above multipliers were designed to minimize the area of the multipliers, not the full adders. The fa given here is the minimum-area 4-tick design I came up with. It uses basically the equations of David Moore and Mark Owens' multipliers, but with 4-tick logic elements. Complex logic gates are used throughout.

The transit time is 21 ticks and it resides in a 14 by 16 rectangle, not including the top wire that is used only to aid viewing. Suprising, perhaps, this area is less than the those of the 6-tick versions found on net.

It is set up to do binary 1001011 plus 1001110 equals 10011001.

Fa's are also found on Karl's web site.

4-Tick Full Adder


The MCell Source Code.