Tuesday, March 30, 2010

Matrix Multiplication example ported to arrayForth

I'm going to put up a better description later; right now I just want to get this up on the web. It works! (It took a while.) The source HTML was generated by the built-in arrayForth HTML generator.

To run the code put the following at the top of block 1302:

0 node 200 load 1 node 204 load 2 node 206 load 3 node 210 load exit

Note that the multicore matrix example is in blocks 200 to 208 and uses nodes 0, 1 and 2. Block 210 does the same thing on a single core (node 3). The multicore example finishes in 1479 cycles and the singlecore example takes 3040 cycles, a speedup of about 2.06 times. I verified that the multiplications are correct using the statistical language R. At some point I plan to make a video explaining how this all works, but again, right now I just want to get the results up.
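
As a cross-check of the numbers above, here is a minimal Python sketch (not the R session used for the original verification) that multiplies the same 2 by 8 and 8 by 2 matrices and computes the cycle ratio. The names `matmul`, `A`, `B`, `product` and `speedup` are made up for this illustration.

```python
def matmul(a, b):
    """Plain nested-list matrix product: rows of a times columns of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

# Rows r1, r2 of the first matrix (values 1..16, as in block 200).
A = [list(range(1, 9)), list(range(9, 17))]

# Columns c1, c2 of the second matrix (values 17..32), arranged as 8 rows x 2 cols.
c1 = list(range(17, 25))
c2 = list(range(25, 33))
B = [list(pair) for pair in zip(c1, c2)]

product = matmul(A, B)   # [[r1*c1, r1*c2], [r2*c1, r2*c2]]

# Cycle counts reported in the post.
speedup = 3040 / 1479

print(product)
print(round(speedup, 2))
```

The four entries of `product` correspond to r1*c1, r1*c2, r2*c1 and r2*c2.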

multicore matrix multiplication example cr
a 2 row by 8 col matrix is multiplied by cr
a 2 col by 8 row matrix cr
resulting in vector r1*c1 r1*c2 r2*c1 r2*c2 cr
cr
r1out send vector r1 to right node cr
r2out send vector r2 to down node cr
rtdwn set port to right and down cr
synch dummy write to synchronize nodes cr
c1c2out send vectors c1 and c2 to right indent
    and down nodes cr
getres get results from right and down vectors

  200 list

0 org r1out right b! 7 for @p !b unext cr
1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , cr
r2out down b! 7 for @p !b unext cr
9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , cr
rtdwn B5 b! cr
synch 0 !b cr
c1c2out 15 for @p !b unext cr
17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , cr
25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , cr
getres right b! @b @b down b! @b @b r---      


invect a- read in vector and store at a
* ab-a*b 17 bit multiply
.prod a-n read in vector and do dot product with vector stored at a
r*c1,c2 p- read r from port p then read c1 and c2 and do dot products with r. return results to port p

  202 list

invect a! 7 for @b !+ unext ;
*
a! dup dup or 17 push . begin +* unext drop
drop a ;

.prod
a! dup dup or 7 for @b @+ a push * + pop
a! next ;

r*c1,c2
b! 60 invect cr
synch @b drop cr
60 .prod 60 .prod !b !b ;                     
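
The `*` word above builds a multiply out of repeated `+*` multiply steps. Since `7 for` sends eight words in block 200, `17 push ... unext` should run the step 18 times. The following Python sketch shows the general shift-and-add technique under simplifying assumptions: an 18-bit word, unsigned operands, and no modelling of carry or sign handling. It is not a cycle-accurate emulation of the chip, and the names `multiply_steps`, `multiply`, `WORD_BITS` and `MASK` are invented for illustration.

```python
WORD_BITS = 18                 # assumed word size
MASK = (1 << WORD_BITS) - 1

def multiply_steps(s, a, steps=WORD_BITS):
    """Shift-and-add multiply.

    s plays the role of the multiplicand left on the stack, a the
    multiplier loaded into the A register, and t the accumulator that
    starts at zero (the effect of "dup dup or").
    Returns (t, a): the high and low parts of the product.
    """
    t = 0
    a &= MASK
    for _ in range(steps):
        if a & 1:                      # low bit of A set: add multiplicand
            t += s
        # shift the combined t:a pair right by one bit
        a = (a >> 1) | ((t & 1) << (WORD_BITS - 1))
        t >>= 1
    return t, a

def multiply(x, y):
    """Mimic "drop drop a": keep only the low word of the product."""
    _, low = multiply_steps(x, y)
    return low

print(multiply(3, 3))
```

For the small values used in this example the low word is the whole product, which is why the listing can discard the accumulator and return `a`.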


                                              

  204 list

0 org 202 load 69 org right r*c1,c2 r---      


                                              

  206 list

0 org 202 load 69 org down r*c1,c2 r---       
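
To make the data flow between the three nodes easier to follow, here is a rough Python model of blocks 200 to 206. It is only a sketch: threads stand in for nodes, unbounded `queue.Queue` objects stand in for the blocking ports, and each worker returns its two results as one tuple rather than as two separate port writes. All Python names (`worker`, `run_multicore`, `ports`) are made up for this illustration.

```python
import queue
import threading

R1 = list(range(1, 9))      # sent to the right node
R2 = list(range(9, 17))     # sent to the down node
C1 = list(range(17, 25))    # sent to both nodes
C2 = list(range(25, 33))    # sent to both nodes

def worker(inbox, outbox):
    """Model of r*c1,c2: store r, wait for the sync word, do two dot products."""
    r = [inbox.get() for _ in range(8)]          # invect: read 8 words
    inbox.get()                                  # synch: read and drop dummy word
    dot_c1 = sum(x * inbox.get() for x in r)     # .prod against incoming c1
    dot_c2 = sum(x * inbox.get() for x in r)     # .prod against incoming c2
    outbox.put((dot_c1, dot_c2))                 # simplification: one tuple

def run_multicore():
    # One (inbox, outbox) pair per neighbour of node 0.
    ports = {name: (queue.Queue(), queue.Queue()) for name in ("right", "down")}
    threads = [threading.Thread(target=worker, args=ports[name]) for name in ports]
    for t in threads:
        t.start()

    for word in R1:                              # r1out
        ports["right"][0].put(word)
    for word in R2:                              # r2out
        ports["down"][0].put(word)
    for word in [0] + C1 + C2:                   # synch, then c1c2out to both ports
        for name in ports:
            ports[name][0].put(word)

    right_result = ports["right"][1].get()       # getres
    down_result = ports["down"][1].get()
    for t in threads:
        t.join()
    return list(right_result) + list(down_result)

print(run_multicore())
```

Each worker only ever holds one row in memory and consumes the column values as they arrive, which is the part of the design that lets the two dot-product pairs run in parallel.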


                                              

  208 list

0 org
*
a! dup dup or 17 push . begin +* unext drop
drop a ;
cr
69 org 3 3 * r---                             


singlecore matrix multiplication example cr
* ab-a*b 17 bit multiplication
*sum v2ofs-n vector v1 is stored in a and cr
v2ofs is the offset to vector v2. sum the cr
products of v1 and v2.

  210 list

0 org
*
a push a! dup dup or 17 push . begin +* unext drop drop a pop a! ;

*sum
dup dup or 7 for over a + b! @+ @b * + next push drop pop ;
69 org cr
40 a! 10 *sum 8 *sum 40 a! 18 *sum 10 *sum r---
40 org cr
1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , cr
9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , cr
17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , cr
25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , cr
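
The singlecore version walks one flat data area with a pointer and a relative offset. Here is a small Python sketch of that addressing scheme. It assumes the offsets `10` and `18` in the listing are hexadecimal (16 and 24 words), and it relocates the data to index 0 of a Python list instead of address 40. The names `star_sum`, `run_singlecore` and `MEM` are made up for this illustration.

```python
# r1, r2, c1, c2 stored back to back, 8 words each (values 1..32).
MEM = list(range(1, 33))

def star_sum(mem, a, ofs):
    """Model of *sum: dot product of the 8 words at a with the 8 words at a+ofs.

    Returns the sum and the advanced pointer, since @+ leaves the A
    register pointing just past the first vector.
    """
    total = 0
    for _ in range(8):
        total += mem[a] * mem[a + ofs]
        a += 1
    return total, a

def run_singlecore(mem=MEM):
    a = 0                                  # "40 a!" with the base moved to 0
    r1c1, a = star_sum(mem, a, 16)         # r1 at 0, c1 is 16 words further on
    r2c1, a = star_sum(mem, a, 8)          # a is now at r2, c1 is 8 words on
    a = 0                                  # second "40 a!"
    r1c2, a = star_sum(mem, a, 24)         # c2 is 24 words past r1
    r2c2, a = star_sum(mem, a, 16)         # c2 is 16 words past r2
    return [r1c1, r2c1, r1c2, r2c2]

print(run_singlecore())
```

Under these assumptions the results come out in the order r1*c1, r2*c1, r1*c2, r2*c2, because the pointer is left at r2 after each pass over r1.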