Local Variable Library V1.0

For TurboForth V1.2

Written by Mark Wills

Document uploaded 19th May 2012

Note: This version is superseded by version 1.1, which can be found here.

1 - Introduction

2 - Declaring Local Variables in a Colon Definition

3 - Storing Data in your Local Variables

4 - Nesting and Scope

5 - Recursion

6 - Overhead
6.1 - Execution Overhead
6.2 - Memory Overhead

7 - Locals Dictionary

8 - Compile Time and Run-time Performance

9 - Calls to @LOCAL

10 - Writing to Local Variables Overhead

11 - Local Variables Source Code

12 - Test Code

13 - Example: DUMP using locals

1 - Introduction

With all the emphasis that Forth puts on the stack it could be argued that it goes against the grain somewhat to use local variables in Forth. However, the truth is that some Forth code can become an ugly mess, full of stack juggling words (known in the Forth world as stackrobatics) that juggle data on the stack to get it into the correct order for our program to use. When one thinks about it, those stackrobatics are a waste of processor cycles; we're burning processor time just juggling data on the stack, instead of actually getting our work done. That's not good.

A practice has evolved in Forth where stack items that are 'in the way' can be placed on the return stack temporarily (using >R) and removed later (using R>), which can sometimes be a neat solution. However, it has a couple of drawbacks:

Anything put on the return stack must be removed before the word exits, otherwise a crash will ensue;
Data cannot be placed on the return stack inside a DO or FOR loop (unless it is removed before a LOOP or NEXT) due to the return stack being used to hold the loop data for DO/LOOP and FOR/NEXT [1]

A neater solution is to implement local variables properly, rather than using the return stack, and to implement them in such a way that they interact with the data stack in a natural, Forth-like way. This author believes the following solution meets the above criteria.

The code presented here caters for up to fourteen named local variables per colon-definition. The local variables can have any name, but they should not be named after pre-existing words; in other words, don't name your locals DUP or DROP or SWAP. This is because during compilation of a colon-definition, the standard Forth dictionary is searched first, then (if the word was not found) the locals dictionary is searched. So, if your local has the same name as another word in the dictionary it will not be found. In any case, naming a local after a pre-existing word is just plain confusing. Don't do it!

2 - Declaring Local Variables in a Colon Definition

The word LOCALS{ is used to begin the definition of a list of local variables. Definition of locals is terminated with } as in the following example:

: computeArea ( w h -- area)
  LOCALS{ height width } 
  SET height   SET width
  height width * ;

In this example, two local variables are declared: height and width. The order in which local variables are declared is not important, since, unlike other local variable implementations, the locals are not initialised from the data stack. Note that local variables are referenced in your code by their names. Naming a local variable in a colon definition causes its value (not its address) to be pushed to the stack. Local variables (in this implementation) work very similarly to VALUEs.

3 - Storing Data in your Local Variables

Data is stored into your local variables from the data stack, with the words SET and +SET. SET and +SET are analogous to TO and +TO which are used with VALUEs.

Example:

    : TEST ( n4 n3 n2 n1 -- ) 
      LOCALS{ A B C D }

      SET D  SET C  SET B  SET A
      A . B . C . D .
    ;
    1 2 3 4 TEST
    1 2 3 4 ok:0

As can be seen, the local variables are populated directly from the data passed in on the stack. Note that the data is removed from the stack, as the local variables are loaded, as one would expect.

Note that, as shown in the stack signature, n1 was on the top of the stack when TEST was invoked. It was loaded into the local variable D with the phrase SET D; n2 was loaded into C, n3 into B and n4 into A.

Once the stack data has been stored in local variables, it can be accessed in any random order, simply by name; no stack juggling or use of the return stack is required. Note also that a local variable retains its value after use. So the following is valid:

    : TEST ( n -- ) 
      LOCALS{ A } 
      SET A  
      A A A A . . . . ;
	  
    99 TEST 
    99 99 99 99 ok:0

4 - Nesting and Scope

Words can be nested normally, as one does in Forth; words call words which call words etc. A word's local variables retain their values when it comes back into scope. For example:

    : HARRY ( a b c -- )
      LOCALS{ A B C } SET C  SET B  SET A
      CR ." In Harry:" C B A . . . ;

    : DICK ( a b c -- )
      LOCALS{ A B C } SET C  SET B  SET A
      7 8 9 HARRY
      CR ." In Dick: " C B A . . . ;

    : TOM ( a b c -- )
      LOCALS{ A B C } SET C  SET B  SET A
      4 5 6 DICK
      CR ." In Tom: " C B A . . . ;

    1 2 3 TOM

In the above example, data is loaded into the local variables of TOM, but TOM immediately goes 'out of scope' as DICK is called (which subsequently calls HARRY). Note that when TOM eventually comes back into scope (after the execution of HARRY), the data loaded into its local variables is retained.
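
The scoping behaviour described above can be modelled outside Forth. The following Python sketch is not TurboForth code; it assumes that each word pushes a fresh frame of locals on entry and discards it on exit. The names locals_stack, call_word and log are made up for illustration.

```python
# Illustrative model only: each word gets a fresh frame of locals on entry
# and loses it on exit, so a caller's locals survive calls to other words.
locals_stack = []   # hypothetical stand-in for the locals stack
log = []            # lines the modelled words would print


def call_word(name, a, b, c, callee=None, callee_args=()):
    """Model of a word that declares LOCALS{ A B C } and may call another word."""
    locals_stack.append({"A": a, "B": b, "C": c})   # frame allotted on entry
    if callee is not None:
        callee(*callee_args)                        # this word goes 'out of scope'
    frame = locals_stack[-1]                        # back in scope: same frame
    # C B A . . . prints A first, because A ends up on top of the data stack
    log.append(f"In {name}: {frame['A']} {frame['B']} {frame['C']}")
    locals_stack.pop()                              # frame freed on exit


def harry(a, b, c):
    call_word("Harry", a, b, c)


def dick(a, b, c):
    call_word("Dick", a, b, c, harry, (7, 8, 9))


def tom(a, b, c):
    call_word("Tom", a, b, c, dick, (4, 5, 6))


tom(1, 2, 3)
print("\n".join(log))
```

In this model Harry reports 7 8 9, Dick reports 4 5 6 and Tom reports 1 2 3, in that order: each caller finds its own values intact once the word it called has finished.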

5 - Recursion

Recursion in TurboForth is achieved with the word RECURSE, which causes a recursive call to the word in which it occurs. For example:

    0 VALUE DEEP
    TRUE UNSIGNED !

    : FACT ( U -- U! )
      LOCALS{ A }
      DEEP SET A
      1 +TO DEEP
      DUP 0= IF
        DROP 1
      ELSE
        DUP 1- RECURSE *
      THEN
      CR .S ." instance:" A . ;

    8 FACT U.

When using recursion, it should be noted that each instance of the word will inherit its own set of local variables. Of course, the stack is used to communicate data into and out of the instances. The classic recursive factorial example above has been modified to show how deep the recursion goes. The current value of a counter is copied into local variable A of each instance. The value of the counter is increased with each instance of the word FACT. As the recursion 'un-winds' the value in the local variable is displayed. The following output is revealed:

    8 7 6 5 4 3 2 1 1 <TOP instance:8
    8 7 6 5 4 3 2 1 <TOP instance:7
    8 7 6 5 4 3 2 <TOP instance:6
    8 7 6 5 4 6 <TOP instance:5
    8 7 6 5 24 <TOP instance:4
    8 7 6 120 <TOP instance:3
    8 7 720 <TOP instance:2
    8 5040 <TOP instance:1
    40320 <TOP instance:0
    40320 ok:0
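
The way each instance keeps its own copy of A can be traced with a small model. The Python sketch below is not TurboForth code; it assumes one locals frame per active call, and records the data stack and the value of A as each instance unwinds. The names data_stack, locals_stack, deep and trace are illustrative.

```python
# Illustrative model only: a Python rendering of the instrumented FACT word.
data_stack = []      # models the Forth data stack
locals_stack = []    # one frame (here a single local, A) per active call
deep = 0             # models the VALUE named DEEP
trace = []           # (data stack snapshot, A) recorded as each call unwinds


def fact():
    """Model of FACT ( u -- u! ): argument and result travel on data_stack."""
    global deep
    locals_stack.append(deep)        # LOCALS{ A }  DEEP SET A
    deep += 1                        # 1 +TO DEEP
    n = data_stack[-1]               # DUP
    if n == 0:                       # 0= IF
        data_stack[-1] = 1           #   DROP 1
    else:
        data_stack.append(n - 1)     #   DUP 1-
        fact()                       #   RECURSE
        top = data_stack.pop()       #   *
        data_stack[-1] *= top
    trace.append((list(data_stack), locals_stack[-1]))   # CR .S ... A .
    locals_stack.pop()               # locals are freed when the word exits


data_stack.append(8)
fact()
for snapshot, a in trace:
    print(snapshot, "instance:", a)
```

The deepest call is the first to report, so the instance numbers count down from 8 to 0 while the partial products accumulate on the data stack.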

6 - Overhead

6.1 - Execution Overhead

Overhead on the system is very small indeed. A call to (allotLocals) is compiled into the beginning of a colon definition that declares locals (after the call to DOCOL, but before any user-executable code). Additionally, a call to (freeLocals) is compiled as the last word of that colon definition by semi-colon.

6.2 - Memory Overhead

After the locals library is loaded, each colon definition that declares locals incurs an extra 12 bytes of overhead. These are the calls to (allotLocals) and (freeLocals), each preceded by a literal, which are compiled into the definition as noted above. Definitions that do not declare locals are compiled as before.

All references to local variables are compiled as a literal (which represents an offset into the local variables stack) and a call to @local. References to SET and +SET are also compiled as a literal, which represents the same offset into the locals stack and a call to either (SET) or (+SET) respectively.

In addition, an area of memory is required to host the 'locals stack'. By default, this is currently set to >FFDE which is at the end of the 24K memory segment in the 32K memory expansion module on the TI-99/4A. Each call to (allotLocals) 'grows' the stack by n cells (2n bytes), according to the literal preceding the call. Each call to (freeLocals) 'shrinks' the stack by n cells in a similar fashion. The stack grows from higher memory addresses to lower memory addresses.

7 - Locals Dictionary

This implementation uses no dictionary space at all for the names of the local variables. During compilation, the names of the locals are hashed and stored in a table beginning at >FFE0. Each local variable name is stored as a 16-bit unsigned hash, in the order in which the names appear in the LOCALS{ definition. Thus their ordinal location in the table has a direct relation to their run-time offsets within the locals stack.

Once the colon definition containing locals has been compiled, the hashed locals are no longer needed, since the references to the locals within the colon definition have already been compiled into calls to @local.
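
The relationship between a name's position in the hash table and its run-time offset can be sketched as follows. This Python model is not TurboForth code; it assumes the bitwise CRC-16 shown in the source listing in section 11 (initial value >FFFF, polynomial >A001) and 2-byte cells. The function names are illustrative.

```python
# Illustrative model only: compile-time hashing of local names.
CELL = 2  # bytes per cell


def crc16(name):
    """Bitwise CRC-16: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in name.encode("ascii"):
        crc ^= byte
        for _ in range(8):
            lsb = crc & 1          # note the LSB prior to the shift
            crc >>= 1
            if lsb:
                crc ^= 0xA001      # apply the polynomial if the LSB was 1
    return crc


def declare_locals(names):
    """Hash each declared name and keep the hashes in declaration order."""
    return [crc16(n) for n in names]


def find_local(table, name):
    """Return the byte offset of a local within its frame, or None if absent."""
    h = crc16(name)
    for index, stored in enumerate(table):
        if stored == h:
            return index * CELL
    return None


table = declare_locals(["A", "B", "C"])
print(find_local(table, "A"))   # 0
print(find_local(table, "C"))   # 4
print(find_local(table, "D"))   # None
```

Only the 16-bit hashes are kept, so two different names that happened to hash to the same value would be indistinguishable; with at most fourteen locals per definition this is unlikely in practice.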

8 - Compile Time and Run-time Performance

The code has been carefully designed such that, as far as possible, all time-consuming work is performed at compile-time rather than run-time, in order to minimise the performance penalty to a running program that uses locals. FIND has been re-vectored, such that during compilation:

- The normal dictionary is searched first.
- If (and only if) the definition contains locals, and the word was not found in the normal Forth dictionary:
  - The word under examination is hashed and the locals dictionary (a hash table) is searched.
  - If the local is found in the locals dictionary/hash table then an appropriate reference is compiled directly into the colon definition.
  - Otherwise, a 'word not found' error message is issued in the normal fashion.

Since the normal Forth dictionary is searched first, there is no measurable increase in compile time for words that do not use locals. Words that do use locals are slightly slower to compile, since, in the event that the word being sought is not in the dictionary, it must be hashed (which takes a small amount of time) before it can be searched for in the locals dictionary/hash table, as described above.
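
The search order also explains why a local must not share its name with an existing word. The Python sketch below is not TurboForth code; it models the dictionary as a set of names and the locals as a plain list (hashing is left out for brevity). The names dictionary and lookup are illustrative.

```python
# Illustrative model only: the two-stage lookup performed while compiling.
dictionary = {"DUP", "DROP", "SWAP", "*", "."}   # stand-in for the Forth dictionary


def lookup(word, local_names):
    """Search the dictionary first, then the current definition's locals."""
    if word in dictionary:
        return ("dictionary", word)
    if local_names and word in local_names:
        return ("local", local_names.index(word))
    return ("not found", word)


locals_of_word = ["height", "width", "DUP"]   # "DUP" is a deliberately bad name
print(lookup("width", locals_of_word))    # ('local', 1)
print(lookup("DUP", locals_of_word))      # ('dictionary', 'DUP')
print(lookup("depth", locals_of_word))    # ('not found', 'depth')
print(lookup("height", []))               # ('not found', 'height')
```

The local named DUP is never reached, because the dictionary search succeeds first; and in a definition without locals the second stage is skipped altogether.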

9 - Calls to @LOCAL

When a reference to a local variable is compiled, a call to @local is compiled, as in the following example:

   : TEST LOCALS{ A B C } A B C . . . ;

The above code is compiled as follows:

   DOCOL
   LIT 3 (allotLocals)   \ reserve space for 3 cells on locals stack
   LIT 0 @LOCAL          \ fetch A
   LIT 2 @LOCAL          \ fetch B
   LIT 4 @LOCAL          \ fetch C
   . . .                 \ . . .
   LIT 3 (freeLocals)    \ de-allocate reserved space
   EXIT

As can be seen, the overhead to fetch a local is very small: 6 bytes. It is no larger (or slower) than fetching a value from an array using an address and an offset (which is exactly what happens within @local).

The definition of @local is simply:

   : @local ( index -- n ) \ fetch a local from the local stack
     _LS + @ ;

Where _LS is the address of the top of the locals stack.
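
The translation from source text to compiled form can be sketched as a toy compiler. This Python model is not TurboForth code; it assumes 2-byte cells, treats every non-local token as an ordinary word, and ignores the dictionary-first search for brevity. The name compile_body is illustrative.

```python
# Illustrative model only: turning source tokens into the compiled form.
CELL = 2  # bytes per cell


def compile_body(local_names, body):
    """Return the compiled token list for a definition that declares locals."""
    n = len(local_names)
    out = ["DOCOL", "LIT", n, "(allotLocals)"]
    for token in body:
        if token in local_names:
            # a local becomes a literal byte offset plus a call to @LOCAL
            out += ["LIT", local_names.index(token) * CELL, "@LOCAL"]
        else:
            out.append(token)          # ordinary word: compiled as-is
    out += ["LIT", n, "(freeLocals)", "EXIT"]
    return out


compiled = compile_body(["A", "B", "C"], ["A", "B", "C", ".", ".", "."])
print(compiled)
```

Each reference to a local contributes three cells (LIT, the offset, and @LOCAL), which is where the figure of 6 bytes per fetch comes from.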

10 - Writing to Local Variables Overhead

Similarly, the overhead to write to a local variable is very small:

   : TEST LOCALS{ A B C } SET A   SET B   SET C   A B C . . . ;

Compiles to the following:

   DOCOL
   LIT 3 (allotLocals)    \ allot space on locals stack for 3 locals
   LIT 0 (SET)            \ set a with a value from the data stack
   LIT 2 (SET)            \ set b with a value from the data stack
   LIT 4 (SET)            \ set c with a value from the data stack
   LIT 0 @local           \ fetch a from locals stack and push to data stack
   LIT 2 @local           \ fetch b from locals stack and push to data stack
   LIT 4 @local           \ fetch c from locals stack and push to data stack
   dot dot dot            \ display the three values on the data stack
   LIT 3 (freeLocals)     \ de-allocate the space on the locals stack
   EXIT

The definition of (SET) is simply:

   : (SET) ( value offset -- )
     \ at runtime, set a local variable to value 'value'
     _LS + ! ;

Whilst the definition of (+SET) is simply:

   : (+SET) ( value offset -- )
     \ at runtime add a value to a local variable
     _LS + +! ;
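
The three run-time words all reduce to one address calculation: the locals stack pointer plus a byte offset. The Python sketch below is not TurboForth code; it assumes 16-bit cells stored big-endian and a frame of three cells already allotted below >FFE0. The names fetch_local, set_local and add_to_local are illustrative.

```python
# Illustrative model only: run-time access to locals through an offset.
memory = bytearray(0x10000)   # 64K address space
ls = 0xFFE0 - 3 * 2           # models _LS after three cells have been allotted


def fetch_local(offset):
    """Model of @local: read the 16-bit cell at _LS + offset."""
    addr = ls + offset
    return int.from_bytes(memory[addr:addr + 2], "big")


def set_local(value, offset):
    """Model of (SET): store a 16-bit value at _LS + offset."""
    addr = ls + offset
    memory[addr:addr + 2] = (value & 0xFFFF).to_bytes(2, "big")


def add_to_local(value, offset):
    """Model of (+SET): add to the cell at _LS + offset, wrapping at 16 bits."""
    set_local(fetch_local(offset) + value, offset)


set_local(0xB00B, 4)          # like $B00B SET c, where c is the third local
add_to_local(0x100, 4)        # like $100 +SET c
print(hex(fetch_local(4)))    # 0xb10b
```

The neighbouring cells at offsets 0 and 2 are untouched, so each local behaves like an independent VALUE for as long as its frame exists.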

11 - Local Variables Source Code

The following code (less comments and formatting) is available on block 37 of the utilities disk (available for download). It occupies 882 bytes of CPU memory.

\ An implementation of local variables.
\ Not ANS compatible.
\ Local variables are declared with the word LOCALS{ followed by a list
\ of variable names, followed by a closing }
\ For example:
\   : TEST ( -- ) locals{ a b c } ... ... ... ;
\ The local variables are initialised to 0 upon creation.
\
\ Locals are referenced in code with their names.
\ Locals may be written to with SET and +SET. E.g.
\ : TEST ( x y z -- ) locals{ a b c } set c   set b   set a ;
\ The above example initialises the local variables a, b and c from the
\ data on the data stack. Z goes to c, y to b, and x to a.
\
\ Here is another example:
\ : TEST ( x y z -- z(x+y) )
\   locals{ x y z } set z  set y  set x
\   x y + z * ;
\
\ Where recursion is used with a definition that contains locals, each
\ instance of the definition shall inherit its own set of new locals.
\ These will be automatically de-allocated when the recursion un-winds.
\ Locals consume no dictionary space at all. Their names are temporarily
\ hashed during compilation only. After that their names are not required.
\ The hash table is set to the end of RAM (see dictAddr). There is
\ room for 14 locals per definition as currently set.
\ The locals stack sits immediately below the hash table and grows
\ towards lower memory addresses (the hash table grows to higher addresses).
\
\ During execution, locals add very little overhead: 1 call to allocate
\ the appropriate number of local-stack cells at the beginning of a colon
\ definition, and a similar call to de-allocate at the end of a colon
\ definition.
\ References to locals are compiled as literals representing an offset
\ into the locals stack, plus a call to @local

0 value locals?             \ true if a colon-def has locals
0 value localCount          \ number of locals in a colon def
0 value localOffset
$FFE0 VALUE dictAddr        \ address of start of local dictionary
$A006 @ VALUE _FIND         \ save contents of FIND vector
dictAddr VALUE _LS          \ top of local stack pointer
\ note: the locals stack and the locals dictionary grow away from each
\ other. There is a pre-decrement on local stack operations, therefore
\ it is safe to set the locals stack to the same address as the locals
\ dictionary, as they grow away from each other.

: (freeLocals) ( n -- ) \ runtime code to de-allocate n locals
    CELLS +TO _LS ;

: freeLocals ( n -- ) \ compile runtime code to free n locals
    COMPILE LIT , COMPILE (freeLocals) 
    FALSE TO locals? ;

: (allotLocals) ( n -- ) \ runtime code to allot n locals on locals stack
    \ stack grows towards lower memory addresses
    CELLS DUP NEGATE +TO _LS
    _LS SWAP 0 FILL ( initialise n locals to 0) ;

: allotLocals ( n -- ) \ compile run-time code to allot n locals
    COMPILE LIT ,   COMPILE (allotLocals) 
    TRUE TO locals? ;

: >HASH ( c-addr len -- u)
  \ hashes a string using the CRC-16 algorithm
  $FFFF             \ initial CRC16
  -ROT              \ move it out of the way
  OVER + SWAP DO    \ for each byte in the string
    I C@ XOR        \ xor with CRC16
    8 0 DO          \ for 8 bits in the byte
        DUP 1 AND   \ note the LSB prior to shift
        SWAP 1 >>   \ shift the CRC16
        SWAP IF 
            $A001 XOR \ if LSB was 1 then apply polynomial
        THEN  
    LOOP
  LOOP ;

: (LOCAL) ( addr len -- )
    ?DUP IF \ is a local. Add to fleeting locals dictionary:
        >HASH               \ hash the variable name
        dictAddr localCount CELLS + ! \ store hash in local dictionary
        1 +TO localCount    \ increment number of locals
    ELSE \ end of locals list
        DROP
        localCount allotLocals
    THEN ;

: LOCALS{ ( "name...name }" -- )
    0 TO localCount
    BEGIN
        BL WORD  OVER C@
        ASCII } - OVER 1 - OR
    WHILE               \ while } character not detected
        (LOCAL)         \ add local variable to locals dictionary
    REPEAT
    2DROP  0 0 (LOCAL)  \ end local dictionary processing
; IMMEDIATE

: @local ( index -- n )
    \ fetch a local from the local stack
    _LS + @ ;

: compileLocal ( -- )
    COMPILE LIT  localOffset 1- CELLS ,  COMPILE @local ;

: findLocal ( addr len -- offset+1|0)
    \ search locals dictionary for word and return offset into
    \ locals stack+1 if found or 0 if not found
    >HASH 0 SWAP
    localCount 0 DO
        dictAddr I CELLS + @ OVER = IF
            SWAP DROP I 1+ SWAP LEAVE
        THEN
    LOOP  DROP 
    DUP TO localOffset ;

: localNotFound ( --)
    CR ." Error: Local not found."
    FALSE to locals? ABORT ;
    
: (SET) ( value offset -- ) 
    \ at runtime, set a local variable to value 'value'
    _LS + ! ;

: (+SET) ( value offset -- ) 
    \ at runtime add a value to a local variable
    _LS + +! ;

: doSET ( xt "local" value -- )
    BL WORD findLocal IF
        COMPILE LIT localOffset 1- CELLS ,  ,
    ELSE
        localNotFound
    THEN ;
   
: SET  ( "local" value --) \ write the value to the local variable
    ['] (SET) doSET ; IMMEDIATE
    
: +SET ( "local" value --) \ add the value to the local variable
    ['] (+SET) doSET ; IMMEDIATE

: ; locals? IF localCount freeLocals THEN [COMPILE] ; ; IMMEDIATE

0 value _addr   0 value _len
: FIND ( addr len -- cfa flag )
    2DUP  TO _len  TO _addr
    _FIND EXECUTE DUP 0= IF
        STATE @ IF
            locals? IF
                2DROP _addr _len findLocal IF
                    ['] compileLocal 1
                ELSE
                    0 0
                THEN
            THEN
        THEN
    THEN ;

' FIND $A006 ! \ re-vector FIND to use our FIND first

: test locals{ a b c } $BEEF set a  $FACE set b  $B00B set c 
  a $.  b $.  c $. $100 +SET c  c $. ;

: test ( a b c -- ) 
    locals{ a b c } set c  set b  set a
    cr ." a=" a .
    cr ." b=" b .
    cr ." c=" c . ;
1 2 3 test

12 - Test Code

The following tests the local variable code.

TEST executes a loop 10 times, showing the values of the four local variables, A, B, C & D each time through the loop. However, halfway through the loop, it calls TEST2, which executes 5 times. Note that TEST and TEST2 have different values in their local variables, and, crucially, note that when control returns to TEST from TEST2, the local variables re-inherit their original values. Thus, TEST and TEST2 have different local variables, accessed in the same way (A B C & D), which go in and out of scope as one would expect - in other words, they behave the same (in terms of scope) as local variables in any other programming language.

The test yields the following output:

    Inside TEST:
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4
    Inside TEST2:
      A=7 B=8 C=9 D=10
      A=7 B=8 C=9 D=10
      A=7 B=8 C=9 D=10
      A=7 B=8 C=9 D=10
      A=7 B=8 C=9 D=10
    Back inside TEST:
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4
    A=1 B=2 C=3 D=4

Thus showing that when TEST comes back into scope, its local variables have been retained.

13 - Example: DUMP using locals

The following is an example of how one might use locals in an implementation of DUMP. DUMP is used to display the contents of memory on the screen. Each line shows the address, the next 8 bytes of memory, and an ASCII representation of those 8 bytes. DUMP expects the address and number of bytes to display on the stack, as follows:

   DUMP ( address count -- )

For example:

   $6000 50 DUMP

: DUMP ( addr count -- )
  locals{ start_address end_address #bytes lz }
  $FFFE AND SET #bytes         \ force count even and store
  $FFFE AND SET start_address  \ force address even and store
  start_address #bytes + SET end_address \ store end address
  ZEROS @ SET lz               \ store leading zero indicator in lz
  TRUE ZEROS !                 \ turn on leading zeros
  end_address start_address DO \ from start address to end address
    CR I $.                    \ display address
    ASCII : EMIT SPACE         \ emit a colon and a space
    \ display 4 memory locations (8 bytes)...
    I 8 +  I DO                \ from address to address +8
      I @ $.                   \ fetch memory contents and display
    2 +LOOP
    \ display ascii contents of memory locations...
    I 8 +  I DO                \ from address to address +8
      \ display character if displayable, else display a dot...
      I C@ DUP 32 127 WITHIN NOT IF DROP ASCII . THEN EMIT
    LOOP
  8 +LOOP                      \ do next 8 addresses
  lz ZEROS !                   \ restore leading zero indicator
;

[1] This restriction does not apply universally to all Forth systems, but it does apply to TurboForth, which follows the 'classical' implementation architecture, which uses the return stack for nesting of loops.
