Threaded code |
:: For multi-threaded programming, see Thread (computer science) .
In computer science, the term threaded code refers to an implementation technique for programming languages that produces very compact code. Threading is a form of code consisting entirely of subroutine calls, written without the subroutine call instruction, and processed by an interpreter (such as the Forth virtual machine), which jumps to each successive subroutine in turn.
Threaded code is used in the Forth programming language and early versions of the B programming language programming languages, as well as many implementations of Fortran, BASIC, COBOL and other languages for small minicomputers.
The benefits of extremely compact code, in some cases, come at the expense of slower execution speed. However, in other cases there is a synergy -- sometimes threaded code is smaller and faster than non-threaded code. In systems with virtual memory (where memory is simulated with a mechanical disk drive), threaded code may be hundreds of times faster than a less-compact design that does not fit in the available physical memory, because disk drives tend to be roughly a thousand times slower than random-access memory (RAM).
=History leading to threaded code=
In early computers, memory was very expensive. So programmers spent a lot of time trying to find ways to squeeze their programs so they would fit in the memory available. Also, the instruction sets were so primitive that even simple operations like printing a character or dividing one number by another number required quite a few instructions.
Instead of writing out every step of such operations in every part of the program where it was needed (possibly using a macro), programmers saved memory by writing out every step of such operations exactly once and placing it in a Subroutine.
This process (refactoring) is still commonly used today by all programmers and many wiki editors, although today we do it for different reasons.
The top-level application usually consists of nothing but subroutine calls. And many of those subroutines, in turn, also consist of nothing but subroutine calls.
Some early computers such as the RCA 1802 required several instructions to call a subroutine. In the top-level application and in many subroutines, that sequence is repeated over and over again, only the subroutine address changing from one call to the next. Using expensive memory to store the same thing over and over again seems wasteful -- is there any way to store this information exactly once
= Threaded code =
To save even more space, programmers squeezed those lists of subroutine calls into simple lists of subroutine addresses (leaving out the call instruction on each one), and used a small interpreter (later called a virtual machine) to call each subroutine in turn. This is called DTC (Direct Threaded Code) .
minicomputers, which have an indirection bit in every address. He said (in published remarks, Byte Magazine s Forth Issue) that he found it so convenient that he propagated it into all later Forth designs.
Some Forth compilers compile Forth programs into direct-threaded code. Other Forth compilers compile those same Forth programs into indirect-threaded code. The programs act the same either way.
= Later developments =
Not content to stop there, programmers have developed other related techniques to make programs even more compact, although they are slower than threaded code.
The idea of program compression has led to the idea of Algorithmic information theory.
=Threading models=
Practically all executable threaded code uses one or another of these methods for calling subroutines (each method is called a threading model ).
Less often used are *String threading: In which operations are identified by strings, usually looked-up by a hash table. This was used in Charles Moore s earliest Forth implementations and in the University of Illinois s experimental hardware-interpreted computer language. It is also used in Bashforth.
=Common amenities=
Separating the data and return stacks in a machine eliminates a great deal of stack management code, substantially reducing the size of the threaded code. The dual-stack principle was originated three times independently: for Burroughs computers, Forth and PostScript, and is used in some Java virtual machines.
Three processor register are often present in a threaded virtual machine. Another one exists for passing data between subroutines ( words ). These are:
Often, threaded virtual machines such as implementations of the programming language Forth have a simple virtual machine at heart, consisting of three primitives . Those are:
In an indirect-threaded virtual machine, the one given here, the operations are: next: (ip)+ -> w ; jmp (w)+ nest: ip -> -(rp) ; w -> ip ; next unnest: (rp)+ -> ip ; next
This is perhaps the simplest and fastest interpreter or virtual machine.
=External links=
*[http://cm.bell-labs.com/cm/cs/who/dmr/chist.html The Development of the C Language] by Dennis Ritchie also puts B in the context of BCPL and C. *[http://thinking-forth.sourceforge.net/ Thinking Forth Project] includes the seminal (but out of print) book Thinking Forth by [http://home.earthlink.net/~lbrodie/ Leo Brodie] published in 1984. *Anton Ertl s explanatory page [http://www.complang.tuwien.ac.at/forth/threaded-code.html What is Threaded Code ] describes different threading techniques and provides further references. Chuck Moore invented (indirect) threaded Code in 1970|
|