Assembly language
Assembly language, or simply assembly, is a human-readable notation for the machine language that a specific computer architecture uses. Machine language, a pattern of bits encoding machine operations, is made readable by replacing the raw values with symbols called mnemonics.
For example, a computer with the appropriate processor will understand this x86/IA-32 machine instruction:
 10110000 01100001
For programmers, however, it is easier to remember the equivalent assembly language representation:
 mov al, 0x61
which means to move the hexadecimal value 61 (97 decimal) into the processor register named al. The mnemonic mov is short for move, and a comma-separated list of arguments or parameters follows it; this is a typical assembly language statement.
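The mapping from the statement above to its two bytes can be sketched in a few lines of Python. This is a minimal illustration, not a real assembler: it handles only the form "mov 8-bit register, 8-bit immediate", and the names REG8 and encode_mov_r8_imm8 are made up for this example. It assumes the x86 encoding in which this instruction form is the base opcode 0xB0 with the register number in the low three bits, followed by the immediate byte.

```python
# Register numbers for the 8-bit x86 registers (placed in the opcode's low 3 bits).
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

def encode_mov_r8_imm8(reg: str, imm: int) -> bytes:
    """Encode `mov reg, imm` for an 8-bit register and an 8-bit immediate."""
    if not 0 <= imm <= 0xFF:
        raise ValueError("immediate must fit in one byte")
    opcode = 0xB0 | REG8[reg]  # base opcode 0xB0 combined with the register number
    return bytes([opcode, imm])

code = encode_mov_r8_imm8("al", 0x61)
print(" ".join(f"{b:08b}" for b in code))  # 10110000 01100001
```

The first byte carries both the operation and the destination register; the second byte is the value 0x61 itself.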
Transforming assembly into machine language is accomplished by an assembler, and the reverse by a disassembler. Unlike in high-level languages, there is usually a 1-to-1 correspondence between simple assembly statements and machine language instructions. However, in some cases an assembler may provide pseudoinstructions which expand into several machine language instructions to provide commonly needed functionality. For example, for a machine that lacks a "branch if greater or equal" instruction, an assembler may provide a pseudoinstruction that expands to the machine's "set if less than" and "branch if zero" (on the result of the set instruction).
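The pseudoinstruction expansion described above can be sketched as a simple text rewrite. The example assumes MIPS-style names (bge, slt, beq, the scratch register $at and the always-zero register $zero); the function expand is hypothetical and passes every other statement through unchanged.

```python
def expand(statement: str) -> list[str]:
    """Expand the pseudoinstruction `bge a, b, label`; pass other statements through."""
    mnemonic, _, rest = statement.strip().partition(" ")
    if mnemonic != "bge":
        return [statement.strip()]
    a, b, label = [operand.strip() for operand in rest.split(",")]
    return [
        f"slt $at, {a}, {b}",        # $at = 1 if a < b, else 0
        f"beq $at, $zero, {label}",  # branch when $at == 0, i.e. when a >= b
    ]

for line in expand("bge $t0, $t1, done"):
    print(line)
```

One source statement becomes two machine instructions, which is exactly where the usual 1-to-1 correspondence breaks down.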
Every computer architecture has its own machine language, and therefore its own assembly language. Computers differ by the number and type of operations that they support. They may also have different sizes and numbers of registers, and different representations of data types in storage. While all general-purpose computers are able to carry out essentially the same functionality, the way they do it differs, and the corresponding assembly language must reflect these differences.
In addition, multiple sets of mnemonics or assembly-language syntaxes may exist for a single instruction set. In these cases, the most popular one is usually the one used by the manufacturer in its documentation.
=Machine instructions=
Instructions in assembly language are generally very simple, unlike in a high-level language. Any instruction that references memory (for data or as a jump target) will also have an addressing mode to determine how to calculate the required memory address. More complex operations must be built up out of these simple operations. Some operations available in most instruction sets include:
Some computers include one or more complex instructions in their instruction set. A single complex instruction does something that may take many instructions on other computers. Such instructions typically take multiple steps, may issue to multiple functional units, or otherwise appear to be a design exception to the simplest instructions implemented for the given processor. Some examples of such instructions include:
A form of complex instruction that has become particularly popular recently is the SIMD operation, which performs the same arithmetic operation on multiple pieces of data at the same time. SIMD instruction sets have appeared under various trade names, beginning with MMX and AltiVec.
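The effect of such an operation can be modelled in software. The sketch below imitates a packed add of unsigned 8-bit lanes, in the spirit of a wraparound byte add such as MMX's PADDB; the function packed_add_u8 is a hypothetical name, and the Python loop only models the result, since real SIMD hardware processes all lanes in a single instruction.

```python
def packed_add_u8(a: bytes, b: bytes) -> bytes:
    """Add corresponding 8-bit lanes; each lane wraps around modulo 256 on its own."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    return bytes((x + y) & 0xFF for x, y in zip(a, b))

result = packed_add_u8(bytes([1, 2, 250, 255]), bytes([10, 20, 10, 1]))
print(list(result))  # [11, 22, 4, 0]
```

Note that an overflow in one lane (250 + 10, 255 + 1) does not carry into its neighbour, which is what distinguishes a packed add from an ordinary wide integer add.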
The design of instruction sets is a complex issue, with a simpler instruction set (generally grouped under the concept RISC) perhaps offering the potential for higher speeds, while a more complex one (traditionally called CISC) may offer particularly fast implementations of common performance-demanding tasks, may use memory (and thus cache) more efficiently, and may be somewhat easier to program directly in assembly language. See instruction set for a fuller discussion of this point.
=Assembly language directives=
In addition to codes for machine instructions, assembly languages have extra directives for assembling blocks of data, and assigning address locations for instructions or code.
They usually have a simple symbolic capability for defining values as symbolic expressions which are evaluated at assembly time, making it possible to write code that is easier to read and understand.
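How an assembler might evaluate such symbolic expressions at assembly time can be sketched as follows. The "NAME equ EXPRESSION" line format, the symbol names, and the functions evaluate and define_all are assumptions made for this illustration; real assemblers differ in syntax and in the operators they accept.

```python
import ast
import operator

# Integer operators this toy evaluator accepts.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.FloorDiv: operator.floordiv}

def evaluate(expr: str, symbols: dict[str, int]) -> int:
    """Evaluate an integer expression over previously defined symbols."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return symbols[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr.strip(), mode="eval").body)

def define_all(lines: list[str]) -> dict[str, int]:
    """Process lines of the (hypothetical) form `NAME equ EXPRESSION` in order."""
    symbols: dict[str, int] = {}
    for line in lines:
        name, _, expr = line.partition(" equ ")
        symbols[name.strip()] = evaluate(expr, symbols)
    return symbols

table = define_all(["ROWS equ 25", "COLS equ 80", "SCREEN_BYTES equ ROWS * COLS * 2"])
print(table["SCREEN_BYTES"])  # 4000
```

The program text can then refer to SCREEN_BYTES instead of the bare number 4000, and only the final value ends up in the assembled output.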
As in most computer languages, comments can be added to the source code; these often provide useful additional information to human readers of the code but are ignored by the assembler and so may be used freely.
They also usually have an embedded macro language to make it easier to generate complex pieces of code or data.
In practice, the absence of comments and the replacement of symbols with actual numbers make the human interpretation of disassembled code considerably more difficult than the original source would be.
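A toy disassembler makes this loss of information concrete. The sketch below decodes only x86 instructions of the form "mov 8-bit register, 8-bit immediate" (opcodes 0xB0 through 0xB7 followed by one immediate byte); the function name disassemble and the table REG8_NAMES are made up for this example.

```python
# 8-bit x86 register names, indexed by the register number in the opcode's low 3 bits.
REG8_NAMES = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

def disassemble(code: bytes) -> list[str]:
    """Decode a byte string made up only of `mov r8, imm8` instructions."""
    out = []
    i = 0
    while i < len(code):
        opcode = code[i]
        if (opcode & 0xF8) != 0xB0 or i + 1 >= len(code):
            raise ValueError(f"unsupported or truncated instruction at offset {i}")
        out.append(f"mov {REG8_NAMES[opcode & 0x07]}, 0x{code[i + 1]:02x}")
        i += 2
    return out

print(disassemble(bytes([0xB0, 0x61, 0xB3, 0x0A])))
# ['mov al, 0x61', 'mov bl, 0x0a']
```

Whatever symbolic name or comment the original source attached to 0x61 is gone; the disassembler can only print the raw number.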
=Usage of assembly language=
There is some debate over the usefulness of assembly language. It is often said that modern compilers can render higher-level languages into code that runs as fast as hand-written assembly, but counter-examples can be made, and there is no clear consensus on this topic. It is reasonably certain that, given the increase in complexity of modern processors, effective hand-optimization is increasingly difficult and requires a great deal of knowledge.
However, some discrete calculations can still be rendered into faster-running code with assembly, and some low-level programming is actually easier to do with assembly. Some system-dependent tasks performed by operating systems simply cannot be expressed in high-level languages. In particular, assembly is often used in writing the low-level interaction between the operating system and the hardware, for instance in device drivers. Many compilers also render high-level languages into assembly first before fully compiling, allowing the assembly code to be viewed for debugging and optimization purposes.
It is also common, especially in relatively low-level languages such as C, to be able to embed assembly language into the source code with special syntax. Programs using such facilities, such as the Linux kernel, often construct abstractions where different assembly language is used on each platform the program supports, but it is called by portable code through a uniform interface.
Many embedded systems are also programmed in assembly to obtain the absolute maximum functionality out of what are often very limited computational resources, though this is gradually changing in some areas as more powerful chips become available for the same minimal cost.
Another common area of assembly language use is in the system BIOS of a computer. This low-level code is used to initialize and test the system hardware prior to booting the OS and is stored in read-only memory. Once a certain level of hardware initialization has taken place, code written in higher-level languages can be used, but the code running immediately after power is applied is almost always written in assembly language. This is usually because system RAM may not yet be initialized at power-up, and assembly language can execute without explicit use of memory, especially in the form of a stack.
Computer systems vendors may charge high prices for compiler language runtime libraries, which virtually ensures that not every installation supports applications written in a particular language other than assembly language. Under this premise, assembly language is forced on independent software vendors to keep the prospective buyer's costs down; what is good from a software engineering viewpoint is often bad for business.
Assembly language is also valuable in reverse engineering, since many programs are distributed only in machine code form. Machine code is usually easy to translate into assembly language and examine carefully in that form, but very difficult to translate into a higher-level language. Tools such as the Interactive Disassembler make extensive use of disassembly for this purpose.
Assembly language is also the primary programming language of MenuetOS, a floppy-based system with a fully functional GUI. The author claims that only through assembly language could he produce his system in less than 1.4 megabytes.
=Books=
*[http://cs.smith.edu/~thiebaut/ArtOfAssembly/artofasm.html The Art of Assembly Language Programming], by Randall Hyde
*[http://www.computer-books.us/assembler.php Computer-Books.us], Online Assembly Language Books
