Using inline assembly language in C programs with mspgcc

mspgcc tries to be largely compatible with the other C language toolchains for the MSP430. Inline assembly language is one area where this is impractical. mspgcc uses the usual GCC syntax for inline assembly language, with a few extensions to deal with MSP430 specific issues. At first sight GCC's way of handling inline assembly language may seem a little more difficult to use than some of the alternatives. It is, however, generally more efficient and powerful than those alternatives.

Inline assembly language syntax

mspgcc supports the standard GNU inline assembler feature 'asm'. In an assembler instruction using 'asm', you can specify the operands of the instruction using C expressions. This means you need not guess which registers or memory locations will contain the data you want to use.

You must specify an assembler instruction template much like what appears in an assembler language, plus an operand constraint string for each operand. For example:

asm("mov %1, %0": "=r" (result): "m" (source));
This could also be written:
asm("mov %[src], %[res]": [res] "=r" (result): [src] "m" (source));
which may be clearer. Here 'source' is the C expression for the input operand, while 'result' is that of the output operand. The '=' indicates that the operand is an output. 'm' and 'r' are constraints, and indicate which addressing modes GCC may use for the operand. These constraints are fully documented in the GCC documentation.

Each asm statement is divided into four parts, by colons:

  1. The assembler instructions, defined as a single string constant:

    "mov %[src], %[res]"

  2. A list of output operands, separated by commas. Our example uses just one, and defines the identifier "res" for it:

    [res] "=r" (result)

  3. A comma separated list of input operands. Again, our example uses just one operand, and defines the identifier "src" for it:

    [src] "m" (source)

  4. The clobbered registers. This is left empty in our example, as nothing is clobbered.

So, the complete pattern is:
asm((string asm statement) : [outputs]:[inputs]:[clobbers]);
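
As a rough sketch of how the four parts fit together in a complete function, the following wraps a single add instruction. The function name add_words is hypothetical, and two things are assumptions rather than facts taken from this document: that the compiler predefines the __MSP430__ macro, and that the generic GCC "cc" clobber is the appropriate way to say that the status flags change. The plain C branch is only there so that the sketch can also be built and tried on a host machine.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + b using one MSP430 add instruction. */
static uint16_t add_words(uint16_t a, uint16_t b)
{
    uint16_t sum;
#if defined(__MSP430__)                 /* assumption: predefined by the MSP430 compiler */
    __asm__("add %[b], %[sum]"          /* 1. the instruction template */
            : [sum] "=r" (sum)          /* 2. outputs */
            : "[sum]" (a), [b] "r" (b)  /* 3. inputs; a starts in the output's register */
            : "cc");                    /* 4. clobbers; assumed: add changes the flags */
#else
    sum = (uint16_t)(a + b);            /* portable stand-in for off-target builds */
#endif
    return sum;
}
```

The "[sum]" input constraint follows the same matching-operand style as the other examples in this section: it asks GCC to load 'a' into the register chosen for the output before the instruction runs.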

Each input and output operand is described by a constraint string followed by a C expression in parentheses. msp430-gcc recognises a number of constraint characters, among them 'r' (a general register), 'm' (a memory operand) and 'i' (an immediate integer operand), as used in the examples here, and some other constraints which are common to all processors supported by GCC. These constraints cause the compiler to automatically generate preamble and postamble code, allocate registers, and save and restore anything necessary to ensure the assembly language is efficiently and compatibly handled. For example:
    asm("add %[bar],%[foo]"
        : [foo] "=r" (foo)
        : "[foo]" (foo), [bar] "m" (bar));
is equivalent to
foo += bar;
and will result in the following generated assembly language (assuming "foo" is a global variable)
    mov &foo, r12
/* #APP */
    add &bar, r12
/* #NOAPP */
    mov r12, &foo

If an asm statement has output operands but none of them is used afterwards, the compiler may consider the statement dead and delete it, so you will also need to specify 'volatile' for the 'asm' construct. If you are writing a header file that will be included in ANSI C programs, use '__asm__' instead of 'asm' and '__volatile__' instead of 'volatile'.
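
A minimal sketch of that situation: the asm statement below has one output operand whose value is never used, so it is marked volatile to stop the compiler from discarding it. The helper name dummy_read is hypothetical, and the __MSP430__ guard assumes the compiler predefines that macro; the plain C branch exists only so the sketch can be built off-target. The '__asm__' and '__volatile__' spellings are used so that the code would also be acceptable in a header included by ANSI C programs.

```c
#include <stdint.h>

/* Hypothetical helper: reads a 16-bit location only for the side effect
   of the read, then throws the value away. */
static void dummy_read(const volatile uint16_t *reg)
{
    uint16_t scratch;   /* written by the asm, never used afterwards */
#if defined(__MSP430__)                 /* assumption: predefined by the MSP430 compiler */
    __asm__ __volatile__("mov %[src], %[tmp]"
                         : [tmp] "=r" (scratch)
                         : [src] "m" (*reg));
#else
    scratch = *reg;                     /* portable stand-in for off-target builds */
#endif
    (void)scratch;                      /* the output is deliberately unused */
}
```

Without '__volatile__', GCC would be entitled to remove the whole statement, because nothing reads 'scratch' afterwards.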

A percent sign '%' followed by a digit, or by a defined tag in square brackets, makes GCC substitute the corresponding operand. For 4 and 8 byte operands, use the A, B, C and D modifiers to select the appropriate 16-bit chunk of the operand. For example:

#define LONGVAL 0x12345678l

{
    long a,b;
    ...
    asm("mov %A2, %A0 \n\t"
        "mov %B2, %B0 \n\t"
        "mov %A2, %A1 \n\t"
        "mov %B2, %B1 \n\t"
        : "=r" (a), "=m" (b)
        : "i" ((long)LONGVAL));
    ...
}
or
#define LONGVAL 0x12345678l

{
    long a,b;
    ...
    asm("mov %A[longval], %A[a] \n\t"
        "mov %B[longval], %B[a] \n\t"
        "mov %A[longval], %A[b] \n\t"
        "mov %B[longval], %B[b] \n\t"
        : [a] "=r" (a), [b] "=m" (b)
        : [longval] "i" ((long)LONGVAL));
    ...
}
This will result in something like the following generated assembly language, assuming 'a' and 'b' are declared within the block (the comments show the stores generated instead when 'b' is declared globally):
    ...
/* #APP */
    mov #llo(305419896), r12
    mov #lhi(305419896), r13
    mov #llo(305419896), 4(r1) ; mov #llo(305419896), &b
    mov #lhi(305419896), 6(r1) ; mov #lhi(305419896), &b+2
/* #NOAPP */
    mov r12, 0(r1)
    mov r13, 2(r1)
    ...

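So, %A selects the least significant 16-bit word of an operand and %B the next word up, whether the operand is a register pair (r12 and r13 above), a memory location (offsets 4 and 6 from r1) or a constant (#llo and #lhi).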
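
The word-selection modifiers can also be tried on a register-to-register copy. This is only a sketch: the name copy_long is hypothetical, the __MSP430__ guard assumes the compiler predefines that macro, and the plain C branch is there so the code can be built off-target. The '&' in the output constraint is the standard GCC early-clobber marker; it is used here because the first mov writes part of the output before the second mov has read the rest of the input, so the two operands must not share registers.

```c
#include <stdint.h>

/* Hypothetical helper: copies a 32-bit value one 16-bit word at a time. */
static uint32_t copy_long(uint32_t value)
{
    uint32_t out;
#if defined(__MSP430__)                 /* assumption: predefined by the MSP430 compiler */
    __asm__("mov %A[in], %A[out] \n\t"  /* low 16-bit word */
            "mov %B[in], %B[out] \n\t"  /* high 16-bit word */
            : [out] "=&r" (out)         /* early clobber: keep out apart from in */
            : [in] "r" (value));
#else
    out = value;                        /* portable stand-in for off-target builds */
#endif
    return out;
}
```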

The I, J, K and L modifiers are similar, except they add 1 to an address or register. They should only be used in zero_extendMN operations.

There is also a %E modifier, which substitutes Rn from (mem:xx (reg:xx n)) as @Rn. This is a useful modifier for the first element on the stack or for pointers. !!! Do not use this unless you know exactly what you are doing !!!