The GCC6809 Manual

Introduction to GCC6809

This is the manual for GCC6809, the port of the GNU C Compiler for the Motorola 6809 processor. This document describes version 4.3.4-1. It does not cover all aspects of GCC in general, for which you should refer to the regular GCC manual.

GCC6809 is developed by Brian Dominy <brian@oddchange.com> and is licensed under the GNU General Public License version 3 or later. The latest version of the software can be obtained at http://www.oddchange.com/gcc6809.

1 Installation

This chapter explains how to build and install GCC6809.

GCC6809 is distributed as a source patch against the mainline GCC sources. It has not been integrated into the official releases, so you must first obtain the regular GCC sources, and then apply a patch which adds all of the 6809-related content. Then you build and install the compiler itself.

1.1 Patching the Sources

Because GCC6809 is just a patch file, you'll first need to obtain an official release of GCC in source form, unpack it, and apply the 6809 patch to produce the final source tree. In order to build the cross-compiler, you'll need a working native compiler and some other tools that are needed to build GCC, like flex and bison.

You can obtain the regular GCC sources from http://gcc.gnu.org/mirrors.html, which has links to many download locations. Look for a file named gcc-4.3.4.tar.bz2 or similar. Make sure that the source tarball and the patch have the same version number, aside from the patchlevel which follows a dash. Once you have the GCC source tarball and the GCC6809 patch, do the following:

  1. First, extract the contents of the source tarball, using gunzip or bunzip2. For example,
              bzcat gcc-4.3.4.tar.bz2 | tar xvf -
    

    This will create a new directory named gcc-4.3.4 which has all of the pristine sources.

  2. Apply the 6809 patch using the patch command. You need to be in the newly created directory, and you need to use the -p1 option to patch:
              cd gcc-4.3.4
              patch -p1 < /path/to/gcc-4.3.4-1.patch
    

    You'll see a few messages saying that some files are added/modified. You should not see any conflicts or errors.

1.2 Building the Compiler

Once the patch is installed, you're ready to build the compiler.

Building a cross-compiler can be a daunting task. To aid with this, the patch includes a simple Makefile with some basic command sequences for doing the most common things. This Makefile is installed in the build-6809 subdirectory. The easiest way to build and install GCC6809 is to switch to this directory and run make. In most cases, this should be all you need.

This convenience Makefile has some other targets that you can invoke as well:

asm
Build only the ASXXXX programs.
asm-clean
Clean the ASXXXX programs.
asm-install
Install the ASXXXX programs.
binutils
Install the wrapper scripts for the ASXXXX programs.
clean
Clean all of the compiler object files, but keep the output from configure.
config
Only configure the compiler, but do not build it.
distclean
Like clean, but also remove the configuration. This will force a complete rebuild on the next make attempt.
everything
Clean, rebuild, and reinstall everything, including the assembler tools, the frontend scripts, and the compiler, in one step. This is the easiest way to build everything from scratch, although it may take a little longer than necessary.
test
Run the GCC testsuite. This requires that you downloaded the GCC source tarball which included the testsuite files and not just the compiler sources.

You can further control the build in a variety of ways by setting environment variables before invoking make. This is not normally needed, however.

AS_VERSION
Overrides the version of the ASXXXX programs to use. Use this if you have your own version of these tools which you prefer to install instead. The default is to use the version included in the source patch.
BUILD_DEBUG_CFLAGS
Overrides the CFLAGS used for building the compiler itself. The default builds an optimal compiler without debugging support. You would only want to change this when debugging the compiler.
GCC_LANGUAGES
By default, only the C frontend is built. You can build a C++ compiler as well by setting GCC_LANGUAGES=c,c++. However, note that the standard C++ library is not supported. Other GCC frontends like Ada and Fortran are not supported.
MAKEFLAGS
This is a special variable understood by make. In particular, you might want to set MAKEFLAGS=-j3 if you have a multiprocessor build machine. This lets make run up to three jobs at once, which is enough to keep two processors busy. See the make documentation for more information.
prefix
Specifies where to install the compiler. The default is /usr/local. This value is used during the configuration and install steps.
SUDO
By default on Linux, the Makefile will try to invoke sudo when installing anything, which normally requires root privileges. If you are installing to a user subdirectory, this is not necessary. Say SUDO="" in such cases, or when sudo is not installed for some reason. On Mac OS X and Cygwin, sudo is never tried.
target_vendor
Specifies the vendor variation. The default, ‘unknown’, builds a generic compiler that will generate code for any platform. Use ‘coco’ when building a compiler to generate user programs for the Tandy Color Computer.
target_os
Specifies the operating system which generated programs will run on. The default, ‘none’, builds a compiler that will not make any OS assumptions. In particular, the startup and termination code generated will assume bare-metal: no operating system at all.

1.3 The Assembler Tools

GCC itself is only a compiler, which translates C programs into assembly language. Additional programs are required to turn the assembly language into executable code. In the GNU world, these programs are called the binutils. They mainly consist of three programs: an assembler, a linker, and a library manager, also called an archiver.

The regular binutils distribution does not support the 6809 processor. GCC6809 uses ASXXXX, which provides all of these capabilities. These programs are included with the 6809 source patch, and are named as6809, aslink, and ar6809. The archiver is not part of the current ASXXXX distributions but was developed for an ancient version of GCC6809.

Because the ASXXXX programs have different command-line options than their GNU counterparts, some wrapper scripts are included with GCC6809. These scripts, named ar, as, and ld, accept the standard GNU options, and then invoke the underlying ASXXXX programs with their corresponding options. You should normally never need to invoke as6809, etc. directly.

1.4 Target Types

As a compiler, GCC6809 will work for any type of hardware or operating system with no modifications. However, if you want to generate binaries in a particular format, this is system-dependent. GCC can be modified to support these variations through the use of target types.

The default target type is ‘m6809-unknown-none’. The middle component is the vendor, which refers to the hardware. The last component is the os, which describes the software running on that hardware.

‘m6809-unknown-none’ actually targets the exec09 simulator, which runs freestanding (non-OS) programs. The implications of using this target type are as follows:

If you are compiling for a platform other than the simulator, then much if not all of this is incorrect. You have two basic options:

  1. Don't use gcc to link your programs; use ld directly. This will avoid bringing in the predefined functions and interrupt table. You can use the --section-start option to specify the base address of any program section. If you are writing a freestanding program (one that does not require an OS), you will probably need to write a vector table.
  2. Define a new target type, and modify GCC to generate the binaries correctly. You'll need to change gcc/config/m6809/crt0.S for the startup sequence, binutils/ld for the default linker options, and config.sub to recognize the target type name. The makefile fragments in gcc/config/m6809 will probably need changing as well.

    A secondary target type, ‘m6809-coco-none’, has been partially written to show how this can be done.

2 Data Types

An int is 16 bits, or 2 bytes wide. short or char can be used to refer to an 8-bit quantity. Plain char is unsigned.

A long is 32 bits, or 4 bytes wide. Longs are never kept in registers more than necessary, as there simply aren't enough registers.

There is no support for long long.

Optionally, you can make integers 8 bits wide, by using the -mint8 command-line option. This also shortens the size of "long" to 16 bits. It does not affect short or char. It is strongly recommended that you don't do this unless you know what you are doing! When using this option, you can't rely on the C library to work correctly, because the C standard explicitly does not permit this behavior. However, you can get more optimal code generation with it if you are using a lot of 8-bit data.

All pointers are 16 bits wide. (A pointer is not necessarily the same size as int, if you use -mint8.)

Floating point is supported although it is not thoroughly tested outside of the normal GCC testsuite. FP performance is weak because 32-bit math is weak.
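
If your code depends on the sizes above, one way to catch a mismatch early is to check them at compile time. The following is only a sketch: the BUILD_ASSERT macro and the function names are made up for this example, and it relies on the __m6809__ and __int8__ macros listed under Predefined Macros. On any other compiler the size checks are skipped and only the two helper functions are built.

```c
#include <limits.h>

/* Compile-time assertion that needs no C11 support: the array size
   becomes -1, which is an error, when the condition is false.
   (BUILD_ASSERT is a hypothetical helper, not part of GCC6809.) */
#define BUILD_ASSERT(name, cond) \
    typedef char build_assert_##name[(cond) ? 1 : -1]

#if defined(__m6809__)
BUILD_ASSERT(pointer_is_16_bits, sizeof(void *) == 2);
#if defined(__int8__)
/* Sizes described above for -mint8. */
BUILD_ASSERT(int_is_8_bits, sizeof(int) == 1);
BUILD_ASSERT(long_is_16_bits, sizeof(long) == 2);
#else
/* Default sizes described above. */
BUILD_ASSERT(int_is_16_bits, sizeof(int) == 2);
BUILD_ASSERT(long_is_32_bits, sizeof(long) == 4);
#endif
#endif

/* Nonzero when plain char is unsigned, as this chapter says it is
   on the 6809. */
int plain_char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}

/* Nonzero when a pointer can be stored in an int without truncation.
   With -mint8 this is expected to be false. */
int pointer_fits_in_int(void)
{
    return sizeof(void *) <= sizeof(int);
}
```

Code that casts between pointers and int can test pointer_fits_in_int() (or the __int8__ macro directly) before doing so.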

3 Calling Conventions

GCC6809 supports several different calling conventions: ways of using registers to make function calls. You can explicitly select one via the -mabi_version option, or you can take the default, which will generally be the most efficient method implemented to date.

If developing a standalone program that doesn't interact with any libraries or precompiled sources, the default is probably the best option. However, when integrating with existing code, you probably need to use the conventions already in place.

3.1 Function Arguments

The default convention, called ‘regs’, will place the first 8-bit argument in the B register, the first 16-bit argument in the X register, and all other arguments on the stack in reverse order.

An alternate method, called ‘stack’, places all arguments on the stack. (Note: a function that accepts a variable number of arguments is always treated this way, in any mode). This was the default in very early GCC6809 incarnations and can be specified via -mabi_version=stack.

In -mint8 mode, 8-bit arguments are not promoted. In the default -mint16 mode, 8-bit arguments are promoted to 16 bits when passed.

The reason that X is used for 16-bits is that most 16-bit values tend to be pointers instead of simple data values. Using X right away allows access to all of the addressing modes on the 6809. (Earlier versions of the compiler used 'D', and the generated code contained lots of tfr d,x instructions.)
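
As an illustration of the ‘regs’ convention, here is a small function with one pointer argument and one 8-bit argument. The function name is hypothetical, and the register comments only restate the rules given above; they are what you would expect to see, not a guarantee. Compile with -save-temps and read the .s file to confirm what the compiler actually does for your build.

```c
/* Expected argument placement under the default 'regs' convention:
     buf  (first 16-bit argument) -> X
     len  (first 8-bit argument)  -> B
   Any further arguments would be passed on the stack. */
unsigned char sum_bytes(const unsigned char *buf, unsigned char len)
{
    unsigned char sum = 0;

    /* buf is a pointer, so having it in X from the start lets the
       loop use indexed addressing without a register transfer. */
    while (len--)
        sum += *buf++;

    return sum;
}
```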

3.2 Function Return Values

An 8-bit return value is placed in B; a 16-bit return value is placed in X. If you want 16-bit return values in D, use the -mdret option.

Larger values, including structures, are returned by having the caller pass in a pointer to the location where the result would be copied.
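
The sketch below shows the two ways of writing a function that produces a structure. The type and function names are invented for the example. Going by the description above, the first form is compiled roughly as if it had been written like the second: the caller supplies the address where the result is stored.

```c
struct point {
    int x;
    int y;
};

/* Returns the structure by value. The caller is expected to pass a
   hidden pointer to the place where the result should be copied. */
struct point make_point(int x, int y)
{
    struct point p;

    p.x = x;
    p.y = y;
    return p;
}

/* The same operation with the result pointer written out explicitly. */
void make_point_into(struct point *out, int x, int y)
{
    out->x = x;
    out->y = y;
}
```

The explicit form can be preferable when the caller already has storage for the result, since it makes the copy visible in the source.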

4 Registers

The next few sections describe how the compiler uses registers for purposes other than function arguments and return.

4.1 Register Clobbering

The B and X registers are assumed to be clobbered by any function calls, since return values and argument values may be placed there. Temporary values are never held in these registers across a function call; they will be moved to the stack or soft registers instead.

The Y and U registers are assumed to be preserved across a function call. Thus, if a function wants to use those registers, it will save/restore them. GCC does this in the prologue/epilogue for each function when it is necessary. The only exception is a function that has been tagged as naked; no save/restore code is generated for it.

If you are embedding assembly language inside C functions, or calling between C and assembly language functions, you should understand this behavior carefully. You cannot assume that a value in register D or X has the same value after making a function call. If you depend on the value of the CC or DP registers, you must deal with that explicitly, as GCC does not track them at all.

4.2 Register Allocation

The 6809 has a fairly small register set. This makes it tough for GCC to generate optimal code, because its internal algorithms assume the availability of plenty of registers. This design choice works well for modern day processors, though. Very rarely, some complex expressions may cause the compiler to abort because there aren't enough registers for temporaries.

GCC does not ever use the 'A' accumulator for temporaries. GCC treats D as a generic, 16-bit register, and renames it to B when only 8 bits of it are relevant. Even when no 16-bit math is required, GCC still cannot use 'A' as a separate register from 'B'. (This is a limitation which, if removed, would bring some nice performance enhancements for 8-bit calculations in -mint8 mode.) The 'A' register is used internally for some canned instruction sequences, but it cannot serve as a general-purpose 8-bit register.

You can use the 'A' register in assembly macros. This is useful for time-critical code in which using the register is faster than using temporaries on the stack. You must be sure that GCC does not use the 'D' register while doing this, though. Use the asm attribute on a data declaration to do this.

The U and Y registers are used as general-purpose, 16-bit registers, and can be allocated for local variables. Note that 8-bit local variables cannot be assigned to these registers. In some cases, you can get better performance for a local by extending its type (manually) from 8-bit to 16-bit to force allocation to one of these registers. U is always preferred over Y since instructions using Y are all one byte longer, and thus take one cycle longer, too.
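
Here is a sketch of the widening trick. The function name and the table size of 16 are made up for the example. The loop index never exceeds 16, so an 8-bit type would hold it, but it is declared as a 16-bit unsigned int so that it becomes a candidate for U or Y. Whether this is actually faster depends on the surrounding code, so compare the assembler output of both versions.

```c
#define TABLE_SIZE 16   /* hypothetical table length */

/* Returns the index of the first entry equal to 'wanted',
   or 0xFF if there is none. */
unsigned char find_byte(const unsigned char *table, unsigned char wanted)
{
    /* Widened on purpose: as an unsigned char this local could not
       be allocated to U or Y. */
    unsigned int i;

    for (i = 0; i < TABLE_SIZE; i++) {
        if (table[i] == wanted)
            return (unsigned char)i;
    }
    return 0xFF;
}
```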

The S register refers to the program stack and is used to reference local variables, function arguments, and compiler-generated temporaries.

4.3 Soft Registers

If you have a need for lots of registers, due to complex calculations or large functions, GCC tends not to generate the best code. However, it can be improved by enabling "soft registers". These are memory locations, accessed in the direct addressing mode, which appear to GCC as registers.

GCC6809 can use up to 8 soft registers; the memory locations are named *m0 through *m7. Use the -msoft-reg-count option to enable their usage when compiling a file.

Most of the passes of GCC expect to have a large number of registers. When the registers run out, any remaining values must be "spilled" to the stack. This creates several problems in embedded environments. First, stack load/store instructions can be slow, because they require indexed addressing. Second, systems may have little RAM. Third, in multiprocessing systems which need to save/restore the stack, keeping the stack smaller will produce faster code.

Soft registers help in all of these areas. Direct mode accesses are shorter in both length and execution time. Because they are global, and not relative to the stack frame, less overall memory is used. Finally, because GCC is unaware that these registers are actually in memory, it can still perform some optimizations on values in these locations which aren't possible once values are spilled to the stack (just because of the way GCC is written).

Note that all soft registers are considered call-clobbered. GCC6809 will not save/restore any soft registers in the function prologue/epilogue, as it would be too slow. Consequently, values can't be kept in soft registers across a function call.

You do not need to declare the memory locations for the soft registers. They will be generated for you. They are part of the GCC runtime library.

5 Interrupts

Ordinary C functions can be used to implement interrupt handlers. These functions should be declared with the interrupt attribute to force emitting an rti instruction at function exit.

The interrupt vector table can be declared as an ordinary structure of function pointers; use the section attribute to force it to be placed into a separate section, such as "vector". This section should then be mapped to address 0xFFF0 at link time.

When using the default target ‘m6809-unknown-none’, which builds for the simulator, you will automatically get an interrupt table with default vectors. When compiling for actual hardware, you should override this definition with your own vectors.
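
The following sketch puts these pieces together for a hardware target. The macro, structure, and function names are all hypothetical. The field order follows the 6809 vector layout from 0xFFF0 to 0xFFFE: reserved, SWI3, SWI2, FIRQ, IRQ, SWI, NMI, RESET. The attributes are hidden behind macros that expand to nothing on other compilers, so the same file can also be built natively for testing. The reset entry here is only a placeholder; on real hardware it would point at your startup code.

```c
#if defined(__m6809__)
#define ISR        __attribute__((interrupt))
#define IN_VECTORS __attribute__((section("vector")))
#else
#define ISR
#define IN_VECTORS
#endif

typedef void (*vector_fn)(void);

/* One function pointer per vector, lowest address first. */
struct vector_table {
    vector_fn reserved;  /* 0xFFF0 */
    vector_fn swi3;      /* 0xFFF2 */
    vector_fn swi2;      /* 0xFFF4 */
    vector_fn firq;      /* 0xFFF6 */
    vector_fn irq;       /* 0xFFF8 */
    vector_fn swi;       /* 0xFFFA */
    vector_fn nmi;       /* 0xFFFC */
    vector_fn reset;     /* 0xFFFE */
};

volatile unsigned char tick_count;

/* Counts IRQs; the interrupt attribute makes it end with rti. */
ISR void irq_handler(void)
{
    tick_count++;
}

/* Handler for vectors this program does not use. */
ISR void unused_handler(void)
{
}

/* Placeholder for the real startup code. */
void reset_entry(void)
{
    for (;;) {
    }
}

/* Placed in the "vector" section, which must be mapped to 0xFFF0
   at link time. */
IN_VECTORS const struct vector_table vectors = {
    unused_handler,  /* reserved */
    unused_handler,  /* swi3 */
    unused_handler,  /* swi2 */
    unused_handler,  /* firq */
    irq_handler,     /* irq */
    unused_handler,  /* swi */
    unused_handler,  /* nmi */
    reset_entry      /* reset */
};
```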

6 Libraries

The ar program supplied with GCC6809 is not very robust. It has some limitations which you should be aware of.

6.1 The GCC Runtime

Certain instruction sequences are so lengthy that the most efficient way to generate code is to use a subroutine. For example, 16-bit multiplication requires at least 9 instructions, and code that performs lots of multiplication would be very large if these were all done inline.

After building the compiler itself, the build procedure also compiles a target library, named libgcc.a, which contains these types of functions. GCC may always emit calls to this library, so it is required. Many 6809 code sequences have been added to this; see the file gcc/config/m6809/libgcc1.s for its contents.

If you use ld to link instead of the gcc driver, you will need to make sure to pass the -lgcc option to link in this library yourself. Failure to do so may result in undefined functions if anything in it is required.

6.2 The C Library

A port of newlib, an implementation of the Standard C Library, is available for separate download. Like the GCC runtime, some of the functions are system-dependent. Target types are used to build the library, and the default target type produces programs which work under the simulator.

When linking with gcc, the -lc option is automatically added so that C library function calls will be resolved, unless you link with the -nostdlib option. If you don't have the C library, you probably also want to compile with -ffreestanding, which prevents GCC from emitting calls to some C functions like memcpy.

6.2.1 Machine Subroutines

The directory machine/m6809 contains processor-specific functions. The 6809 port adds only two functions here: implementations of setjmp and longjmp.

6.2.2 System Functions

The directory sys/m6809sim contains the UNIX system call functions, such as write and kill. For the simulator, most of these are stubs. The file-related functions do support standard input and standard output, and read/write special I/O addresses which the simulator understands. The process-related functions do nothing and assume a single-tasking environment.

There is a sys/coco implementation as well that is used when building the ‘m6809-coco-*’ target type.

7 Startup and Termination

A C program traditionally begins with a call to its main() function. However, all programs actually start at a system-dependent function that runs prior to main, which performs some setup first. Likewise, when returning from main or calling exit(), there is some shutdown code that is also run.

The startup and termination code is part of the GCC runtime, and is defined in the object file crt0.o. What this code should do is completely dependent on the target type.

If you want to use GCC6809 for a particular OS with special start and exit requirements, you should consider modifying the source for crt0.o, gcc/config/m6809/crt0.S, to support your platform.

8 Bank Switching

Because the 6809 has a native 16-bit address space, only 64KB of code and data can be addressed at a time. On most platforms, some of this space is reserved for I/O, limiting the amount that can be used by the compiler even further.

Many architectures allow for some form of bank switching, where a larger amount of memory exists, but only portions of it are accessible at a time. GCC6809 has some builtin support to make it easier to work with systems that support bank switching. However, currently it can only help with switching out code pages. Data accesses across different banks are not handled automatically, and require explicit instructions from the programmer.

8.1 Hardware Requirements

Bank switching can only work when the underlying hardware supports it. GCC does not care how the hardware works, aside from a few rules:

8.2 Declaring and Defining Banked Functions

When an ordinary jsr instruction will not suffice, GCC can emit what is called a far call. To determine when a far call is needed, GCC must know the bank number (if any) of the caller, and that of the callee.

The caller bank is identified from the -mfar-code-page command-line option. All functions in the same source file are assumed to be in the bank number identified by this option. If not stated, then all of the functions are assumed to be in a fixed section.

The callee bank is identified by using the far attribute on the declaration of that function. Remember that declarations occur in header files, not in source files. If you do not provide a prototype for a function, GCC assumes it is in a fixed section.
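
A declaration of a banked function might look like the sketch below. The function names and the bank number 3 are invented for the example, and the FAR_PAGE macro is a hypothetical wrapper that lets the file compile with other compilers too. In a real project the declaration would sit in a header, and the definition would live in its own source file compiled with -mfar-code-page=3; they are shown together here only to keep the example self-contained.

```c
#if defined(__m6809__)
#define FAR_PAGE(n) __attribute__((far(n)))
#else
#define FAR_PAGE(n)
#endif

/* Header part: tells every caller that this function lives in
   bank 3. The bank number is written as a string. */
FAR_PAGE("3") int sound_volume_step(int level);

/* Definition, normally in a separate file built for bank 3.
   It takes a single 16-bit argument, so nothing needs to be
   passed on the stack. */
int sound_volume_step(int level)
{
    return (level < 15) ? level + 1 : 15;
}

/* A caller in a fixed section or a different bank. GCC decides
   from the declaration whether a far call is required. */
int louder(int level)
{
    return sound_volume_step(level);
}
```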

8.3 Making Far Function Calls

When GCC emits a call to a function in a banked section, and the caller is not in the same bank, it uses a special call sequence which looks like this:

     jsr __far_call_handler
     .dw target_function
     .db target_bank

Here, target_function is the function being called, and target_bank is the bank number where that function is located, according to the far attribute in its declaration.

The function __far_call_handler is system dependent. GCC6809 does not provide a default farcall handler; you must write it yourself. This function should:

It is practically impossible to write the function in C to meet all of these requirements; writing it in assembly language is much easier.

Here is the far call handler used on the FreeWPC platform:

          .area direct
     __far_call_address:
          .blkb 2

          .area .text
          .globl __far_call_handler
     __far_call_handler:
          pshs  b,u,x                 ; Save all registers used for parameters
          ldu   5,s                   ; Get pointer to the parameters
          ldx   ,u++                  ; Get the called function offset
          ldb   ,u+                   ; Get the called function page
          stu   5,s                   ; Update return address
          stx   *__far_call_address   ; Move function offset to memory
          lda   IO_BANK_REGISTER      ; Read current bank register value
          stb   IO_BANK_REGISTER      ; Set new bank register value
          puls  b,u,x                 ; Restore parameters
          pshs  a                     ; Save bank switch value to be restored
          jsr   [__far_call_address]  ; Call function
          puls  a                     ; Restore A
          sta   IO_BANK_REGISTER      ; Restore bank register
          rts

GCC does not emit the far call sequence when the caller and callee are both in the same bank; in that case, the bank switch register is already correct.

GCC will raise a fatal error if one or more of a far call target's parameters needs to be pushed onto the stack. The thinking is that the farcall handler cannot save the previous value of the bank switch register without also pushing onto the stack, which causes confusion about where the arguments are actually located. If your farcall handler doesn't use the call stack (for example, it saves into a separately declared stack), then you can turn off this error by compiling with -mfar-stack-param.

9 Command-Line Options

In addition to all of the standard GCC command-line options, these new ones have been added for the 6809:

-mint8
Says that int should be 8 bits wide, instead of the default 16 bits. You can explicitly request 16-bit integers with -mint16.
-mnodirect
Says not to use the direct addressing mode, even if a section named direct is used.
-mdret
Uses the D register instead of the X register for 16-bit return values.
-mwpc
Enables the Williams Pinball Controller (WPC) extensions. This allows certain bit shifting operations to make use of the ASIC on WPC boards which can do the arithmetic faster than native 6809 code.
-mfar-code-page=val
Sets the code page, or bank number, associated with the code in this file.
-mcode-section=name
Sets the default section (area) name used for code. The default is .text.
-mdata-section=name
Sets the default section (area) name used for initialized data. The default is .data.

The section for any data item can be overridden by using a section attribute.

-mbss-section=name
Sets the default section (area) name used for uninitialized data. The default is .bss.

The section for any bss item can be overridden by using a section attribute.

-mabi-version=name
Selects the ABI to use, which governs how registers are used for function calls. Valid choices are:
stack
All arguments are placed on the stack.
regs
bx
The first 8-bit argument is placed in the B register, and the first 16-bit argument is placed in the X register.
latest

-msoft-reg-count=val
Specifies the maximum number of soft registers that may be generated.
-m6309
Enables 6309 extensions. Not fully implemented. This requires support for the 6309 in the assembler, which is not present yet.

Of the many standard GCC options, the following may be particularly useful for the 6809:

-fpic
Generate position-independent code (PIC). All function calls will use the lbsr instruction; all branches will use bra or lbra instead of jmp, and all data accesses will use PC-relative addressing. Use this option to compile library functions when the memory location for the code and data is not known at link time.
-save-temps
Save the assembler output in a .s file, in addition to generating an object file. This is useful for seeing what the compiler is doing.

10 Predefined Macros

The following macros are predefined by the compiler based on command-line options:

__m6309__, __M6309__
Defined when compiling for the 6309 instruction set.
__m6809__, __M6809__
Defined when compiling for the 6809 instruction set.
__int8__
Defined when integers are 8 bits long.
__int16__
Defined when integers are 16 bits long.
__ABI_STACK__
Defined when using the ‘stack’ ABI.
__ABI_REGS__
Defined when using the ‘regs’ ABI.
__ABI_BX__
Defined when using the ‘BX’ ABI.
__WPC__
Defined when the WPC extensions are enabled.
__DRET__
Defined when using the D register to return 16-bit values.
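
These macros can be tested with the preprocessor to adapt source code to the options in effect. The sketch below uses only macros from the list above; the function names are hypothetical. When none of the macros is defined, for example when the file is built with a native compiler, the functions fall back to neutral values.

```c
/* Name of the calling convention selected on the command line. */
const char *abi_name(void)
{
#if defined(__ABI_STACK__)
    return "stack";
#elif defined(__ABI_REGS__)
    return "regs";
#elif defined(__ABI_BX__)
    return "bx";
#else
    return "unknown";
#endif
}

/* Width of int according to the predefined macros, or 0 when
   neither macro is defined. */
int configured_int_bits(void)
{
#if defined(__int8__)
    return 8;
#elif defined(__int16__)
    return 16;
#else
    return 0;
#endif
}

/* Nonzero when compiling for the 6809 or 6309 instruction set. */
int target_is_6x09(void)
{
#if defined(__m6809__) || defined(__m6309__)
    return 1;
#else
    return 0;
#endif
}
```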

11 Attributes

Attributes are a GCC extension that allows declarations to be annotated with additional properties outside of the C language specification. GCC6809 defines some new attributes for features that are helpful on 6809 machines.

To apply an attribute to a function declaration/definition, use the following syntax:

__attribute__((attr_name)) void f (...);

If an attribute takes parameters, then use the following form:

__attribute__((attr_name(arg))) void f (...);

Here are the 6809-specific attributes:

far
Defines the code bank number on a function declaration. Although it must be numeric, write the argument as a string inside double quotes here.
interrupt
Marks a function as an interrupt handler.
naked
Says not to generate any prologue or epilogue code. Use only when you are sure that such code is not required. If the function actually does use a call-preserved register or requires local stack space, this may cause unpredictable behavior.

Here are some generic GCC attributes which are particularly useful on the 6809:

noreturn
Declares that a function does not return. If such a function ends by making a function call, the compiler will emit a jmp instruction for this rather than a jsr.
section
Overrides the default section in which an object is located. This can be used for both code and data.

The name direct is special and should only be used for data declarations. Direct variables are intended to be linked into the zero page. GCC will prefix the variable name with an asterisk, which tells the assembler to use the direct addressing mode. This generates a shorter and faster instruction. Direct variables should be used sparingly, and only for the most frequently used variables.

It is up to you to ensure that direct variables actually get linked correctly, and that the DP register points to the region where they are located. This happens automatically when using the simulator; DP is initialized to 0 in the startup code, and the variables are linked starting at address 0x0000.
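
As a sketch, a frequently updated counter could be placed in the direct section like this. The DIRECT macro and the variable and function names are hypothetical; the macro expands to nothing on other compilers so that the file still builds natively. Only the declaration changes, and the code that uses the variable stays ordinary C.

```c
#if defined(__m6809__)
#define DIRECT __attribute__((section("direct")))
#else
#define DIRECT
#endif

/* Updated on every frame, so it is a reasonable candidate for
   the direct page. */
DIRECT unsigned char frame_counter;

/* Accesses to frame_counter are expected to use the direct
   addressing mode when compiled by GCC6809. */
void next_frame(void)
{
    frame_counter++;
}

unsigned char frames_elapsed(void)
{
    return frame_counter;
}
```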

12 Builtins

Some 6809-specific opcodes can be accessed through C-like functions, called builtins. These are sometimes called intrinsics in other compilers.

As an alternative to these functions, you could also just use inline assembler statements.

__builtin_swi()
Emits the swi instruction.
__builtin_swi2()
Emits the swi2 instruction.
__builtin_swi3()
Emits the swi3 instruction.
__builtin_cwai()
Emits the cwai instruction.
__builtin_sync()
Emits the sync instruction.
__builtin_nop()
Emits the nop instruction.
__builtin_blockage()
Does not emit an instruction, but inserts an invisible "barrier". The compiler will not optimize the code by moving instructions across the barrier.
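
A possible use of two of these builtins is a loop that idles until an interrupt handler signals that data has arrived. This is a sketch only: the wrapper macros and the names data_ready and wait_for_data are invented, and the builtins are assumed to be usable as plain statements taking no arguments, as the list above suggests. On other compilers the wrappers expand to nothing.

```c
#if defined(__m6809__)
#define wait_for_interrupt() __builtin_sync()
#define compiler_barrier()   __builtin_blockage()
#else
#define wait_for_interrupt() ((void)0)
#define compiler_barrier()   ((void)0)
#endif

/* Set to nonzero by an interrupt handler (not shown). */
volatile unsigned char data_ready;

/* Blocks until data_ready is set, then clears it. */
void wait_for_data(void)
{
    while (!data_ready)
        wait_for_interrupt();   /* sync: stop until an interrupt occurs */

    /* Keep the compiler from moving later code above this point. */
    compiler_barrier();
    data_ready = 0;
}
```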

13 The Simulator

The exec09 simulator lets you run 6809 programs on your UNIX-like machine. It was developed to test that the compiler output was correct. The simulator is distributed separately from the GCC6809 patch.

The simulator is configured using Autoconf like most of the standard GNU tools.

13.1 Command-Line Options

13.2 Interactive Debugging

13.3 The Simple Architecture

simple is the default architecture and is the one that the ‘m6809-unknown-none’ targets. It simulates a machine which:

13.4 The EON Architecture

EON is a more advanced machine which was developed to experiment with complex operating systems on the 6809.

13.5 The WPC Architecture

WPC is the hardware used by many pinball machines during the 1990s. The WPC simulation is primitive compared to that of existing emulators like PinMAME.

14 Helpful Hints