\chapter{Porting Mach to the ARM}


\section{Goal}
The three components critical to getting the GNU system
running on a new type of machine are GCC (the GNU C
Compiler), binutils (the GNU binary utilities) and Mach.
GCC and binutils are already widely ported, so the obvious
next step is to assess the portability of Mach.  Mach as
distributed by GNU supports only the i386 (and similar)
processors.  In order to determine how easy it is to port
Mach, I decided to attempt to port it to the ARM
architecture, which I am reasonably familiar with programming.


\section{History of Mach}

Mach was first described in a paper \cite{machorig} presented to
Usenix in 1985.  It was developed at Carnegie Mellon University to
form the base for their operating system research.  It was
initially built inside 4.2BSD, with the BSD components being
replaced by Mach components as they were completed.  When 4.3BSD
was released, the remaining BSD components were updated.  The first
release of Mach, Release 0, took place in 1987.  Several more
releases followed until Release 3 in 1990, by which stage the
BSD components had all been removed from the kernel to run as
a single server on top of the `bare' Mach microkernel.  The
University of Utah took over development of Mach in 1995 to
form the basis of their research into operating systems, and
added new features:

\begin{itemize}
\item Additional device drivers (ported from Linux)
\item Migrating Threads
\item Kernel Activations
\item Presentation/Interface RPC
\end{itemize}

They released Mach 4 in 1996.  They have since continued
their research with a project called Fluke.  GNU now distribute a
version of Mach that is based upon the Mach 4 release from Utah.

Mach has been used as the basis for commercial operating
systems: Version 2.5 was used as the basis of both the NeXTStep
operating system and OSF/1 Unix.  The OSF
still maintain a separate copy of Mach, now based on Mach
Version 3.  Recently, the OSF have ported Linux to run on
top of their version of Mach.

\includegraphics*[0mm,0mm][180mm,170mm]{hist.ps}


\section{Support for the ARM}

Most GNU tools support the ARM, including GCC, binutils and
glibc\footnote{Currently only in development versions.}.


\section{ARM support for Mach features}

Mach can be thought of as providing a virtual machine to tasks
which run on top of it.  This section shows how some of the
concepts which are part of the virtual machine can be
implemented on the ARM architecture.

\subsection{Tasks and Threads}

In Mach, the notion of a process is split into a \emph{task}, which
is a container for all the resources that are allocated to the
process, and \emph{threads}, which act as points of control for the
process.  A thread is therefore a very lightweight entity, consisting
only of its register state, some thread-specific communication
port rights, scheduling state and any statistics which the kernel
is collecting.  Tasks contain much more state: they contain
threads, have an associated address space, hold a set of port
rights and can intercept system calls made by threads.
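
The split between the two kinds of state can be sketched in C.  This
is only an illustration under my own assumptions: the structure
layouts, the field names and the helper \verb|task_add_thread| are
hypothetical and are not taken from the Mach sources.

\begin{verbatim}
#include <stddef.h>
#include <stdint.h>

#define ARM_NUM_REGS 16   /* r0-r15 */

struct task;

/* Per-thread state: only what is needed to resume the thread. */
struct thread {
    uint32_t       regs[ARM_NUM_REGS]; /* saved registers        */
    uint32_t       psr;                /* saved status register  */
    void          *thread_ports;       /* thread-specific rights */
    int            priority;           /* scheduling state       */
    uint64_t       run_ticks;          /* kernel statistics      */
    struct task   *task;               /* owning task            */
    struct thread *next_in_task;       /* next thread in task    */
};

/* Per-task state: resources shared by all of its threads. */
struct task {
    void          *address_space;      /* opaque memory map      */
    void          *port_rights;        /* opaque port name space */
    struct thread *threads;            /* list of threads        */
    size_t         thread_count;
};

/* Attach a thread to a task so it shares the task's resources. */
void task_add_thread(struct task *t, struct thread *th)
{
    th->task = t;
    th->next_in_task = t->threads;
    t->threads = th;
    t->thread_count++;
}
\end{verbatim}

A thread carries little more than a register save area, while
everything shared hangs off the task, which is why adding a second
thread to a task is cheap compared with creating a second task.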

%\ Communications Ports
%\item Messages
%\item Memory Objects
%\item Processors

\subsection{Device drivers}

There are several possible approaches to the design of the
\arm{irq} handler.  The one used in Acorn's RISC OS \cite{prm}
is simply to call the device driver directly, with the
processor remaining in \arm{irq} mode and further interrupts
disabled.  If the device driver expects to take an
unacceptably long time to execute, it may reenable interrupts,
but must then be able to cope with being called reentrantly.
A better solution is to use a queue of pending interrupts.
In this system, interrupts are left disabled for only very
short periods of time during the kernel interrupt handler,
and the device drivers are called with interrupts enabled,
in \arm{svc} mode instead of \arm{irq} mode.  The
kernel \arm{irq} handler can then protect a device driver
against being reentered: if a new \arm{irq} arrives while the
driver is still running, the handler lets the existing instance
complete and calls the driver again once it has finished dealing
with the old \arm{irq}.
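
A minimal sketch of the queued scheme in C follows.  It is a
simulation under assumptions of my own, not code from Mach or RISC OS:
the names \verb|kernel_irq_entry|, \verb|irq_active| and
\verb|irq_pending| are hypothetical, the number of sources is assumed
to be 16, and the two functions that mask and unmask interrupts are
stand-ins for what would really be a few assembler instructions.  A
demonstration driver is included which simulates the same interrupt
arriving again while it is still running.

\begin{verbatim}
#include <stdint.h>

#define NUM_IRQS 16   /* assumed number of sources */

typedef void (*irq_driver_t)(int irq);

static irq_driver_t drivers[NUM_IRQS];
static volatile uint32_t irq_active;   /* drivers now running   */
static volatile uint32_t irq_pending;  /* arrived while running */

/* Stand-ins for masking and unmasking IRQs in the PSR. */
static int irqs_enabled = 1;
static void irq_disable(void) { irqs_enabled = 0; }
static void irq_enable(void)  { irqs_enabled = 1; }

/* Kernel handler: entered with IRQs disabled. */
void kernel_irq_entry(int irq)
{
    uint32_t bit = 1u << irq;

    if (irq_active & bit) {    /* driver busy: queue and return */
        irq_pending |= bit;
        return;
    }
    irq_active |= bit;
    do {
        irq_pending &= ~bit;
        irq_enable();          /* driver runs with IRQs enabled */
        if (drivers[irq])
            drivers[irq](irq);
        irq_disable();         /* brief check of the queue      */
    } while (irq_pending & bit);
    irq_active &= ~bit;
}

/* Demonstration driver: on its first call the same IRQ is
   simulated as arriving again before the driver has returned. */
static int demo_calls, demo_depth, demo_max_depth;

static void demo_driver(int irq)
{
    demo_calls++;
    demo_depth++;
    if (demo_depth > demo_max_depth)
        demo_max_depth = demo_depth;
    if (demo_calls == 1) {
        irq_disable();          /* hardware masks IRQs on entry */
        kernel_irq_entry(irq);  /* queued, driver not reentered */
        irq_enable();           /* return from nested exception */
    }
    demo_depth--;
}
\end{verbatim}

With \verb|demo_driver| installed for one source, a single call to
\verb|kernel_irq_entry| runs the driver twice, one call after the
other, and the driver is never entered while an earlier call is still
in progress.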

A consequence of the device driver being called with \arm{irq}s
enabled is that a priority system is needed to ensure that
time-critical interrupts are not missed.  In order not to lose
the benefits of \arm{fiq} mode (i.e.\ having a lot of registers
which do not need saving), and because devices that are connected
to the \arm{fiq} lines often require very fast interrupt handling,
I think a hybrid system is required which allows \arm{fiq}
routines to execute as in the RISC OS style system, but
handles \arm{irq}s in this new fashion.

When an interrupt occurs, it is evidently necessary to save
the state of the thread that was interrupted.  It is
probably sensible to stop the currently running thread and put
it back in the pool of runnable threads rather than returning
directly to it, since a higher priority thread that has been
blocked waiting for I/O may now be able to continue.

For a Unix-like OS such as Mach, the sensible solution
seems to be to have two halves to each device driver.  One half
is called directly when an interrupt is triggered and
performs the time-critical work, such as copying data
from a register on the device into a buffer in ordinary RAM.
The other half runs later and deals with the work that is not
time critical.
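
The two-halves idea can be sketched in C with a small ring buffer.
The interrupt-time half is as described above.  What the other half
does is my own assumption (it drains the buffer later, outside
interrupt context), and the names \verb|rx_interrupt_half| and
\verb|rx_deferred_half| and the buffer size are hypothetical.

\begin{verbatim}
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 64   /* must be a power of two */

static volatile uint8_t  rx_buf[RX_BUF_SIZE];
static volatile unsigned rx_head;  /* written by interrupt half */
static volatile unsigned rx_tail;  /* written by deferred half  */

/* Interrupt half: copy one byte out of the device register.
   Returns 0 if the buffer was full and the byte was dropped. */
int rx_interrupt_half(volatile const uint8_t *device_reg)
{
    uint8_t  byte = *device_reg;   /* always read the device */
    unsigned next = (rx_head + 1) & (RX_BUF_SIZE - 1);

    if (next == rx_tail)
        return 0;
    rx_buf[rx_head] = byte;
    rx_head = next;
    return 1;
}

/* Deferred half: move buffered bytes to the caller's buffer
   and return how many were copied. */
size_t rx_deferred_half(uint8_t *out, size_t max)
{
    size_t n = 0;

    while (n < max && rx_tail != rx_head) {
        out[n++] = rx_buf[rx_tail];
        rx_tail = (rx_tail + 1) & (RX_BUF_SIZE - 1);
    }
    return n;
}
\end{verbatim}

Because each index is written by only one of the two halves, the
interrupt half never has to wait for the deferred half, which keeps
the time spent with interrupts disabled short.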

\subsection{Clocks}

Two of the \arm{irq} sources are timers.  Each is loaded
with an initial value and then counts down to zero at a rate
of 500ns per tick.  When a timer reaches zero, an interrupt occurs.
Clocks are controlled by the kernel in Mach, as it must be able
to preemptively multitask threads.  One design decision which
has to be made is how fast to run the clocks, that is, how often
to cause interrupts to occur.  Acorn's RISC OS uses one timer to
provide a centisecond tick and leaves the other unallocated for
special uses.
Mach's clock interface allows for sophisticated control of
these clocks.  It allows for setting the resolution of the
clock in nanoseconds and for reading clocks at nanosecond
resolution.  Alarms may also be set to wake up a thread at a
given (absolute or relative) time.
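
The relationship between a requested resolution in nanoseconds and
the value loaded into a timer is simple arithmetic, sketched here in
C.  The 500ns tick comes from the text above; the 16-bit limit on the
count and the function names are assumptions of mine, and whether the
hardware wants this count or one less than it is not covered.

\begin{verbatim}
#include <stdint.h>

#define TIMER_TICK_NS   500u     /* one count every 500ns  */
#define TIMER_MAX_COUNT 0xFFFFu  /* assumes a 16-bit timer */

/* Number of counts for an interrupt period given in ns.
   Returns 0 if the period cannot be represented. */
uint32_t timer_counts_for_period(uint32_t period_ns)
{
    uint32_t counts = period_ns / TIMER_TICK_NS;

    if (counts == 0 || counts > TIMER_MAX_COUNT)
        return 0;
    return counts;
}

/* Period in ns actually obtained from a given count. */
uint32_t timer_period_ns(uint32_t counts)
{
    return counts * TIMER_TICK_NS;
}
\end{verbatim}

For a centisecond tick the period is 10,000,000ns, which gives
$10\,000\,000 / 500 = 20\,000$ counts.  Under the 16-bit assumption
the longest period available from one timer is
$65\,535 \times 500\mathrm{ns}$, a little under 33ms, so any longer
interval has to be built up in software from several interrupts.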


\section{Structure of Mach}

Internally, Mach is notionally organised into machine
dependent and machine independent parts.  However, there
is no documentation about the purpose or function of each
file.  Additionally, the platform-dependent files within
the i386 directory are split by author, not by purpose,
which makes it extremely difficult to locate the file that
is required.

There are approximately 210 object files in a typical build
of gnumach\footnote{This varies depending on which device
drivers are selected, and increases if the kernel debugger
is enabled.}.  Of these, 53 come directly from the i386
directory.  Unfortunately, this is not the full story, since
the nominally machine-independent files also include many header
files which contain machine-specific data, and in some cases even
inline assembler.  Assembly code is justifiable (particularly
in kernels), but frequently there is no comment against it to
indicate what it does, and it is unreasonable to expect porters
to understand i386 assembler.

The internal layout is confused.  The original build environment
used files in a `dummy' directory to control which features were
added to the kernel, and vestiges of this system still remain.
Some work has been done to convert the kernel to a GNU-style
build environment where options are specified to a configuration
script.

One of the major reorganisations that occurred between Mach 3 and
Mach 4 was that the build environment changed from the
machine-independent parts pulling in machine-specific components
to a system where the machine-specific components treat the
machine-independent parts as a library of functions that are used
in the cases where there is not a machine-specific function for
the job.  In theory this makes it easier to produce
machine-specific components and makes the kernel easier to
understand.
Unfortunately, it seems that this reorganisation was not completed
and there are still many components which work in the old way.

Porting MIG (the Mach Interface Generator) to the ARM is not a
hard job: it merely needs to be told the sizes of certain
types, for example the size of a machine word and the size of
a byte.
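
On the ARM a byte is 8 bits and a machine word is 32 bits.  As a
rough illustration of the kind of information involved, the C
fragment below records those sizes and uses them to round a length up
to a whole number of words.  The macro and function names are
hypothetical and do not correspond to MIG's real configuration files,
and the rounding is only one plausible use of the word size.

\begin{verbatim}
#include <limits.h>
#include <stdint.h>

/* Hypothetical names; MIG's real configuration may differ. */
#define ARM_BYTE_BITS  8
#define ARM_WORD_BITS  32
#define ARM_WORD_BYTES (ARM_WORD_BITS / ARM_BYTE_BITS)

typedef uint32_t arm_word_t;   /* one ARM machine word */

/* Round a length in bytes up to a whole number of words. */
unsigned arm_round_to_word(unsigned nbytes)
{
    unsigned mask = ARM_WORD_BYTES - 1;

    return (nbytes + mask) & ~mask;
}
\end{verbatim}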

Mach pulls in some functions from libc rather than providing its
own.  These functions are \func{htonl}, \func{ntohl}, \func{htons},
\func{ntohs}, \func{memcpy}, \func{memset}, \func{bcopy},
\func{bzero} and \func{strstr}.  The easiest way to provide these
functions is to build glibc for the appropriate target.  Since the
build environment is not set up to build for a different processor
(referred to as cross-compilation), some manual tweaking of the
Makefile is required.
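
An alternative to building glibc, which I did not pursue, would be to
supply small portable versions of these functions within the kernel.
The sketch below shows what six of them might look like in C.  This
is a different technique from the one described above, the \verb|k_|
prefix is a hypothetical naming choice to avoid clashing with libc,
and \func{bcopy}, \func{bzero} and \func{strstr} are omitted.

\begin{verbatim}
#include <stddef.h>
#include <stdint.h>

void *k_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

void *k_memset(void *dst, int c, size_t n)
{
    unsigned char *d = dst;

    while (n--)
        *d++ = (unsigned char)c;
    return dst;
}

/* Host to network order: lay the bytes out most significant
   first.  Works whatever the byte order of the host is. */
uint32_t k_htonl(uint32_t host)
{
    uint8_t b[4] = {
        (uint8_t)(host >> 24), (uint8_t)(host >> 16),
        (uint8_t)(host >> 8),  (uint8_t)host
    };
    uint32_t net;

    k_memcpy(&net, b, sizeof net);
    return net;
}

/* Network to host order: read the bytes back in that order. */
uint32_t k_ntohl(uint32_t net)
{
    uint8_t b[4];

    k_memcpy(b, &net, sizeof net);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

uint16_t k_htons(uint16_t host)
{
    uint8_t b[2] = { (uint8_t)(host >> 8), (uint8_t)host };
    uint16_t net;

    k_memcpy(&net, b, sizeof net);
    return net;
}

uint16_t k_ntohs(uint16_t net)
{
    uint8_t b[2];

    k_memcpy(b, &net, sizeof net);
    return (uint16_t)((b[0] << 8) | b[1]);
}
\end{verbatim}

Because the conversions are written in terms of byte positions rather
than a fixed swap, the same source would serve for either byte order,
at the cost of being slower than a version tuned for the target.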


\section{Summary of Evaluation}

I was unsuccessful in my attempt to port Mach to the ARM.  This
was due to a number of factors:

\begin{itemize}
\item The internal structure of Mach is not sufficiently well
documented.
\item Mach has a lot of code in it left over from previous
incarnations.  It needs to be tidied up so it is easier to
understand.
\item I did not have access to the University computer
systems from my room, as I had been led to believe that I
would.
\item The build structure of Mach is extremely complicated.
\end{itemize}

I therefore decided to look for alternative methods of running
the HURD on the ARM.