
%------------------------------------------------------------------------------
%
%  IgH EtherCAT Master Documentation
%
%  $Id$
%
%  vi: spell spelllang=en tw=78
% 
%------------------------------------------------------------------------------

\documentclass[a4paper,12pt,BCOR6mm,bibtotoc,idxtotoc]{scrbook}

\usepackage[latin1]{inputenc}
\usepackage[automark,headsepline]{scrpage2}
\usepackage{graphicx}
\usepackage{makeidx}
\usepackage[refpage]{nomencl}
\usepackage{listings}
\usepackage{svn}
\usepackage{textcomp}
\usepackage{url}
\usepackage{SIunits}
\usepackage[pdfpagelabels,plainpages=false]{hyperref}

\setlength{\parskip}{0.8ex plus 0.8ex minus 0.5ex}
\setlength{\parindent}{0mm}

\setcounter{secnumdepth}{\subsubsectionlevel}

\DeclareFontShape{OT1}{cmtt}{bx}{n}
{
  <5><6><7><8><9><10><10.95><12><14.4><17.28><20.74><24.88>cmttb10
}{}

\lstset{basicstyle=\ttfamily\small,numberstyle=\tiny,aboveskip=4mm,
  belowskip=2mm,escapechar=`}
\renewcommand\lstlistlistingname{List of Listings}

% Workaround for lstlistoflistings bug
\makeatletter% --> De-TeX-FAQ
\renewcommand*{\lstlistoflistings}{%
  \begingroup
    \if@twocolumn
      \@restonecoltrue\onecolumn
    \else
      \@restonecolfalse
    \fi
    \lol@heading
    \setlength{\parskip}{\z@}%
    \setlength{\parindent}{\z@}%
    \setlength{\parfillskip}{\z@ \@plus 1fil}%
    \@starttoc{lol}%
    \if@restonecol\twocolumn\fi
  \endgroup
}
\makeatother% --> \makeatletter

\renewcommand\nomname{Glossary}

\newcommand{\IgH}{\raisebox{-0.7667ex}
  {\includegraphics[height=2.2ex]{images/ighsign}}}

\SVN $Date$
\SVN $Revision$

\newcommand{\masterversion}{1.4.0}
\newcommand{\linenum}[1]{\normalfont\textcircled{\tiny #1}}

\makeindex
\makenomenclature

%------------------------------------------------------------------------------

\begin{document}

\pagenumbering{roman}
\pagestyle{empty}

\begin{titlepage}
  \begin{center}
    \rule{\textwidth}{1.5mm}

    {\Huge\bf IgH \includegraphics[height=2.4ex]{images/ethercat}
      Master \masterversion\\[1ex]
      Documentation}

    \vspace{1ex}
    \rule{\textwidth}{1.5mm}

    \vspace{\fill}
    {\Large Florian Pose, \url{fp@igh-essen.com}\\[1ex]
      Ingenieurgemeinschaft \IgH}

    \vspace{\fill}
    {\Large Essen, \SVNDate\\[1ex]
      Revision \SVNRevision}
  \end{center}
\end{titlepage}

%------------------------------------------------------------------------------

\tableofcontents
\listoftables
\listoffigures
\lstlistoflistings

%------------------------------------------------------------------------------

\newpage
\pagestyle{scrheadings}

\section*{Conventions}
\addcontentsline{toc}{section}{Conventions}
\markleft{Conventions}

The following typographic conventions are used:

\begin{itemize}

\item \textit{Italic face} is used for newly introduced terms and file names.

\item \texttt{Typewriter face} is used for code examples and command line
output.

\item \texttt{\textbf{Bold typewriter face}} is used for user input in command
lines.

\end{itemize}

Data values and addresses are usually specified as hexadecimal values. These
are marked in the \textit{C} programming language style with the prefix
\lstinline+0x+ (example: \lstinline+0x88A4+). Unless otherwise noted, address
values are specified as byte addresses.

Function names are always printed with parentheses, but without
parameters. So, if a function is written with empty parentheses, as in
\lstinline+ecrt_request_master()+, this does not imply that it has no
parameters.

If shell commands have to be entered, this is marked by a dollar prompt:

\begin{lstlisting}
$
\end{lstlisting}

Further, if a shell command has to be entered as the superuser, the
prompt is a hash sign:

\begin{lstlisting}
#
\end{lstlisting}

%------------------------------------------------------------------------------

\chapter{The IgH EtherCAT Master}
\label{chapter:master}
\pagenumbering{arabic}

This chapter covers some general information about the EtherCAT master.

%------------------------------------------------------------------------------

\section{Feature Summary}
\label{sec:summary}
\index{Master!Features}

The list below gives a short summary of the master features.

\begin{itemize}

\item Designed as a kernel module for Linux 2.6.

\item Implemented according to IEC 61158-12 \cite{dlspec} \cite{alspec}.

\item Comes with EtherCAT-capable drivers for several common Ethernet devices.

  \begin{itemize}

  \item The Ethernet hardware is operated without interrupts.

  \item Drivers for additional Ethernet hardware can easily be implemented
  using the common device interface (see section~\ref{sec:ecdev}) provided by
  the master module.

  \end{itemize}

\item The master module supports multiple EtherCAT masters running in
parallel.

\item The master code supports any Linux realtime extension through its
independent architecture.

  \begin{itemize}

  \item RTAI\nomenclature{RTAI}{Realtime Application Interface},
  ADEOS\nomenclature{ADEOS}{Adaptive Domain Environment for Operating
  Systems}, etc.

  \item It runs well even without realtime extensions.

  \end{itemize}

\item Common ``realtime interface'' for applications that want to use
EtherCAT functionality (see section~\ref{sec:ecrt}).

\item \textit{Domains} are introduced to allow grouping of process
  data transfers with different slave groups and task periods.

  \begin{itemize}

  \item Handling of multiple domains with different task periods.

  \item Automatic calculation of process data mapping, FMMU and sync manager
  configuration within each domain.

  \end{itemize}

\item Communication through several finite state machines.

  \begin{itemize}

  \item Automatic bus scanning after topology changes.

  \item Bus monitoring during operation.

  \item Automatic reconfiguration of slaves (for example after power failure)
  during operation.

  \end{itemize}

\item CANopen-over-EtherCAT (CoE)

  \begin{itemize}

  \item Sdo upload, download and information service.

  \item Slave configuration via Sdos.

  \item Sdo access from user-space and from the application.

  \end{itemize}

\item Ethernet-over-EtherCAT (EoE)

  \begin{itemize}

  \item Transparent use of EoE slaves via virtual network interfaces.

  \item Natively supports either a switched or a routed EoE network
  architecture.

  \end{itemize}

\item User space command-line tool ``ethercat'' (see
section~\ref{sec:ethercat})

  \begin{itemize}

  \item Showing the current bus with slaves, Pdos and Sdos.
  \item Showing the bus configuration.
  \item Showing domains and process data.
  \item Setting the master's debug level.
  \item Writing alias addresses.
  \item Sdo uploading/downloading.
  \item Reading/writing a slave's SII.
  \item Setting slave states.
  \item Generating slave description XML.

  \end{itemize}

\item Seamless system integration through LSB\nomenclature{LSB}{Linux
    Standard Base} compliance.

  \begin{itemize}

  \item Master and network device configuration via sysconfig files.

  \item Init script for master control.

  \end{itemize}

\item Virtual read-only network interface for monitoring and debugging
  purposes.

\end{itemize}

%------------------------------------------------------------------------------

\section{License}
\label{sec:license}

The master code is released under the terms and conditions of the GNU
General Public License\index{GPL} \cite{gpl} (version 2). Other
developers who want to use EtherCAT with Linux systems are invited
to use the master code or even to participate in its development.

%------------------------------------------------------------------------------

\chapter{Architecture}
\label{sec:arch}
\index{Master!Architecture}

The EtherCAT master is integrated into the Linux 2.6 kernel. This was
an early design decision, made for several reasons:

\begin{itemize}

\item Kernel code has significantly better realtime characteristics, i.~e.
less latency, than user space code. It was foreseeable that a fieldbus master
has a lot of cyclic work to do. Cyclic work is usually triggered by timer
interrupts inside the kernel. The execution delay of a function that processes
timer interrupts is lower when it resides in kernel space, because there is no
need for time-consuming context switches to a user space process.

\item It was also foreseeable that the master code has to communicate
directly with the Ethernet hardware. This has to be done in the kernel
anyway (through network device drivers), which is one more reason for the
master code being in kernel space.

\end{itemize}

Figure~\ref{fig:arch} gives a general overview of the master architecture.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.9\textwidth]{images/architecture}
  \caption{Master architecture}
  \label{fig:arch}
\end{figure}

\paragraph{Master Module}
\index{Master module}

Kernel module containing one or more EtherCAT master instances (see
section~\ref{sec:mastermod}), the ``Device Interface'' (see
section~\ref{sec:ecdev}) and the ``Realtime Interface'' (see
section~\ref{sec:ecrt}).

\paragraph{Device Modules}
\index{Device modules}

EtherCAT-capable Ethernet device driver modules that
offer their devices to the EtherCAT master via the device interface (see
section~\ref{sec:ecdev}). These modified network drivers can handle network
devices used for EtherCAT operation and ``normal'' Ethernet devices in
parallel. A master can accept a certain device and is then able to send and
receive EtherCAT frames. Ethernet devices declined by the master module are
connected to the kernel's network stack as usual.

\paragraph{Application Modules}
\index{Application module}

Kernel modules that use the EtherCAT master (usually for cyclic exchange of
process data with EtherCAT slaves). These modules are not part of the EtherCAT
master code\footnote{Although there are some examples provided in the
\textit{examples} directory, see chapter~\ref{chapter:examples}.}, but have to
be generated or written by the user. An application module can ``request'' a
master through the realtime interface (see section~\ref{sec:ecrt}). If this
succeeds, the module has control over the master: It can provide a bus
configuration and exchange process data.

%------------------------------------------------------------------------------

\section{Phases}
\index{Master phases}

The EtherCAT master runs through several phases (see fig.~\ref{fig:phases}):

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.9\textwidth]{images/phases}
  \caption{Master phases and transitions}
  \label{fig:phases}
\end{figure}
\begin{description}

\item[Orphaned phase]\index{Orphaned phase} This phase takes effect while the
master is still waiting for its Ethernet device to connect. No bus
communication is possible until then.

\item[Idle phase]\index{Idle phase} This phase takes effect when the master
has accepted an Ethernet device, but has not yet been requested by any
application. The master runs its state machine (see
section~\ref{sec:fsm-master}), which automatically scans the bus for slaves
and executes pending operations from the user space interface (for example
Sdo access). The command-line tool can be used to access the bus, but there is
no process data exchange because of the missing bus configuration.

\item[Operation phase]\index{Operation phase} The master is requested by an
application that can provide a bus configuration and exchange process data.

\end{description}
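The phase logic described above can be summarized in a small user space
model. This is only a sketch and not the master's actual implementation: The
type \lstinline+master_model_t+ and the functions \lstinline+master_phase()+,
\lstinline+master_request()+ and \lstinline+master_release()+ are hypothetical
names, and it is an assumption of the sketch that a request succeeds in the
idle phase only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the three master phases. */
typedef enum {
    PHASE_ORPHANED,  /* still waiting for an Ethernet device */
    PHASE_IDLE,      /* device accepted, no application yet */
    PHASE_OPERATION  /* requested by an application */
} master_phase_t;

typedef struct {
    bool device_attached; /* an Ethernet device has connected */
    bool requested;       /* an application holds the master */
} master_model_t;

/* Derives the current phase from the two conditions named in the text. */
master_phase_t master_phase(const master_model_t *m)
{
    assert(m != NULL);
    if (!m->device_attached)
        return PHASE_ORPHANED;
    return m->requested ? PHASE_OPERATION : PHASE_IDLE;
}

/* Assumption: a request succeeds in the idle phase only.
 * Returns 0 on success and -1 otherwise. */
int master_request(master_model_t *m)
{
    if (master_phase(m) != PHASE_IDLE)
        return -1;
    m->requested = true;
    return 0;
}

/* Releasing the master leads back to the idle phase. */
void master_release(master_model_t *m)
{
    assert(m != NULL);
    m->requested = false;
}
```

In this model, a request issued in the orphaned phase is rejected, because no
bus communication is possible before a device has connected.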

%------------------------------------------------------------------------------

\section{General Behavior} % FIXME
\index{Master behavior}

\ldots

%   Behavior (Scanning) TODO

%------------------------------------------------------------------------------

\section{Master Module}
\label{sec:mastermod}
\index{Master module}

The EtherCAT master kernel module \textit{ec\_master} can contain multiple
master instances. Each master waits for a certain Ethernet device identified
by its MAC address\index{MAC address}. These addresses have to be specified on
module loading via the \textit{main\_devices} module parameter. The number of
master instances to initialize is taken from the number of MAC addresses
given.

The command below loads the master module with a single master instance that
waits for the Ethernet device with the MAC address
\lstinline+00:0E:0C:DA:A2:20+. The master will be accessible via index $0$.

\begin{lstlisting}
# `\textbf{modprobe ec\_master main\_devices=00:0E:0C:DA:A2:20}`
\end{lstlisting}

MAC addresses for multiple masters have to be separated by commas:

\begin{lstlisting}
# `\textbf{modprobe ec\_master main\_devices=00:0E:0C:DA:A2:20,00:e0:81:71:d5:1c}`
\end{lstlisting}

The two masters can be addressed by their indices 0 and 1, respectively (see
figure~\ref{fig:masters}). The master index is needed for the
\lstinline+ecrt_request_master()+ function of the realtime interface (see
section~\ref{sec:ecrt}) and the \lstinline+--master+ option of the
\textit{ethercat} command-line tool (see section~\ref{sec:ethercat}), which
defaults to $0$.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.5\textwidth]{images/masters}
  \caption{Multiple masters in one module}
  \label{fig:masters}
\end{figure}
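How a comma-separated address list translates into master indices can be
illustrated with a user space sketch. It does not show the module's real
parameter handling: The functions \lstinline+parse_mac()+ and
\lstinline+parse_main_devices()+ are hypothetical, and the only rule taken
from the text is that the position of an address in the list is the index of
the corresponding master.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAC_LEN 6

/* Parses one address of the form xx:xx:xx:xx:xx:xx.
 * Returns 0 on success and -1 on a malformed address. */
int parse_mac(const char *s, unsigned char mac[MAC_LEN])
{
    unsigned int b[MAC_LEN];
    char trailing;
    int i;

    /* The final %c only matches if characters follow the sixth octet. */
    if (sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x%c",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &trailing) != 6)
        return -1;
    for (i = 0; i < MAC_LEN; i++)
        mac[i] = (unsigned char) b[i];
    return 0;
}

/* Splits a comma-separated list; the position in the list is the
 * master index. Returns the number of masters, or -1 on error. */
int parse_main_devices(const char *param,
                       unsigned char macs[][MAC_LEN], int max)
{
    char buf[256];
    char *tok;
    int count = 0;

    assert(param != NULL);
    if (strlen(param) >= sizeof(buf))
        return -1;
    strcpy(buf, param); /* strtok() modifies its argument */

    for (tok = strtok(buf, ","); tok; tok = strtok(NULL, ",")) {
        if (count >= max || parse_mac(tok, macs[count]) != 0)
            return -1;
        count++;
    }
    return count;
}
```

Applied to the address list of the second command above, the sketch yields
two masters, with the first address belonging to index 0 and the second one to
index 1.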

\paragraph{Init script}
\index{Init script}

Most probably you won't want to load the master module and the Ethernet driver
modules manually, but start the master as a service. See
section~\ref{sec:system} on how to do this.

\paragraph{Syslog}

The master module outputs information about its state and events to the
kernel ring buffer. These also end up in the system logs. Loading the module
with two masters, as in the second command above, should result in the
messages below:

\begin{lstlisting}
# `\textbf{dmesg | tail -2}`
EtherCAT: Master driver `\masterversion`
EtherCAT: 2 masters waiting for devices.

# `\textbf{tail -2 /var/log/messages}`
Jul  4 10:22:45 ethercat kernel: EtherCAT: Master driver `\masterversion`
Jul  4 10:22:45 ethercat kernel: EtherCAT: 2 masters waiting
                                 for devices.
\end{lstlisting}

All EtherCAT master output is prefixed with \lstinline+EtherCAT+, which makes
searching the logs easier.

%------------------------------------------------------------------------------

\section{Handling of Process Data} % FIXME
\label{sec:processdata}

\ldots

\paragraph{Process Data Image}
\index{Process data}

The slaves offer their inputs and outputs by presenting so-called
``Process Data Objects'' (Pdos\index{Pdo}) to the master. The available Pdos
can be determined by reading out the slave's TXPDO and RXPDO E$^2$PROM
categories. The application can register the Pdos for data exchange during
cyclic operation. The sum of all registered Pdos defines the ``process data
image'', which is exchanged via the ``Logical ReadWrite'' datagrams introduced
in~\cite[section~5.4.2.4]{dlspec}.

\paragraph{Process Data Domains}
\index{Domain}

The process data image can be easily managed by creating so-called
``domains'', which group Pdos and allocate the datagrams needed to
exchange them. Domains are mandatory for process data exchange, so
there has to be at least one. They were introduced for the following
reasons:

\begin{itemize}
\item The maximum size of a ``Logical ReadWrite'' datagram is limited
  due to the limited size of an Ethernet frame: The maximum data size
  is the Ethernet data field size minus the EtherCAT frame header,
  EtherCAT datagram header and EtherCAT datagram footer: $1500 - 2 -
  10 - 2 = 1486$ octets. If the size of the process data image exceeds
  this limit, multiple frames have to be sent, and the image has to be
  partitioned for the use of multiple datagrams. A domain manages this
  automatically.
\item Not every Pdo has to be exchanged with the same frequency: The
  values of some Pdos vary slowly over time (for example temperature
  values), so exchanging them with a high frequency would just waste
  bus bandwidth. For this reason, multiple domains can be created to
  group different Pdos and so allow separate exchange.
\end{itemize}
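The partitioning mentioned in the first item can be sketched as a simple
calculation. The functions \lstinline+datagram_count()+ and
\lstinline+datagram_payload()+ are hypothetical and take the maximum data size
derived above as the parameter \lstinline+max_payload+. The sketch cuts the
image into equal chunks; whether the master additionally respects other
boundaries when splitting is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Number of datagrams needed for a process data image of image_size
 * octets, if one datagram carries at most max_payload octets. */
size_t datagram_count(size_t image_size, size_t max_payload)
{
    assert(max_payload > 0);
    return (image_size + max_payload - 1) / max_payload; /* round up */
}

/* Data size of the datagram with the given zero-based index.
 * Returns 0 if the index lies behind the end of the image. */
size_t datagram_payload(size_t image_size, size_t max_payload,
                        size_t index)
{
    size_t offset, rest;

    assert(max_payload > 0);
    offset = index * max_payload;
    if (offset >= image_size)
        return 0;
    rest = image_size - offset;
    return rest < max_payload ? rest : max_payload;
}
```

All datagrams except the last one are filled completely; the last one carries
the remainder of the image.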

There is no upper limit for the number of domains, but each domain
occupies one FMMU in each slave involved, so the maximum number of
domains is also limited by the slaves' capabilities.

\paragraph{FMMU Configuration}
\index{FMMU!Configuration}

An application can register Pdos for process data exchange. Every
Pdo is part of a memory area in the slave's physical memory that is
protected by a sync manager \cite[section~6.7]{dlspec} for
synchronized access. In order to make a sync manager react to a
datagram accessing its memory, it is necessary to access the last byte
covered by the sync manager. Otherwise the sync manager will not react
to the datagram and no data will be exchanged. That is why the whole
synchronized memory area has to be included in the process data
image: For example, if a certain Pdo of a slave is registered for
exchange with a certain domain, one FMMU will be configured to map the
complete sync-manager-protected memory the Pdo resides in. If a
second Pdo of the same slave is registered for process data exchange
within the same domain, and this Pdo resides in the same
sync-manager-protected memory as the first Pdo, the FMMU configuration
is not touched, because the appropriate memory is already part of the
domain's process data image. If the second Pdo belongs to another
sync-manager-protected area, this complete area is also included in
the domain's process data image. See figure~\ref{fig:fmmus} for an
overview of how FMMUs are configured to map physical memory to logical
process data images.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=\textwidth]{images/fmmus}
  \caption{FMMU configuration for several domains}
  \label{fig:fmmus}
\end{figure}
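The FMMU rule described in this paragraph can be expressed as a short user
space model. It is a sketch under simplifying assumptions, not the master's
data structures: \lstinline+slave_model_t+, \lstinline+fmmu_model_t+,
\lstinline+map_sync_manager()+ and the limit \lstinline+MAX_FMMUS+ are
hypothetical, and domains and sync managers are identified by plain integers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FMMUS 4               /* hypothetical FMMU count per slave */
#define NO_FMMU ((size_t) -1)     /* returned if no FMMU is left */

typedef struct {
    int domain;            /* domain the FMMU maps into */
    int sync_manager;      /* sync manager whose area is mapped */
    size_t logical_offset; /* position inside the domain image */
} fmmu_model_t;

typedef struct {
    fmmu_model_t fmmus[MAX_FMMUS];
    size_t fmmu_count;
} slave_model_t;

/* Makes sure that the complete sync manager area of a slave is part
 * of the domain image. *image_size is the current size of the image
 * and grows by sm_size only if a new FMMU has to be configured.
 * Returns the logical offset of the area, or NO_FMMU. */
size_t map_sync_manager(slave_model_t *slave, int domain,
                        int sync_manager, size_t sm_size,
                        size_t *image_size)
{
    fmmu_model_t *fmmu;
    size_t i;

    assert(image_size != NULL);
    for (i = 0; i < slave->fmmu_count; i++) {
        fmmu = &slave->fmmus[i];
        if (fmmu->domain == domain
                && fmmu->sync_manager == sync_manager)
            return fmmu->logical_offset; /* already mapped */
    }
    if (slave->fmmu_count >= MAX_FMMUS)
        return NO_FMMU;

    fmmu = &slave->fmmus[slave->fmmu_count++];
    fmmu->domain = domain;
    fmmu->sync_manager = sync_manager;
    fmmu->logical_offset = *image_size;
    *image_size += sm_size; /* the whole area joins the image */
    return fmmu->logical_offset;
}
```

Registering a second Pdo from the same sync manager area in the same domain
leaves both the FMMU count and the image size unchanged, while an area of
another sync manager, or the same area in another domain, occupies one more
FMMU of the slave.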

\paragraph{Process Data Pointers} % FIXME

The figure also demonstrates how the application can access the exchanged
process data: At Pdo registration, the application has to provide the address
of a process data pointer. Upon calculation of the domain image and allocation
of process data memory, this pointer is redirected to the appropriate location
inside the domain's process data memory and can later be easily dereferenced by
the module code.
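The pointer redirection can be illustrated as follows. The sketch assumes
that the offset of each Pdo within the domain image is already known at
registration time, which simplifies the real calculation; the names
\lstinline+domain_model_t+, \lstinline+domain_register()+ and
\lstinline+domain_attach_memory()+ are hypothetical and not part of the
realtime interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGS 16

typedef struct {
    uint8_t **data_ptr; /* address of the application's pointer */
    size_t offset;      /* offset of the Pdo in the domain image */
} pdo_reg_t;

typedef struct {
    pdo_reg_t regs[MAX_REGS];
    size_t reg_count;
} domain_model_t;

/* Remembers where the application's process data pointer lives.
 * Returns 0 on success and -1 if the table is full. */
int domain_register(domain_model_t *d, size_t offset,
                    uint8_t **data_ptr)
{
    assert(data_ptr != NULL);
    if (d->reg_count >= MAX_REGS)
        return -1;
    d->regs[d->reg_count].data_ptr = data_ptr;
    d->regs[d->reg_count].offset = offset;
    d->reg_count++;
    *data_ptr = NULL; /* not usable before memory is attached */
    return 0;
}

/* Redirects all registered pointers into the process data memory. */
void domain_attach_memory(domain_model_t *d, uint8_t *image)
{
    size_t i;

    for (i = 0; i < d->reg_count; i++)
        *d->regs[i].data_ptr = image + d->regs[i].offset;
}
```

After the memory has been attached, the application simply dereferences its
own pointers to read inputs or write outputs in the domain image.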

%------------------------------------------------------------------------------

\chapter{Application Interface}
\label{sec:ecrt}
\index{Application interface}

%   Interface version
%   Master Requesting and Releasing
%   Master Locking
%   Slave configuration
%   Configuring Pdo assignment and mapping
%   Domains (memory)
%   Pdo entry registration
%   Sdo configuration
%   Sdo access
%   Cyclic operation

The application interface provides functions and data structures for
applications to access and use an EtherCAT master. The complete documentation
of the interface is included as Doxygen~\cite{doxygen} comments in the header
file \textit{include/ecrt.h}. You can either directly view the file comments
or generate an HTML documentation as described in section~\ref{sec:gendoc}.

The following sections cover a general description of the application
interface.

Every application should use the master in two steps:

\begin{description}

\item[Configuration] The master is requested and the configuration is applied.
Domains are created, slaves are configured and Pdo entries are registered (see
section~\ref{sec:masterconfig}).

\item[Operation] Cyclic code is run, process data is exchanged (see
section~\ref{sec:cyclic}).

\end{description}

%------------------------------------------------------------------------------

\section{Master Configuration}
\label{sec:masterconfig}

\ldots

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.8\textwidth]{images/app-config}
  \caption{Master Configuration}
  \label{fig:app-config}
\end{figure}

%------------------------------------------------------------------------------

\section{Cyclic Operation}
\label{sec:cyclic}

\ldots
% FIXME PDOS endianess


%------------------------------------------------------------------------------

\section{Concurrent Master Access} % FIXME
\label{sec:concurr}
\index{Concurrency}

In some cases, one master is used by several instances, for example when an
application does cyclic process data exchange, and there are EoE-capable slaves
that need to exchange Ethernet data with the kernel (see
section~\ref{sec:eoeimp}). For this reason, the master is a shared resource,
and access to it has to be serialized. This is usually done by locking with
semaphores, or other methods to protect critical sections.

The master itself cannot provide locking mechanisms, because it has no way
of knowing the appropriate kind of lock. For example, if the application uses
RTAI functionality, ordinary kernel semaphores would not be sufficient. For
this reason, an important design decision was made: The application that
reserved a master must have total control, therefore it has to take
responsibility for providing the appropriate locking mechanisms. If another
instance wants to access the master, it has to request the master lock via
callbacks that have to be set by the application. Moreover, the application
can deny access to the master if it considers the moment to be inconvenient.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.6\textwidth]{images/master-locks}
  \caption{Concurrent master access}
  \label{fig:locks}
\end{figure}

Figure~\ref{fig:locks} shows by way of example how two processes share one
master: The application's cyclic task uses the master for process data
exchange, while the master-internal EoE process uses it to communicate with
EoE-capable slaves. Both have to acquire the master lock before access: The
application task can access the lock natively, while the EoE process has to
use the callbacks. Section~\ref{sec:concurrency} gives an example of how to
implement this.
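The callback principle can be sketched in user space. The code below is a
model under loud assumptions and does not reproduce the realtime interface:
\lstinline+locked_master_t+, \lstinline+master_set_callbacks()+ and
\lstinline+eoe_run()+ are hypothetical names, and a plain flag stands in for
the lock, which in a real application would be a semaphore or another
primitive suited to its realtime environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*request_cb_t)(void *data);  /* 0 means access granted */
typedef void (*release_cb_t)(void *data);

typedef struct {
    request_cb_t request_cb;
    release_cb_t release_cb;
    void *cb_data;          /* passed back to both callbacks */
    unsigned int eoe_runs;  /* granted accesses of the EoE process */
} locked_master_t;

/* Called by the application that reserved the master. */
void master_set_callbacks(locked_master_t *m, request_cb_t request,
                          release_cb_t release, void *data)
{
    assert(request != NULL && release != NULL);
    m->request_cb = request;
    m->release_cb = release;
    m->cb_data = data;
}

/* One run of the master-internal process. It touches the master only
 * if the application grants access. Returns true if it did. */
bool eoe_run(locked_master_t *m)
{
    if (!m->request_cb || m->request_cb(m->cb_data) != 0)
        return false;          /* no callbacks set or access denied */
    m->eoe_runs++;             /* placeholder for the bus access */
    m->release_cb(m->cb_data);
    return true;
}

/* Application side: a flag stands in for the real lock. */
typedef struct { bool locked; } app_lock_t;

int app_request_lock(void *data)
{
    app_lock_t *lock = data;

    if (lock->locked)
        return -1;             /* cyclic task is busy: deny access */
    lock->locked = true;
    return 0;
}

void app_release_lock(void *data)
{
    app_lock_t *lock = data;

    lock->locked = false;
}
```

The design leaves the choice of the locking primitive entirely to the
application, which is the point made in the text: The master only calls the
two callbacks and never needs to know what kind of lock is behind them.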

%------------------------------------------------------------------------------

\chapter{Ethernet Devices}
\label{sec:devices}

The EtherCAT protocol is based on the Ethernet standard, so a master relies on
standard Ethernet hardware to communicate with the bus.

The term \textit{device} is used as a synonym for Ethernet network interface
hardware. There are device driver modules that handle Ethernet hardware, which
a master can use to connect to an EtherCAT bus.

%------------------------------------------------------------------------------

\section{Network Driver Basics}
\label{sec:networkdrivers}
\index{Network drivers}

EtherCAT relies on Ethernet hardware and the master needs a physical
Ethernet device to communicate with the bus. Therefore it is necessary
to understand how Linux handles network devices and their drivers.

\paragraph{Tasks of a Network Driver}

Network device drivers usually handle the lower two layers of the OSI model,
that is the physical layer and the data-link layer. A network device itself
natively handles the physical layer issues: It represents the hardware to
connect to the medium and to send and receive data in the way the physical
layer protocol describes. The network device driver is responsible for getting
data from the kernel's networking stack and forwarding it to the hardware,
which does the physical transmission. Conversely, if data is received by the
hardware, the driver is notified (usually by means of an interrupt) and
has to read the data from the hardware memory and forward it to the network
stack. There are a few more tasks a network device driver has to handle,
including queue control, statistics and device-dependent features.

\paragraph{Driver Startup}

Usually, a driver searches for compatible devices on module loading.
For PCI drivers, this is done by scanning the PCI bus and checking for
known device IDs. If a device is found, data structures are allocated
and the device is taken into operation.

\paragraph{Interrupt Operation}
\index{Interrupt}

A network device usually provides a hardware interrupt that is used to
notify the driver of received frames and of the success or failure of
transmission. The driver has to register an interrupt service
routine (ISR\index{ISR}\nomenclature{ISR}{Interrupt Service Routine})
that is executed each time the hardware signals such an event. If the
interrupt was raised by the driver's own device (multiple devices can share
one hardware interrupt), the reason for the interrupt has to be determined
by reading the device's interrupt register. For example, if the flag
for received frames is set, frame data has to be copied from hardware
to kernel memory and passed to the network stack.

\paragraph{The \lstinline+net_device+ Structure}
\index{net\_device}

The driver registers a \lstinline+net_device+ structure for each device to
communicate with the network stack and to create a ``network interface''. In
case of an Ethernet driver, this interface appears as \textit{ethX}, where X is
a number assigned by the kernel on registration. The \lstinline+net_device+
structure receives events (either from user space or from the network stack)
via several callbacks, which have to be set before registration. Not every
callback is mandatory, but for reasonable operation the ones below are needed
in any case:

\newsavebox\boxopen
\sbox\boxopen{\lstinline+open()+}
\newsavebox\boxstop
\sbox\boxstop{\lstinline+stop()+}
\newsavebox\boxxmit
\sbox\boxxmit{\lstinline+hard_start_xmit()+}
\newsavebox\boxstats
\sbox\boxstats{\lstinline+get_stats()+}

\begin{description}

\item[\usebox\boxopen] This function is called when network communication has
to be started, for example after a command \lstinline+ip link set ethX up+ from
user space. Frame reception has to be enabled by the driver.

\item[\usebox\boxstop] The purpose of this callback is to ``close'' the device,
i.~e.  make the hardware stop receiving frames.

\item[\usebox\boxxmit] This function is called for each frame that has to be
transmitted. The network stack passes the frame as a pointer to an
\lstinline+sk_buff+ structure (``socket buffer''\index{Socket buffer}, see
below), which has to be freed after sending.

\item[\usebox\boxstats] This call has to return a pointer to the device's
\lstinline+net_device_stats+ structure, which has to be kept up to date with
frame statistics. This means that every time a frame is received or sent, or
an error occurs, the appropriate counter in this structure has to be increased.

\end{description}

The actual registration is done with the \lstinline+register_netdev()+ call,
unregistering is done with \lstinline+unregister_netdev()+.

\paragraph{The \lstinline+netif+ Interface}
\index{netif}

All other communication in the direction interface $\to$ network stack is done
via the \lstinline+netif_*()+ calls. For example, on successful device opening,
the network stack has to be notified that it can now pass frames to the
interface. This is done by calling \lstinline+netif_start_queue()+. After this
call, the \lstinline+hard_start_xmit()+ callback can be called by the network
stack. Furthermore, a network driver usually manages a frame transmission
queue. If this fills up, the network stack has to be told to stop passing
further frames for a while. This happens with a call to
\lstinline+netif_stop_queue()+. If some frames have been sent, and there is
enough space again to queue new frames, the network stack is notified with
\lstinline+netif_wake_queue()+. Another important call is
\lstinline+netif_receive_skb()+\footnote{This function is part of the NAPI
(``New API''), which replaces the kernel 2.4 technique for interfacing to the
network stack (with \lstinline+netif_rx()+). NAPI is a technique to improve
network performance on Linux. Read more in
\url{http://www.cyberus.ca/~hadi/usenix-paper.tgz}.}: It passes a frame that
was just received by the device to the network stack. Frame data has to be
included in a so-called ``socket buffer'' for that (see below).
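The interplay of queue filling, stopping and waking can be modelled in a few
lines of user space code. The sketch does not call any kernel function: The
type \lstinline+tx_model_t+, the functions \lstinline+tx_enqueue()+ and
\lstinline+tx_complete()+ and the ring size are hypothetical, and the
comments only name the \lstinline+netif_*()+ call a driver would issue at
that point.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_RING_SIZE 4 /* hypothetical number of transmit slots */

typedef struct {
    unsigned int used;  /* frames queued, but not yet sent */
    bool queue_stopped; /* the stack must not pass further frames */
} tx_model_t;

/* Counterpart of the transmit callback: queues one frame.
 * Returns 0 on success and -1 if the ring is full. */
int tx_enqueue(tx_model_t *t)
{
    if (t->used >= TX_RING_SIZE)
        return -1;
    t->used++;
    if (t->used == TX_RING_SIZE)
        t->queue_stopped = true;  /* driver: netif_stop_queue() */
    return 0;
}

/* Called when the hardware reports a frame as sent. */
void tx_complete(tx_model_t *t)
{
    assert(t->used > 0);
    t->used--;
    if (t->queue_stopped)
        t->queue_stopped = false; /* driver: netif_wake_queue() */
}
```

The queue is stopped as soon as the last free slot has been taken, so that
the network stack does not pass a frame the driver could not store.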

\paragraph{Socket Buffers}
\index{Socket buffer}

Socket buffers are the basic data type for the whole network stack. They serve
as containers for network data and are able to quickly add data headers and
footers, or strip them off again. For this purpose, a socket buffer consists
of an allocated buffer and several pointers that mark the beginning of the
buffer (\lstinline+head+), the beginning of data (\lstinline+data+), the end of
data (\lstinline+tail+) and the end of the buffer (\lstinline+end+). In
addition, a socket buffer holds network header information and (in case of
received data) a pointer to the \lstinline+net_device+ it was received on.
There exist functions that create a socket buffer
(\lstinline+dev_alloc_skb()+), add data either at the front
(\lstinline+skb_push()+) or at the back (\lstinline+skb_put()+), remove data
from the front (\lstinline+skb_pull()+) or the back (\lstinline+skb_trim()+),
or delete the buffer (\lstinline+kfree_skb()+). A socket buffer is passed from
layer to layer, and is freed by the layer that uses it last. In case of
sending, freeing has to be done by the network driver.
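The four pointers and the operations on them can be illustrated with a
simplified user space model. It is not the kernel's \lstinline+sk_buff+
implementation: The type \lstinline+buf_model_t+ and the
\lstinline+buf_*()+ functions are hypothetical and only mirror the pointer
arithmetic behind the kernel functions named in the comments.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned char *head; /* beginning of the buffer */
    unsigned char *data; /* beginning of data */
    unsigned char *tail; /* end of data */
    unsigned char *end;  /* end of the buffer */
} buf_model_t;

/* Sets up an empty buffer with room for headers in front of the data. */
void buf_init(buf_model_t *b, unsigned char *mem, size_t size,
              size_t headroom)
{
    assert(headroom <= size);
    b->head = mem;
    b->data = b->tail = mem + headroom;
    b->end = mem + size;
}

size_t buf_len(const buf_model_t *b)
{
    return (size_t) (b->tail - b->data);
}

/* Adds len octets at the front (compare skb_push()).
 * Returns the new start of data, or NULL without enough headroom. */
unsigned char *buf_push(buf_model_t *b, size_t len)
{
    if ((size_t) (b->data - b->head) < len)
        return NULL;
    b->data -= len;
    return b->data;
}

/* Adds len octets at the back (compare skb_put()).
 * Returns the start of the added area, or NULL if it does not fit. */
unsigned char *buf_put(buf_model_t *b, size_t len)
{
    unsigned char *old_tail = b->tail;

    if ((size_t) (b->end - b->tail) < len)
        return NULL;
    b->tail += len;
    return old_tail;
}

/* Removes len octets from the front (compare skb_pull()). */
unsigned char *buf_pull(buf_model_t *b, size_t len)
{
    if (buf_len(b) < len)
        return NULL;
    b->data += len;
    return b->data;
}

/* Cuts the data down to len octets (compare skb_trim()). */
void buf_trim(buf_model_t *b, size_t len)
{
    if (len < buf_len(b))
        b->tail = b->data + len;
}
```

None of the operations copies payload data; headers and footers are added or
stripped by moving pointers only, which is what makes socket buffers cheap to
pass from layer to layer.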

%------------------------------------------------------------------------------

\section{EtherCAT Device Drivers}
\label{sec:ethercatdrivers}

There are a few requirements for Ethernet network devices to function as
EtherCAT devices when connected to an EtherCAT bus.

\paragraph{Dedicated Interfaces}

For performance and realtime purposes, the EtherCAT master needs direct and
exclusive access to the Ethernet hardware. This implies that the network device
must not be connected to the kernel's network stack as usual, because the
kernel would try to use it as an ordinary Ethernet device.

\paragraph{Interrupt-less Operation}
\index{Interrupt}

EtherCAT frames travel through the logical EtherCAT ring and are then sent back
to the master. Communication is highly deterministic: A frame is sent and will
be received again after a constant time, so there is no need to notify the
driver about frame reception: The master can instead query the hardware for
received frames, if it expects them to be already received.

Figure~\ref{fig:interrupt} shows two workflows for cyclic frame transmission
and reception with and without interrupts.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.9\textwidth]{images/interrupt}
  \caption{Interrupt Operation versus Interrupt-less Operation}
  \label{fig:interrupt}
\end{figure}

In the left workflow ``Interrupt Operation'', the data from the last cycle is
first processed and a new frame is assembled with new datagrams, which is then
sent.  The cyclic work is done for now.  Later, when the frame is received
again by the hardware, an interrupt is triggered and the ISR is executed. The
ISR will fetch the frame data from the hardware and initiate the frame
dissection: The datagrams will be processed, so that the data is ready for
processing in the next cycle.

In the right workflow ``Interrupt-less Operation'', there is no hardware
interrupt enabled.  Instead, the hardware will be polled by the master by
executing the ISR. If the frame has been received in the meantime, it will be
dissected. The situation is now the same as at the beginning of the left
workflow: The received data is processed and a new frame is assembled and
sent. There is nothing to do for the rest of the cycle.

The interrupt-less operation is desirable, because hardware interrupts are not
conducive to the driver's realtime behaviour: Their non-deterministic
occurrence contributes to increasing the jitter. Besides, if a realtime
extension (like RTAI) is used, some additional effort would have to be made to
prioritize interrupts.
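The right workflow can be sketched as a user space simulation. It is a
minimal model, not driver code: \lstinline+nic_model_t+,
\lstinline+nic_poll()+, \lstinline+nic_send()+,
\lstinline+bus_return_frame()+ and \lstinline+cyclic_task()+ are hypothetical
names, and the hardware is reduced to a flag that tells whether a frame is
waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool rx_pending;       /* a frame is waiting in the hardware */
    unsigned int received; /* frames fetched by the poll function */
    unsigned int sent;     /* frames handed to the hardware */
} nic_model_t;

/* Stands in for the driver's ISR, which is called directly by the
 * master instead of being triggered by a hardware interrupt. */
void nic_poll(nic_model_t *nic)
{
    if (nic->rx_pending) {
        nic->rx_pending = false;
        nic->received++;   /* frame would be dissected here */
    }
}

void nic_send(nic_model_t *nic)
{
    nic->sent++;
}

/* Simulated bus: the frame comes back from the EtherCAT ring. */
void bus_return_frame(nic_model_t *nic)
{
    nic->rx_pending = true;
}

/* One cycle: poll for the frame of the last cycle, then send a new
 * one. Returns true if the expected frame had been received. */
bool cyclic_task(nic_model_t *nic)
{
    unsigned int before;

    assert(nic != NULL);
    before = nic->received;
    nic_poll(nic);
    /* received data would be processed and a new frame assembled */
    nic_send(nic);
    return nic->received != before;
}
```

A return value of \lstinline+false+ in a later cycle would indicate that the
frame of the previous cycle has not come back in time.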

\paragraph{Ethernet and EtherCAT Devices}

Another issue lies in the way Linux handles devices of the same type.  For
example, a PCI\nomenclature{PCI}{Peripheral Component Interconnect, Computer
Bus} driver scans the PCI bus for devices it can handle. Then it registers
itself as the responsible driver for all of the devices found. The problem is
that an unmodified driver cannot be told to ignore a device because it will
be used for EtherCAT later. There must be a way to handle multiple devices of
the same type, where one is reserved for EtherCAT, while the other is treated
as an ordinary Ethernet device.

For all these reasons, the author decided that the only acceptable solution is
to modify standard Ethernet drivers in such a way that they keep their normal
functionality, but gain the ability to treat one or more of the devices as
EtherCAT-capable.

Below are the advantages of this solution:

\begin{itemize}
\item No need to tell the standard drivers to ignore certain devices.
\item One networking driver for EtherCAT and non-EtherCAT devices.
\item No need to implement a network driver from scratch and run into
  issues that the original developers have already solved.
\end{itemize}

The chosen approach has the following disadvantages:

\begin{itemize}
\item The modified driver gets more complicated, as it must handle
  EtherCAT and non-EtherCAT devices.
\item Many additional case differentiations in the driver code.
\item Changes and bug fixes on the standard drivers have to be ported
  to the Ether\-CAT-capable versions from time to time.
\end{itemize}

%------------------------------------------------------------------------------

\section{Device Selection}
\label{sec:deviceselection}

After loading the master module, at least one EtherCAT-capable network driver
module has to be loaded, which offers its devices to the master (see
section~\ref{sec:ecdev}). The master module knows the devices to choose from the
module parameters (see section~\ref{sec:mastermod}). If the init script is used
to start the master, the drivers and devices to use can be specified in the
sysconfig file (see section~\ref{sec:sysconfig}).

%------------------------------------------------------------------------------

\section{EtherCAT Device Interface}
\label{sec:ecdev}
\index{Device interface}

To understand how a network device driver module can connect a device to a
specific EtherCAT master, some details of the master module
(section~\ref{sec:mastermod}) have to be anticipated here.

The master module provides a ``device interface'' for network device drivers.
To use this interface, a network device driver module must include the header
\textit{devices/ecdev.h}\nomenclature{ecdev}{EtherCAT Device}, coming with the
EtherCAT master code. This header offers a function interface for EtherCAT
devices. All functions of the device interface are named with the prefix
\lstinline+ecdev+.

The documentation of the device interface can be found in the header file or in
the appropriate module of the interface documentation (see
section~\ref{sec:gendoc} for generation instructions).

\ldots % FIXME general description of the device interface

%------------------------------------------------------------------------------

\section{Patching Network Drivers}
\label{sec:patching}
\index{Network drivers}

This section describes how to make a standard Ethernet driver
EtherCAT-capable. Unfortunately, there is no standard procedure to enable an
Ethernet driver for use with the EtherCAT master, but there are a few common
techniques.

\begin{enumerate}

\item A first simple rule is that \lstinline+netif_*()+ calls must be avoided
for all EtherCAT devices. As mentioned before, EtherCAT devices have no
connection to the network stack, and therefore must not call its interface
functions.

\item Another important point is that EtherCAT devices should be operated
without interrupts. So any calls that register interrupt handlers or enable
interrupts at hardware level must be avoided, too.

\item The master does not use a new socket buffer for each send operation:
Instead, there is a fixed one allocated on master initialization. This socket
buffer is filled with an EtherCAT frame with every send operation and passed to
the \lstinline+hard_start_xmit()+ callback. It is therefore necessary that the
socket buffer is not freed by the network driver as usual.

\end{enumerate}

An Ethernet driver usually handles several Ethernet devices, each described by
a \lstinline+net_device+ structure with a \lstinline+priv_data+ field to
attach driver-dependent data to the structure. To distinguish between normal
Ethernet devices and the ones used by EtherCAT masters, the private data
structure used by the driver could be extended by a pointer that points to an
\lstinline+ec_device_t+ object returned by \lstinline+ecdev_offer()+ (see
section~\ref{sec:ecdev}) if the device is used by a master, and is
\lstinline+NULL+ otherwise.
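
The following user-space sketch illustrates this technique together with the
rules listed above. It is not taken from any real driver: the structure
\lstinline+my_private+, the function \lstinline+my_tx_complete()+ and the
stubbed types are hypothetical and only stand in for the kernel and master
structures. Freeing the socket buffer and waking the queue are simulated by
plain member accesses.

\begin{lstlisting}[gobble=2,language=C]
  #include <stddef.h>

  /* Stand-ins for kernel/master types (assumptions for this sketch). */
  typedef struct { int dummy; } ec_device_t;
  struct sk_buff { int freed; };

  /* Hypothetical driver-private data, extended by an EtherCAT pointer. */
  struct my_private {
          ec_device_t *ecdev; /* NULL for ordinary Ethernet devices */
          int queue_woken;    /* counts simulated netif_wake_queue() calls */
  };

  /* Called after a frame has been transmitted. */
  void my_tx_complete(struct my_private *priv, struct sk_buff *skb)
  {
          if (priv->ecdev)
                  return; /* EtherCAT: keep the master's skb, no netif_*() */

          skb->freed = 1;      /* stands for dev_kfree_skb() */
          priv->queue_woken++; /* stands for netif_wake_queue() */
  }
\end{lstlisting}

Every place in a driver that touches the network stack, the interrupt
configuration or the lifetime of a socket buffer would need a comparable case
differentiation.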

The RealTek RTL-8139 Fast Ethernet driver is a ``simple'' Ethernet driver and
can be taken as an example for patching new drivers. The interesting sections
can be found by searching for the string ``ecdev'' in the file
\textit{devices/8139too-2.6.24-ethercat.c}.

%------------------------------------------------------------------------------

\chapter{State Machines}
\label{sec:fsm}
\index{FSM}

Many parts of the EtherCAT master are implemented as \textit{finite state
machines} (FSMs\nomenclature{FSM}{Finite State Machine}). Though this leads
to a higher degree of complexity in some aspects, it opens up many new
possibilities.

The short code example below shows how to read all slave
states and moreover illustrates the restrictions of ``sequential''
coding:

\begin{lstlisting}[gobble=2,language=C,numbers=left]
  ec_datagram_brd(datagram, 0x0130, 2); // prepare datagram
  if (ec_master_simple_io(master, datagram)) return -1;
  slave_states = EC_READ_U8(datagram->data); // process datagram
\end{lstlisting}

The \textit{ec\_master\_simple\_io()} function provides a simple interface for
synchronously sending a single datagram and receiving the result\footnote{Since
all communication has meanwhile been moved into state machines, the function
is deprecated and no longer exists. Nevertheless, it is adequate for
illustrating its own restrictions.}. Internally, it queues the specified
datagram, invokes the \textit{ec\_master\_send\_datagrams()} function to send
a frame with the queued datagram and then waits actively for its reception.

This sequential approach is very simple, which is reflected in only three
lines of code. The disadvantage is that the master is blocked for the
time it waits for datagram reception. This is no problem when only
one instance is using the master, but if more instances want to
(synchronously\footnote{At this time, synchronous master access will
  be adequate to show the advantages of an FSM. The asynchronous
  approach will be discussed in section~\ref{sec:eoeimp}.}) use the
master, an alternative to the sequential model is inevitably needed.

Master access has to be serialized if more than one instance
wants to send and receive datagrams synchronously. With the present
approach, this would result in one phase of active waiting for
each instance, which would be unacceptable, especially in realtime
circumstances, because of the huge time overhead.

A possible solution is that all instances are executed
sequentially to queue their datagrams, and then pass control to the
next instance instead of waiting for the datagram reception. Finally,
bus IO is done by a higher instance, which means that all queued
datagrams are sent and received. The next step is to execute all
instances again, which then process their received datagrams and issue
new ones.

This approach requires all instances to retain their state
when giving control back to the higher instance. A \textit{finite state
machine} model is the obvious choice in this case.
Section~\ref{sec:fsmtheory} will introduce some of the theory used,
while the listings below show the basic approach by coding the example
from above as a state machine:

\begin{lstlisting}[gobble=2,language=C,numbers=left]
  // state 1
  ec_datagram_brd(datagram, 0x0130, 2); // prepare datagram
  ec_master_queue(master, datagram); // queue datagram
  next_state = state_2;
  // state processing finished
\end{lstlisting}

After all instances executed their current state and queued their
datagrams, these are sent and received. Then the respective next
states are executed:

\begin{lstlisting}[gobble=2,language=C,numbers=left]
  // state 2
  if (datagram->state != EC_DGRAM_STATE_RECEIVED) {
          next_state = state_error;
          return; // state processing finished
  }
  slave_states = EC_READ_U8(datagram->data); // process datagram
  // state processing finished.
\end{lstlisting}

See section~\ref{sec:statemodel} for an introduction to the
state machine programming concept used in the master code.
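
The interplay of several instances with the higher instance that performs the
bus IO can be simulated in a few lines. The following sketch is self-contained
and does not use the master's API: the types \lstinline+instance_t+ and
\lstinline+datagram_t+ as well as the functions \lstinline+bus_io()+ and
\lstinline+run_cycle()+ are hypothetical, and the ``bus'' simply answers every
queued datagram with a fixed value.

\begin{lstlisting}[gobble=2,language=C]
  #include <stddef.h>

  typedef enum { DGRAM_INIT, DGRAM_QUEUED, DGRAM_RECEIVED } dgram_state_t;
  typedef struct { dgram_state_t state; unsigned char data; } datagram_t;

  typedef struct instance instance_t;
  struct instance {
          void (*state)(instance_t *); /* current state function */
          datagram_t datagram;
          unsigned char slave_states;
          int done;
  };

  static void state_2(instance_t *inst)
  {
          if (inst->datagram.state == DGRAM_RECEIVED) {
                  inst->slave_states = inst->datagram.data;
                  inst->done = 1;
          }
  }

  static void state_1(instance_t *inst)
  {
          inst->datagram.state = DGRAM_QUEUED; /* queue, do not wait */
          inst->state = state_2;
  }

  /* Simulated bus IO: every queued datagram comes back with data. */
  static void bus_io(instance_t *inst, size_t n)
  {
          size_t i;
          for (i = 0; i < n; i++) {
                  if (inst[i].datagram.state == DGRAM_QUEUED) {
                          inst[i].datagram.data = 0x08;
                          inst[i].datagram.state = DGRAM_RECEIVED;
                  }
          }
  }

  /* One cycle: run all instances once, then do bus IO for all of them. */
  void run_cycle(instance_t *inst, size_t n)
  {
          size_t i;
          for (i = 0; i < n; i++)
                  inst[i].state(&inst[i]);
          bus_io(inst, n);
  }

  void instance_init(instance_t *inst)
  {
          inst->state = state_1;
          inst->datagram.state = DGRAM_INIT;
          inst->datagram.data = 0;
          inst->slave_states = 0;
          inst->done = 0;
  }
\end{lstlisting}

In this simulation, every instance has its result after two calls of
\lstinline+run_cycle()+, regardless of the number of instances, because there
is only one phase of bus IO per cycle.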

%------------------------------------------------------------------------------

\section{State Machine Theory}
\label{sec:fsmtheory}
\index{FSM!Theory}

A finite state machine \cite{automata} is a model of behavior with
inputs and outputs, where the outputs depend not only on the inputs,
but also on the history of inputs. The mathematical definition of a finite
state machine (or finite automaton) is a six-tuple $(\Sigma, \Gamma,
S, s_0, \delta, \omega)$, with

\begin{itemize}
\item the input alphabet $\Sigma$, with $\Sigma \neq
  \emptyset$, containing all input symbols,
\item the output alphabet $\Gamma$, with $\Gamma \neq
  \emptyset$, containing all output symbols,
\item the set of states $S$, with $S \neq \emptyset$,
\item the set of initial states $s_0$ with $s_0 \subseteq S, s_0 \neq
  \emptyset$,
\item the transition function $\delta: S \times \Sigma \rightarrow S
  \times \Gamma$,
\item the output function $\omega$.
\end{itemize}

The state transition function $\delta$ is often specified by a
\textit{state transition table}, or by a \textit{state transition
  diagram}. The transition table offers a matrix view of the state
machine behavior (see table~\ref{tab:statetrans}). The matrix rows
correspond to the states ($S = \{s_0, s_1, s_2\}$) and the columns
correspond to the input symbols ($\Sigma = \{a, b, \varepsilon\}$).
The table contents in a certain row $i$ and column $j$ then represent
the next state (and possibly the output) for the case that a certain
input symbol $\sigma_j$ is read in the state $s_i$.

\begin{table}[htbp]
  \caption{A typical state transition table}
  \label{tab:statetrans}
  \vspace{2mm}
  \centering
  \begin{tabular}{l|ccc}
    & $a$ & $b$ & $\varepsilon$\\ \hline
    $s_0$ & $s_1$ & $s_1$ & $s_2$\\
    $s_1$ & $s_2$ & $s_1$ & $s_0$\\
    $s_2$ & $s_0$ & $s_0$ & $s_0$\\ \hline
  \end{tabular}
\end{table}

The state diagram for the same example looks like the one in
figure~\ref{fig:statetrans}. The states are represented as circles or
ellipses and the transitions are drawn as arrows between them. The condition
that must be fulfilled to allow a transition can be written next to the
transition arrow. The initial state is marked by a filled black
circle with an arrow pointing to the respective state.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.5\textwidth]{images/statetrans}
  \caption{A typical state transition diagram}
  \label{fig:statetrans}
\end{figure}
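
The transition table in table~\ref{tab:statetrans} can be encoded directly as
a two-dimensional array. The sketch below does this for the next-state part
only; the identifiers (\lstinline+transition+, \lstinline+delta()+ and the
enumeration names) are chosen for illustration and do not appear in the master
code.

\begin{lstlisting}[gobble=2,language=C]
  enum state { S0, S1, S2, NUM_STATES };
  enum input { IN_A, IN_B, IN_EPS, NUM_INPUTS };

  /* Rows: current state; columns: input symbol (a, b, epsilon). */
  static const enum state transition[NUM_STATES][NUM_INPUTS] = {
          /* s0 */ { S1, S1, S2 },
          /* s1 */ { S2, S1, S0 },
          /* s2 */ { S0, S0, S0 },
  };

  /* The transition function delta, restricted to the next state. */
  enum state delta(enum state s, enum input in)
  {
          return transition[s][in];
  }
\end{lstlisting}

For example, \lstinline+delta(S0, IN_A)+ yields \lstinline+S1+, which
corresponds to the first entry of the first table row.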

\paragraph{Deterministic and non-deterministic state machines}

A state machine can be deterministic, meaning that for one state and
input, there is one (and only one) following state. In this case, the
state machine has exactly one starting state. Non-deterministic state
machines can have more than one transition for a single state-input
combination. There is a set of starting states in the latter case.

\paragraph{Moore and Mealy machines}

There is a distinction between so-called \textit{Moore machines} and
\textit{Mealy machines}. Mathematically speaking, the distinction lies
in the output function $\omega$: If it only depends on the current
state ($\omega: S \rightarrow \Gamma$), the machine corresponds to the
``Moore model''. Otherwise, if $\omega$ is a function of the state and
the input ($\omega: S \times \Sigma \rightarrow \Gamma$), the
state machine corresponds to the ``Mealy model''. Mealy machines are
the more practical solution in most cases, because their design allows
machines with a minimum number of states. In practice, a mixture of
both models is often used.

\paragraph{Misunderstandings about state machines}

There is a phenomenon called ``state explosion'', which is often taken as a
counter-argument against the general use of state machines in complex
environments. It has to be mentioned that this argument is
misleading~\cite{fsmmis}. State explosions usually happen as a result of a bad
state machine design: Common mistakes are storing the present values of all
inputs in a state, or not dividing a complex state machine into simpler sub
state machines. The EtherCAT master uses several state machines that are
executed hierarchically and so serve as sub state machines. These are also
described below.

%------------------------------------------------------------------------------

\section{The Master's State Model}
\label{sec:statemodel}

This section will introduce the techniques used in the master to
implement state machines.

\paragraph{State Machine Programming}

There are several ways to implement a state machine in \textit{C}
code. An obvious way is to implement the different states and actions
by one big case differentiation:

\begin{lstlisting}[gobble=2,language=C,numbers=left]
  enum {STATE_1, STATE_2, STATE_3};
  int state = STATE_1;

  void state_machine_run(void *priv_data) {
          switch (state) {
                  case STATE_1:
                          action_1();
                          state = STATE_2;
                          break;
                  case STATE_2:
                          action_2();
                          if (some_condition) state = STATE_1;
                          else state = STATE_3;
                          break;
                  case STATE_3:
                          action_3();
                          state = STATE_1;
                          break;
          }
  }
\end{lstlisting}

For small state machines, this is an option. The disadvantage is that
with an increasing number of states the code soon gets complex and an
additional case differentiation is executed each run. Besides, a lot of
indentation is wasted.

The method used in the master is to implement every state in its own
function and to store the current state function with a function
pointer:

\begin{lstlisting}[gobble=2,language=C,numbers=left]
  void state1(void *);
  void state2(void *);
  void state3(void *);

  void (*state)(void *) = state1;

  void state_machine_run(void *priv_data) {
          state(priv_data);
  }

  void state1(void *priv_data) {
          action_1();
          state = state2;
  }

  void state2(void *priv_data) {
          action_2();
          if (some_condition) state = state1;
          else state = state3;
  }

  void state3(void *priv_data) {
          action_3();
          state = state1;
  }
\end{lstlisting}

In the master code, state pointers of all state machines\footnote{All except
for the EoE state machine, because multiple EoE slaves have to be handled in
parallel. For this reason each EoE handler object has its own state pointer.}
are gathered in a single object of the \lstinline+ec_fsm_master_t+ class. This
is advantageous, because there is always one instance of every state machine
available, which can be started on demand.

\paragraph{Mealy and Moore}

A closer look at the above listing reveals that the
actions executed (the ``outputs'' of the state machine) only depend on the
current state. This corresponds to the ``Moore'' model introduced in
section~\ref{sec:fsmtheory}. As mentioned, the ``Mealy'' model offers a higher
flexibility, which can be seen in the listing below:

\begin{lstlisting}[gobble=2,language=C,numbers=left]
  void state7(void *priv_data) {
          if (some_condition) {
                  action_7a();
                  state = state1;
          }
          else {
                  action_7b();
                  state = state8;
          }
  }
\end{lstlisting}

\begin{description}

\item[\linenum{3} + \linenum{7}] The state function executes the actions
depending on the state transition that is about to be done.

\end{description}

The most flexible alternative is to execute certain actions depending
on the state, followed by some actions dependent on the state
transition:

\begin{lstlisting}[gobble=2,language=C,numbers=left]
  void state9(void *priv_data) {
          action_9();
          if (some_condition) {
                  action_9a();
                  state = state7;
          }
          else {
                  action_9b();
                  state = state10;
          }
  }
\end{lstlisting}

This model is often used in the master. It combines the best aspects of both
approaches.

\paragraph{Using Sub State Machines}

To avoid having too many states, certain functions of the EtherCAT master
state machine have been moved into sub state machines.  This helps to
encapsulate the related workflows and moreover avoids the ``state explosion''
phenomenon described in section~\ref{sec:fsmtheory}. If the master
instead used one big state machine, the number of states would be a multiple of
the actual number. This would increase the level of complexity to an
unmanageable degree.

\paragraph{Executing Sub State Machines}

If a state machine starts to execute a sub state machine, it usually
remains in one state until the sub state machine terminates. This is
usually done as in the listing below, which is taken from the
slave configuration state machine code:

\begin{lstlisting}[gobble=2,language=C,numbers=left]
  void ec_fsm_slaveconf_safeop(ec_fsm_t *fsm)
  {
          fsm->change_state(fsm); // execute state change
                                  // sub state machine

          if (fsm->change_state == ec_fsm_error) {
                  fsm->slave_state = ec_fsm_end;
                  return;
          }

          if (fsm->change_state != ec_fsm_end) return;

          // continue state processing
          ...
\end{lstlisting}

\begin{description}

\item[\linenum{3}] \lstinline+change_state+ is the state pointer of the state
change state machine. The state function the pointer points to is
executed\ldots

\item[\linenum{6}] \ldots either until the state machine terminates with the
error state \ldots

\item[\linenum{11}] \ldots or until the state machine terminates in the end
state. Until then, the ``higher'' state machine remains in the current state
and executes the sub state machine again in the next cycle.

\end{description}
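
The same pattern can be tried out in isolation. The following sketch is a
simplified, self-contained model and not the master's implementation: the type
\lstinline+nested_fsm_t+ and all function names are hypothetical, and the sub
state machine merely counts down a number of cycles before it reaches its end
state.

\begin{lstlisting}[gobble=2,language=C]
  typedef struct nested_fsm nested_fsm_t;
  typedef void (*nested_state_fn)(nested_fsm_t *);

  struct nested_fsm {
          nested_state_fn state;     /* state of the "higher" machine */
          nested_state_fn sub_state; /* state of the sub state machine */
          int sub_steps;             /* cycles the sub machine still needs */
  };

  static void sub_end(nested_fsm_t *fsm) { (void) fsm; }

  static void sub_work(nested_fsm_t *fsm)
  {
          if (--fsm->sub_steps <= 0)
                  fsm->sub_state = sub_end;
  }

  static void parent_end(nested_fsm_t *fsm) { (void) fsm; }

  static void parent_wait(nested_fsm_t *fsm)
  {
          fsm->sub_state(fsm); /* execute sub state machine */
          if (fsm->sub_state != sub_end)
                  return;      /* remain in this state */
          fsm->state = parent_end;
  }

  void nested_fsm_start(nested_fsm_t *fsm, int sub_steps)
  {
          fsm->state = parent_wait;
          fsm->sub_state = sub_work;
          fsm->sub_steps = sub_steps;
  }

  /* Runs one cycle; returns 1 while the higher state machine is busy. */
  int nested_fsm_cycle(nested_fsm_t *fsm)
  {
          fsm->state(fsm);
          return fsm->state != parent_end;
  }
\end{lstlisting}

If the sub state machine needs two cycles, the first call of
\lstinline+nested_fsm_cycle()+ leaves the higher state machine in its waiting
state, and the second call lets it proceed.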

\paragraph{State Machine Descriptions}

The sections below describe every state machine used in the EtherCAT master.
The textual descriptions of the state machines contain references to the
transitions in the corresponding state transition diagrams, which are marked
with an arrow followed by the name of the successive state. Transitions caused
by trivial error cases (e.~g. no response from slave) are not described
explicitly. These transitions are drawn as dashed arrows in the diagrams.

%------------------------------------------------------------------------------

\section{The Master State Machine}
\label{sec:fsm-master}
\index{FSM!Master}

The master state machine is executed in the context of the master thread.
Figure~\ref{fig:fsm-master} shows its transition diagram. Its purposes are:

\begin{figure}[htbp]
  \centering
  \includegraphics[width=\textwidth]{graphs/fsm_master}
  \caption{Transition diagram of the master state machine}
  \label{fig:fsm-master}
\end{figure}

\begin{description}

\item[Bus monitoring] The bus topology is monitored. If it changes, the bus is
(re-)scanned.

\item[Slave configuration] The application-layer states of the slaves are
monitored. If a slave is not in the state it is supposed to be in, the slave
is (re-)configured.

\item[Request handling] Requests (either originating from the application or
from external sources) are handled. A request is a job that the master shall
process asynchronously, for example an SII access, Sdo access, or similar.

\end{description}

%------------------------------------------------------------------------------

\section{The Slave Scan State Machine}
\label{sec:fsm-scan}
\index{FSM!Slave Scan}

The slave scan state machine, which can be seen in
figure~\ref{fig:fsm-slavescan}, leads through the process of reading desired
slave information.

\begin{figure}[htbp]
  \centering
  \includegraphics[height=.8\textheight]{graphs/fsm_slave_scan}
  \caption{Transition diagram of the slave scan state machine}
  \label{fig:fsm-slavescan}
\end{figure}

The scan process includes the following steps:

\begin{description}

\item[Node Address] The node address is set for the slave, so that it can be
node-addressed for all following operations.

\item[AL State] The initial application-layer state is read.

\item[Base Information] Base information (like the number of supported FMMUs)
is read from the lower physical memory.

\item[Data Link] Information about the physical ports is read.

\item[SII Size] The size of the SII contents is determined to allocate SII
image memory.

\item[SII Data] The SII contents are read into the master's image.

\item[PREOP] If the slave supports CoE, it is set to PREOP state using the
State change FSM (see section~\ref{sec:fsm-change}) to enable mailbox
communication and read the Pdo configuration via CoE.

\item[Pdos] The Pdos are read via CoE (if supported) using the Pdo Reading FSM
(see section~\ref{sec:fsm-pdo}). If this is successful, the Pdo information
from the SII (if any) is overwritten.

\end{description}

%------------------------------------------------------------------------------

\section{The Slave Configuration State Machine}
\label{sec:fsm-conf}
\index{FSM!Slave Configuration}

The slave configuration state machine, which can be seen in
figure~\ref{fig:fsm-slaveconf}, leads through the process of configuring a
slave and bringing it to a certain application-layer state.

\begin{figure}[htbp]
  \centering
  \includegraphics[height=.9\textheight]{graphs/fsm_slave_conf}
  \caption{Transition diagram of the slave configuration state
    machine}
  \label{fig:fsm-slaveconf}
\end{figure}

\begin{description}

\item[INIT] The state change FSM is used to bring the slave to the INIT state.

\item[FMMU Clearing] To prevent the slave from reacting to any process data,
the FMMU configurations are cleared. If the slave does not support FMMUs, this
state is skipped. If INIT is the requested state, the state machine is
finished.

\item[Mailbox Sync Manager Configuration] If the slave supports mailbox
communication, the mailbox sync managers are configured. Otherwise this state
is skipped.

\item[PREOP] The state change FSM is used to bring the slave to PREOP state.
If this is the requested state, the state machine is finished.

\item[Sdo Configuration] If there is a slave configuration attached
(see section~\ref{sec:attach}), and there are any Sdo configurations
provided by the application, these are sent to the slave.

\item[Pdo Configuration] The Pdo configuration state machine is executed to
apply all necessary Pdo configurations.

\item[Pdo Sync Manager Configuration] If any Pdo sync managers exist, they are
configured.

\item[FMMU Configuration] If there are FMMU configurations supplied by the
application (i.~e. if the application registered Pdo entries), they are
applied.

\item[SAFEOP] The state change FSM is used to bring the slave to SAFEOP state.
If this is the requested state, the state machine is finished.

\item[OP] The state change FSM is used to bring the slave to OP state.
If this is the requested state, the state machine is finished.

\end{description}

%------------------------------------------------------------------------------

\section{The State Change State Machine}
\label{sec:fsm-change}
\index{FSM!State Change}

The state change state machine, which can be seen in
figure~\ref{fig:fsm-change}, leads through the process of changing a slave's
application-layer state. This implements the states and transitions described
in \cite[section~6.4.1]{alspec}.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.6\textwidth]{graphs/fsm_change}
  \caption{Transition Diagram of the State Change State Machine}
  \label{fig:fsm-change}
\end{figure}

\begin{description}

\item[Start] The new application-layer state is requested via the ``AL Control
Request'' register (see~\cite[section 5.3.1]{alspec}).

\item[Check for Response] Some slaves need some time to respond to an AL state
change command. In this case, the command is issued again until it is
acknowledged.

\item[Check AL Status] If the AL State change datagram was acknowledged, the
``AL Control Response'' register (see~\cite[section 5.3.2]{alspec}) must be
read out until the slave changes the AL state.

\item[AL Status Code] If the slave refused the state change command, the
reason can be read from the ``AL Status Code'' field in the ``AL State
Changed'' registers (see~\cite[section 5.3.3]{alspec}).

\item[Acknowledge State] If the state change was not successful, the master
has to acknowledge the old state by writing to the ``AL Control Request''
register again.

\item[Check Acknowledge] After sending the acknowledge command, the master has
to read out the ``AL Control Response'' register again.

\end{description}

The ``start\_ack'' state is a shortcut in the state machine for the case that
the master wants to acknowledge a spontaneous AL state change that was not
requested.

%------------------------------------------------------------------------------

\section{The SII State Machine}
\label{sec:fsm-sii}
\index{FSM!SII}

The SII\index{SII} state machine (shown in figure~\ref{fig:fsm-sii})
implements the process of reading or writing SII data via the
Slave Information Interface described in \cite[section~6.4]{dlspec}.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.5\textwidth]{graphs/fsm_sii}
  \caption{Transition Diagram of the SII State Machine}
  \label{fig:fsm-sii}
\end{figure}

This is how the reading part of the state machine works:

\begin{description}

\item[Start Reading] The read request and the requested word address are
written to the SII attribute.

\item[Check Read Command] If the SII read request command has been
acknowledged, a timer is started. A datagram is issued that reads out the SII
attribute for state and data.

\item[Fetch Data] If the read operation is still busy (the SII is usually
implemented as an E$^2$PROM), the state is read again. Otherwise the data are
copied from the datagram.

\end{description}

The writing part works similarly:

\begin{description}

\item[Start Writing] A write request, the target address and the data word are
written to the SII attribute.

\item[Check Write Command] If the SII write request command has been
acknowledged, a timer is started. A datagram is issued that reads out the SII
attribute for the state of the write operation.

\item[Wait while Busy] If the write operation is still busy (determined by a
minimum wait time and the state of the busy flag), the state machine remains in
this state to prevent another write operation from being issued too early.

\end{description}

%------------------------------------------------------------------------------

\section{The Pdo State Machines}
\label{sec:fsm-pdo}
\index{FSM!Pdo}

The Pdo state machines are a set of state machines that read or write the Pdo
assignment and the Pdo mapping via the ``CoE Communication Area'' described in
\cite[section 5.6.7.4]{alspec}. For the object access, the
CANopen-over-EtherCAT access primitives are used (see
section~\ref{sec:coeimp}), so the slave must support the CoE mailbox protocol.

\paragraph{Pdo Reading FSM} This state machine (fig.~\ref{fig:fsm-pdo-read})
serves to read the complete Pdo configuration of a slave. It reads
the Pdo assignment for each Sync Manager and uses the Pdo Entry Reading FSM
(fig.~\ref{fig:fsm_pdo_entry_read}) to read the mapping for each assigned Pdo.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.4\textwidth]{graphs/fsm_pdo_read}
  \caption{Transition Diagram of the Pdo Reading State Machine}
  \label{fig:fsm-pdo-read}
\end{figure}

Basically, it reads the number of elements of every sync manager's Pdo
assignment Sdo (\lstinline+0x1C1x+) to determine the number of assigned
Pdos for this sync manager and then reads out the subindices of the Sdo to get
the assigned Pdos' indices. When a Pdo index is read, the Pdo Entry Reading
FSM is executed to read the Pdo's mapped Pdo entries.

\paragraph{Pdo Entry Reading FSM} This state machine
(fig.~\ref{fig:fsm_pdo_entry_read}) reads the Pdo mapping (the Pdo entries)
of a Pdo. It reads the respective mapping Sdo (\lstinline+0x1600+ -
\lstinline+0x17ff+, or \lstinline+0x1a00+ - \lstinline+0x1bff+) for the given
Pdo by first reading subindex zero (number of elements) to determine the
number of mapped Pdo entries. After that, each subindex is read to get the
mapped Pdo entry index, subindex and bit size.
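
How a single mapping value can be split into these three parts is sketched
below. The sketch assumes the usual CANopen layout of a 32-bit mapping entry
(object index in the upper 16 bits, followed by 8 bits of subindex and 8 bits
of bit length); the type \lstinline+pdo_entry_info_t+ and the function
\lstinline+pdo_entry_decode()+ are hypothetical and not part of the master
code.

\begin{lstlisting}[gobble=2,language=C]
  #include <stdint.h>

  typedef struct {
          uint16_t index;
          uint8_t subindex;
          uint8_t bit_length;
  } pdo_entry_info_t;

  /* Splits a 32-bit mapping value into index, subindex and bit size. */
  pdo_entry_info_t pdo_entry_decode(uint32_t value)
  {
          pdo_entry_info_t e;
          e.index = (uint16_t) (value >> 16);
          e.subindex = (uint8_t) ((value >> 8) & 0xFF);
          e.bit_length = (uint8_t) (value & 0xFF);
          return e;
  }
\end{lstlisting}

With this layout, the value \lstinline+0x60000110+ describes the entry with
index \lstinline+0x6000+, subindex 1 and a size of 16 bits.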

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.4\textwidth]{graphs/fsm_pdo_entry_read}
  \caption{Transition Diagram of the Pdo Entry Reading State Machine}
  \label{fig:fsm_pdo_entry_read}
\end{figure}

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.9\textwidth]{graphs/fsm_pdo_conf}
  \caption{Transition Diagram of the Pdo Configuration State Machine}
  \label{fig:fsm-pdo-conf}
\end{figure}

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.4\textwidth]{graphs/fsm_pdo_entry_conf}
  \caption{Transition Diagram of the Pdo Entry Configuration State Machine}
  \label{fig:fsm-pdo-entry-conf}
\end{figure}

%------------------------------------------------------------------------------

\chapter{Mailbox Protocol Implementations}
\index{Mailbox}

The EtherCAT master implements the EoE and the CoE mailbox
protocols. See the sections below for details.

%------------------------------------------------------------------------------

\section{Ethernet-over-EtherCAT (EoE)}
\label{sec:eoeimp}
\index{EoE}

The EtherCAT master implements the Ethernet-over-EtherCAT mailbox protocol to
enable the tunneling of Ethernet frames to special slaves, which can either
have physical Ethernet ports to forward the frames to, or have their own IP
stack to receive the frames.

\paragraph{Virtual Network Interfaces}

The master creates a virtual EoE network interface for every EoE-capable
slave. These interfaces are called either

\begin{description}

\item[eoeXsY] for a slave without an alias address (see
section~\ref{sec:alias}), where X is the master index and Y is the slave's
ring position, or

\item[eoeXaY] for a slave with a non-zero alias address, where X is the master
index and Y is the decimal alias address.

\end{description}
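
The naming scheme can be expressed in a few lines of C. The function
\lstinline+eoe_ifname()+ below is a hypothetical helper written for this
document; it only illustrates the scheme and is not the code used by the
master.

\begin{lstlisting}[gobble=2,language=C]
  #include <stdio.h>
  #include <stdint.h>

  /* Writes the interface name to buf; returns the snprintf() result. */
  int eoe_ifname(char *buf, size_t size, unsigned int master_index,
                  uint16_t alias, uint16_t ring_position)
  {
          if (alias) /* non-zero alias address: eoeXaY */
                  return snprintf(buf, size, "eoe%ua%u",
                                  master_index, (unsigned int) alias);

          /* no alias address: eoeXsY */
          return snprintf(buf, size, "eoe%us%u",
                          master_index, (unsigned int) ring_position);
  }
\end{lstlisting}

For master 0, a slave without an alias at ring position 5 would get the name
\lstinline+eoe0s5+, while a slave with the alias 4660 on master 1 would get
\lstinline+eoe1a4660+.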

Frames sent to these interfaces are forwarded to the associated slaves by the
master. Frames that are received by the slaves are fetched by the master and
forwarded to the virtual interfaces.

This has the following advantages:

\begin{itemize}

\item Flexibility: The user can decide how the EoE-capable slaves are
interconnected with the rest of the world.

\item Standard tools can be used to monitor the EoE activity and to configure
the EoE interfaces.

\item The Linux kernel's layer-2-bridging implementation (according to the
IEEE 802.1D MAC Bridging standard) can be used natively to bridge Ethernet
traffic between EoE-capable slaves.

\item The Linux kernel's network stack can be used to route packets between
EoE-capable slaves and to track security issues, just as with physical
network interfaces.

\end{itemize}

\paragraph{EoE Handlers}

The virtual EoE interfaces and the related functionality are encapsulated in
the \lstinline+ec_eoe_t+ class. An object of this class is called ``EoE
handler''. For example, the master does not create the network interfaces
directly: This is done inside the constructor of an EoE handler. An EoE
handler additionally contains a frame queue. Each time the kernel passes a
new socket buffer for sending via the interface's
\lstinline+hard_start_xmit()+ callback, the socket buffer is queued for
transmission by the EoE state machine (see below). If the queue gets filled
up, the passing of new socket buffers is suspended with a call to
\lstinline+netif_stop_queue()+.

\paragraph{Creation of EoE Handlers}

During bus scanning (see section~\ref{sec:fsm-scan}), the master determines
the supported mailbox protocols for each slave. This is done by examining the
``Supported Mailbox Protocols'' mask field at word address 0x001C of the
SII\index{SII}. If bit 1 is set, the slave supports the EoE protocol. In this
case, an EoE handler is created for that slave.
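
The bit test described above can be sketched as follows. The macro and
function names are assumptions made for this illustration; only the bit
position is taken from the text.

\begin{lstlisting}[language=C]
#include <stdint.h>

/* Bit 1 of the "Supported Mailbox Protocols" mask signals EoE support.
 * The names used here are hypothetical. */
#define MBOX_PROTO_EOE (1u << 1)

/* Returns non-zero if the mask read from the SII announces EoE. */
static int slave_supports_eoe(uint16_t mailbox_protocols)
{
    return (mailbox_protocols & MBOX_PROTO_EOE) != 0;
}
\end{lstlisting}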

\paragraph{EoE State Machine}
\index{FSM!EoE}

Every EoE handler owns an EoE state machine that is used to send frames to
the corresponding slave and receive frames from it via the EoE
communication primitives. This state machine is shown in
figure~\ref{fig:fsm-eoe}.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.7\textwidth]{images/fsm-eoe} % FIXME
  \caption{Transition Diagram of the EoE State Machine}
  \label{fig:fsm-eoe}
\end{figure}

% FIXME

\begin{description}
\item[RX\_START] The beginning state of the EoE state machine. A
  mailbox check datagram is sent, to query the slave's mailbox for new
  frames. $\rightarrow$~RX\_CHECK

\item[RX\_CHECK] The mailbox check datagram is received. If the
  slave's mailbox did not contain data, a transmit cycle is started.
  $\rightarrow$~TX\_START

  If there are new data in the mailbox, a datagram is sent to fetch
  the new data. $\rightarrow$~RX\_FETCH

\item[RX\_FETCH] The fetch datagram is received. If the mailbox data
  do not contain an ``EoE Fragment request'' command, the data are
  dropped and a transmit sequence is started.
  $\rightarrow$~TX\_START

  If the received Ethernet frame fragment is the first fragment, a new
  socket buffer is allocated. In either case, the data are copied into
  the correct position of the socket buffer.

  If the fragment is the last fragment, the socket buffer is forwarded
  to the network stack and a transmit sequence is started.
  $\rightarrow$~TX\_START

  Otherwise, a new receive sequence is started to fetch the next
  fragment. $\rightarrow$~RX\_\-START

\item[TX\_START] The beginning state of a transmit sequence. It is
  checked whether the transmission queue contains a frame to send. If not,
  a receive sequence is started. $\rightarrow$~RX\_START

  If there is a frame to send, it is dequeued. If the queue was
  inactive before (because it was full), the queue is woken up with a
  call to \textit{netif\_wake\_queue()}. The first fragment of the
  frame is sent. $\rightarrow$~TX\_SENT

\item[TX\_SENT] It is checked whether the first fragment was sent
  successfully. If the current frame consists of further fragments,
  the next one is sent. $\rightarrow$~TX\_SENT

  If the last fragment was sent, a new receive sequence is started.
  $\rightarrow$~RX\_START
\end{description}
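
The transitions listed above can be summarized in a small transition
function. This is a minimal sketch and not the master's implementation: the
type and field names are hypothetical, and datagram handling as well as error
cases are left out.

\begin{lstlisting}[language=C]
#include <stdbool.h>

typedef enum {
    RX_START, RX_CHECK, RX_FETCH, TX_START, TX_SENT
} eoe_state_t;

/* Conditions evaluated in the current state (hypothetical names). */
typedef struct {
    bool mailbox_has_data; /* RX_CHECK: the mailbox contains new data */
    bool is_eoe_fragment;  /* RX_FETCH: data are an EoE fragment request */
    bool last_fragment;    /* RX_FETCH, TX_SENT: it was the last fragment */
    bool frame_queued;     /* TX_START: a frame is waiting in the queue */
} eoe_input_t;

/* Returns the state that follows the given state. */
static eoe_state_t eoe_next_state(eoe_state_t state, const eoe_input_t *in)
{
    switch (state) {
    case RX_START:
        return RX_CHECK;
    case RX_CHECK:
        return in->mailbox_has_data ? RX_FETCH : TX_START;
    case RX_FETCH:
        if (!in->is_eoe_fragment)
            return TX_START; /* data are dropped */
        return in->last_fragment ? TX_START : RX_START;
    case TX_START:
        return in->frame_queued ? TX_SENT : RX_START;
    case TX_SENT:
        return in->last_fragment ? RX_START : TX_SENT;
    }

    return RX_START;
}
\end{lstlisting}

The sketch makes visible that receive and transmit sequences alternate: every
path out of the receive states ends in TX\_START, and every path out of the
transmit states ends in RX\_START.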

\paragraph{EoE Processing}

To execute the EoE state machine of every active EoE handler, there must be a
cyclic process. The easiest solution would be to execute the EoE state
machines synchronously with the master state machine (see
section~\ref{sec:fsm-master}). This approach has the following disadvantage:

Only one EoE fragment could be sent or received every few cycles. This
would cause the data rate to be very low, because the EoE state machines
would not be executed in the time between the application cycles. Moreover,
the data rate would depend on the period of the application task.

To overcome this problem, a separate cyclic process is needed to
asynchronously execute the EoE state machines. For that, the master owns a
kernel timer that is executed on each timer interrupt. This guarantees a
constant bandwidth, but
poses the new problem of concurrent access to the master. The locking
mechanisms needed for this are introduced in section~\ref{sec:concurr}.
Section~\ref{sec:concurrency} gives practical implementation examples.

\paragraph{Automatic Configuration}

By default, slaves are left in PREOP state if no configuration is applied. If
an EoE interface link is set to ``up'', the requested slave's
application-layer state is automatically set to OP.

%------------------------------------------------------------------------------

\section{CANopen-over-EtherCAT (CoE)}
\label{sec:coeimp}
\index{CoE}

The CANopen-over-EtherCAT protocol \cite[section~5.6]{alspec} is used to
configure slaves and exchange data objects on application level.

% FIXME
%
% Download / Upload
% Expedited / Normal
% Segmentung
% Sdo Info Services
%

\ldots

\paragraph{Sdo Download State Machine}

The best time to apply Sdo configurations is during the slave's PREOP
state, because mailbox communication is already possible and the slave's
application will start updating input data in the succeeding
SAFEOP state. Therefore the Sdo configuration has to be part of the
slave configuration state machine (see section~\ref{sec:fsm-conf}): It
is implemented via an Sdo download state machine that is executed
just before entering the slave's SAFEOP state. In this way, it is
guaranteed that the Sdo configurations are applied each time the
slave is reconfigured.

The transition diagram of the Sdo Download state machine can be seen
in figure~\ref{fig:fsm-coedown}.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=.9\textwidth]{images/fsm-coedown} % FIXME
  \caption{Transition diagram of the CoE download state machine}
  \label{fig:fsm-coedown}
\end{figure}

% FIXME

\begin{description}
\item[START] The beginning state of the CoE download state
  machine. The ``Sdo Download Normal Request'' mailbox command is
  sent. $\rightarrow$~REQUEST

\item[REQUEST] It is checked whether the CoE download request has been
  received by the slave. After that, a mailbox check command is issued
  and a timer is started. $\rightarrow$~CHECK

\item[CHECK] If no mailbox data is available, the timer is checked.
  \begin{itemize}
  \item If it timed out, the Sdo download is aborted.
    $\rightarrow$~ERROR
  \item Otherwise, the mailbox is queried again.
    $\rightarrow$~CHECK
  \end{itemize}

  If the mailbox contains new data, the response is fetched.
  $\rightarrow$~RESPONSE

\item[RESPONSE] If the mailbox response could not be fetched, the data
  is invalid, the wrong protocol was received, or an ``Abort Sdo
  Transfer Request'' was received, the Sdo download is aborted.
  $\rightarrow$~ERROR

  If an ``Sdo Download Normal Response'' acknowledgement was received,
  the Sdo download was successful. $\rightarrow$~END

\item[END] The Sdo download was successful.

\item[ERROR] The Sdo download was aborted due to an error.

\end{description}
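
The polling loop with timeout in the CHECK state is the central part of this
state machine. The following sketch shows the transitions described above as
a plain C function. All names are hypothetical; a failed request in the
REQUEST state is not modeled, because it is not described above.

\begin{lstlisting}[language=C]
#include <stdbool.h>

typedef enum {
    COE_START, COE_REQUEST, COE_CHECK, COE_RESPONSE, COE_END, COE_ERROR
} coe_down_state_t;

/* Conditions evaluated in the current state (hypothetical names). */
typedef struct {
    bool mailbox_has_data; /* CHECK: the mailbox contains new data */
    bool timed_out;        /* CHECK: the response timer has expired */
    bool response_ok;      /* RESPONSE: a valid download response came in */
} coe_down_input_t;

/* Returns the state that follows the given state. */
static coe_down_state_t coe_down_next_state(coe_down_state_t state,
        const coe_down_input_t *in)
{
    switch (state) {
    case COE_START:
        return COE_REQUEST;
    case COE_REQUEST:
        return COE_CHECK;
    case COE_CHECK:
        if (in->mailbox_has_data)
            return COE_RESPONSE;
        /* no data yet: abort on timeout, otherwise query again */
        return in->timed_out ? COE_ERROR : COE_CHECK;
    case COE_RESPONSE:
        return in->response_ok ? COE_END : COE_ERROR;
    case COE_END:
    case COE_ERROR:
        return state; /* final states */
    }

    return COE_ERROR;
}
\end{lstlisting}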

%------------------------------------------------------------------------------

\chapter{User Space}
\label{sec:user}
\index{User space}

% FIXME

Because the master runs as a kernel module, access to it is natively limited
to analyzing Syslog messages and controlling it using modutils.

It is necessary to implement further interfaces that make it easier to access
the master from user space and allow finer control. It should be possible
to view and to change special parameters at runtime.

Bus visualization is a second point: For development and debugging purposes it
would be nice, if one could show the connected slaves with a single command.

Another aspect is automatic startup and configuration. If the master is to be
integrated into a running system, it must be able to automatically start with
a persistent configuration.

A last thing is monitoring EtherCAT communication. For debugging purposes,
there has to be a way to analyze EtherCAT datagrams. The best way would be
with a popular network analyzer, like Wireshark \cite{wireshark} (the former
Ethereal) or others.

This chapter covers all those points and introduces the interfaces and tools
to make all that possible.

%------------------------------------------------------------------------------

\section{Command-line Tool}
\label{sec:ethercat}

% --master

\subsection{Character Devices}
\label{sec:cdev}

Each master instance will get a character device as a user-space interface.
The devices are named \textit{/dev/EtherCATX}, where $X$ is the index of the
master.

% FIXME
% udev
% rights

%------------------------------------------------------------------------------

\subsection{Setting Alias Addresses}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_alias}

%------------------------------------------------------------------------------

\subsection{Displaying the Bus Configuration}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_config}

%------------------------------------------------------------------------------

\subsection{Displaying Process Data}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_data}

%------------------------------------------------------------------------------

\subsection{Setting a Master's Debug Level}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_debug}

%------------------------------------------------------------------------------

\subsection{Configured Domains}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_domains}

%------------------------------------------------------------------------------

\subsection{Master and Ethernet Devices}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_master}

%------------------------------------------------------------------------------

\subsection{Sync Managers, Pdos and Pdo Entries}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_pdos}

%------------------------------------------------------------------------------

\subsection{Sdo Dictionary}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_sdos}

%------------------------------------------------------------------------------

\subsection{Sdo Access}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_download}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_upload}

%------------------------------------------------------------------------------

\subsection{Slaves on the Bus}

Slave information can be gathered with the subcommand \lstinline+slaves+:

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_slaves}

Below is a typical output:

\begin{lstlisting}
$ `\textbf{ethercat slaves}`
0     0:0  PREOP  +  EK1100 Ethernet Kopplerklemme (2A E-Bus)
1  5555:0  PREOP  +  EL3162 2K. Ana. Eingang 0-10V
2  5555:1  PREOP  +  EL4102 2K. Ana. Ausgang 0-10V
3  5555:2  PREOP  +  EL2004 4K. Dig. Ausgang 24V, 0,5A
\end{lstlisting}

%------------------------------------------------------------------------------

\subsection{SII Access}
\label{sec:siiaccess}
\index{SII!Access}

It is possible to directly read or write the complete SII contents of the
slaves. This was introduced for the reasons below:

\begin{itemize}

\item The format of the SII data is still in development and categories can be
added in the future. With read and write access, the complete memory contents
can be easily backed up and restored.

\item Some SII data fields have to be altered (like the alias address). Quick
write access must be possible for that.

\item Through reading access, analyzing category data is possible from user
space.

\end{itemize}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_sii_read}

Reading out SII data is as easy as other commands. Because the data are in
binary format, analysis is easier with a tool like \textit{hexdump}:

\begin{lstlisting}
$ `\textbf{ethercat sii\_read --position 3 | hexdump}`
0000000 0103 0000 0000 0000 0000 0000 0000 008c
0000010 0002 0000 3052 07f0 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
...
\end{lstlisting}

Backing up SII contents can easily be done with a redirection:

\begin{lstlisting}
$ `\textbf{ethercat sii\_read --position 3 > sii-of-slave3.bin}`
\end{lstlisting}
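
A backed-up image can then be analyzed from user space. The following sketch
reads a single word from an SII image held in memory, for example the
``Supported Mailbox Protocols'' field at word address 0x001C. The helper
\lstinline+sii_get_word()+ is hypothetical, and the sketch assumes that the
image stores each 16-bit word in little-endian byte order.

\begin{lstlisting}[language=C]
#include <stddef.h>
#include <stdint.h>

/* Word address of the "Supported Mailbox Protocols" field. */
#define SII_WORD_MAILBOX_PROTOCOLS 0x001C

/* Hypothetical helper: reads the 16-bit word at the given SII word
 * address (assumed little-endian). Returns 0 on success and -1 if the
 * image is too short. */
static int sii_get_word(const uint8_t *image, size_t size,
        size_t word_addr, uint16_t *value)
{
    size_t offset = word_addr * 2; /* SII addresses count 16-bit words */

    if (offset + 2 > size)
        return -1;

    *value = (uint16_t) (image[offset] | (image[offset + 1] << 8));
    return 0;
}
\end{lstlisting}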

To download SII contents to a slave, write access to the master's character
device is necessary (see section~\ref{sec:cdev}).

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_sii_write}

\begin{lstlisting}
# `\textbf{ethercat sii\_write --position 3 sii-of-slave3.bin}`
\end{lstlisting}

The SII contents will be checked for validity and then sent to the slave. The
write operation may take a few seconds.

%------------------------------------------------------------------------------

\subsection{Requesting Application-Layer States}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_states}

%------------------------------------------------------------------------------

\subsection{Generating Slave Description XML}

\lstinputlisting[basicstyle=\ttfamily\footnotesize]{external/ethercat_xml}

%------------------------------------------------------------------------------

\section{System Integration}
\label{sec:system}

To integrate the EtherCAT master as a service into a running system, it comes
with an init script and a sysconfig file, that are described below.

\subsection{Init Script}
\label{sec:init}
\index{Init script}

The EtherCAT master init script conforms to the requirements of the ``Linux
Standard Base'' (LSB\index{LSB}, \cite{lsb}). The script is installed to
\textit{etc/init.d/ethercat} below the installation prefix and has to be
copied (or better: linked) to the appropriate location (see
section~\ref{sec:install}), before the master can be inserted as a service.
Please note, that the init script depends on the sysconfig file described
below.

To provide service dependencies (i.~e. which services have to be started before
others) inside the init script code, LSB defines a special comment block.
System tools can extract this information to insert the EtherCAT init script at
the correct place in the startup sequence:

\lstinputlisting[firstline=38,lastline=48]
    {../script/init.d/ethercat}

\subsection{Sysconfig File}
\label{sec:sysconfig}
\index{Sysconfig file}

For persistent configuration, the init script uses a sysconfig file installed
to \textit{etc/sysconfig/ethercat} (below the installation prefix), that is
mandatory for the init script. The sysconfig file contains all configuration
variables needed to operate one or more masters. The documentation is inside
the file and included below:

\lstinputlisting[numbers=left,firstline=9,basicstyle=\ttfamily\scriptsize]
    {../script/sysconfig/ethercat}

\subsection{Starting the Master as a Service}
\label{sec:service}
\index{Service}

After the init script and the sysconfig file are placed into the right
location, the EtherCAT master can be inserted as a service. The different Linux
distributions offer different ways to mark a service for starting and stopping
in certain runlevels. For example, SUSE Linux provides the \textit{insserv}
command:

\begin{lstlisting}
# `\textbf{insserv ethercat}`
\end{lstlisting}

The init script can also be used for manually starting and stopping
the EtherCAT master. It has to be executed with one of the parameters
\texttt{start}, \texttt{stop}, \texttt{restart} or \texttt{status}.

\begin{lstlisting}[gobble=2]
  # `\textbf{/etc/init.d/ethercat restart}`
  Shutting down EtherCAT master                done
  Starting EtherCAT master                     done
\end{lstlisting}

%------------------------------------------------------------------------------

\section{Monitoring and Debugging}
\label{sec:debug}
\index{Monitoring}

% FIXME

For debugging purposes, every EtherCAT master registers a read-only network
interface \textit{ecX}, where X is a number, provided by the kernel on device
registration. While it is ``up'', the master forwards every frame sent and
received to this interface.

This makes it possible to connect a network monitor (like Wireshark or
tcpdump) to the debug interface and monitor the EtherCAT frames.

% FIXME schedule()
It has to be considered that the frame rate can be very high. The master
state machine usually runs every kernel timer interrupt (usually up to
\unit{1}{\kilo\hertz}) and with a connected application, the rate can be even
higher.

\paragraph{Attention:} The socket buffers needed for the operation of
the debugging interface have to be allocated dynamically. Some Linux
realtime extensions do not allow this in realtime context!

%------------------------------------------------------------------------------

\chapter{Timing Aspects}
\label{sec:timing}

Although EtherCAT's timing is highly deterministic and therefore timing issues
are rare, there are a few aspects that can (and should) be dealt with.

%------------------------------------------------------------------------------

\subsection{Application Interface Profiling}
\label{sec:timing-profile}
\index{Realtime!Profiling}

One of the most important timing aspects is the execution time of the
realtime interface functions that are called in cyclic context. These
functions make up an important part of the overall timing of the application.
To measure the timing of the functions, the following code was used:

\begin{lstlisting}[gobble=2,language=C]
  c0 = get_cycles();
  ecrt_master_receive(master);
  c1 = get_cycles();
  ecrt_domain_process(domain1);
  c2 = get_cycles();
  ecrt_master_run(master);
  c3 = get_cycles();
  ecrt_master_send(master);
  c4 = get_cycles();
\end{lstlisting}

Between each call of an interface function, the CPU timestamp counter is read.
The counter differences are converted to \micro\second\ with help of the
\lstinline+cpu_khz+ variable, that contains the number of increments per
\milli\second.
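
The conversion can be sketched as follows. The function name is hypothetical,
and floating-point arithmetic is used only to keep the example short; kernel
code would typically use integer arithmetic instead.

\begin{lstlisting}[language=C]
/* Converts a timestamp counter difference to microseconds. cpu_khz is
 * the number of counter increments per millisecond. */
static double cycles_to_us(unsigned long long cycles, unsigned int cpu_khz)
{
    return (double) cycles * 1000.0 / (double) cpu_khz;
}
\end{lstlisting}

For the measured code, a call like \lstinline+cycles_to_us(c1 - c0, cpu_khz)+
would give the duration of \lstinline+ecrt_master_receive()+.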

For the actual measurement, a system with a \unit{2.0}{\giga\hertz} CPU was
used that ran the above code in an RTAI thread with a period of
\unit{100}{\micro\second}. The measurement was repeated $n = 100$ times and
the results were averaged. These can be seen in table~\ref{tab:profile}.

\begin{table}[htpb]
  \centering
  \caption{Profiling of a Realtime Cycle on a \unit{2.0}{\giga\hertz}
  Processor}
  \label{tab:profile}
  \vspace{2mm}
  \begin{tabular}{l|r|r}
    Element & Mean Duration [\micro\second] & Standard Deviation [\micro\second] \\
    \hline
    \textit{ecrt\_master\_receive()} & 8.04 & 0.48\\
    \textit{ecrt\_domain\_process()} & 0.14 & 0.03\\
    \textit{ecrt\_master\_run()} & 0.29 & 0.12\\
    \textit{ecrt\_master\_send()} & 2.18 & 0.17\\ \hline
    Complete Cycle & 10.65 & 0.69\\ \hline
  \end{tabular}
\end{table}

It is obvious that the functions accessing hardware make up the
lion's share. The \textit{ecrt\_master\_receive()} function executes the
ISR of the Ethernet device, analyzes datagrams and copies their contents
into the memory of the datagram objects. The
\textit{ecrt\_master\_send()} function assembles a frame out of different
datagrams and copies it to the hardware buffers. Interestingly, this
takes only about a quarter of the receiving time.

The functions that only operate on the master's internal data structures are
very fast ($\Delta t < \unit{1}{\micro\second}$). Interestingly, the runtime of
\textit{ecrt\_domain\_process()} has a small standard deviation relative to the
mean value, while this ratio is about twice as big for
\textit{ecrt\_master\_run()}: This probably results from the latter function
having to execute code depending on the current state, with the different
state functions being more or less complex.

Since a realtime cycle takes about \unit{10}{\micro\second}, the theoretical
frequency can be up to \unit{100}{\kilo\hertz}. For two reasons, this frequency
remains theoretical:

\begin{enumerate}

\item The processor must still be able to run the operating system between the
realtime cycles.

\item The EtherCAT frame must be sent and received, before the next realtime
cycle begins. The determination of the bus cycle time is difficult and covered
in section~\ref{sec:timing-bus}.

\end{enumerate}
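
As a worked example, the theoretical limit follows from the measured mean
duration of a complete cycle in table~\ref{tab:profile}:

\[
  f_{\mathrm{max}} = \frac{1}{\unit{10.65}{\micro\second}}
    \approx \unit{93.9}{\kilo\hertz}
\]

This is in line with the rounded value of about \unit{100}{\kilo\hertz}
given above.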

%------------------------------------------------------------------------------

\subsection{Bus Cycle Measuring}
\label{sec:timing-bus}
\index{Bus cycle}

To measure the time a frame is ``on the wire'', two timestamps
must be taken:

\begin{enumerate}
\item The time, the Ethernet hardware begins with physically sending
  the frame.
\item The time, the frame is completely received by the Ethernet
  hardware.
\end{enumerate}

Both times are difficult to determine. The first reason is that the
interrupts are disabled and the master is not notified when a frame
is sent or received (polling would distort the results). The second
reason is that even with interrupts enabled, the time from the event
to the notification is unknown. Therefore the only way to confidently
determine the bus cycle time is an electrical measurement.

Anyway, the bus cycle time is an important factor when designing realtime code,
because it limits the maximum frequency for the cyclic task of the application.
In practice, these timing parameters are highly dependent on the hardware and
often a trial and error method must be used to determine the limits of the
system.

The central question is: What happens, if the cycle frequency is too high? The
answer is, that the EtherCAT frames that have been sent at the end of the cycle
are not yet received when the next cycle starts. This is first noticed by
\textit{ecrt\_domain\_process()}, because the working counters of the process
data datagrams were not increased. The function will notify the user via
Syslog\footnote{To limit Syslog output, a mechanism has been implemented, that
outputs a summarized notification at maximum once a second.}. In this case, the
process data keeps being the same as in the last cycle, because it is not
erased by the domain. When the domain datagrams are queued again, the master
notices, that they are already queued (and marked as sent). The master will
mark them as unsent again and output a warning, that datagrams were
``skipped''.

On the mentioned \unit{2.0}{\giga\hertz} system, the possible cycle frequency
can be up to \unit{25}{\kilo\hertz} without skipped frames. This value can
surely be increased by choosing faster hardware. Especially the RealTek network
hardware could be replaced by a faster one. Besides, implementing a dedicated
ISR for EtherCAT devices would also contribute to decreasing the latency. These
are two points on the author's to-do list.

%------------------------------------------------------------------------------

\chapter{Installation}
\label{sec:installation}
\index{Master!Installation}

\section{Building the Software}

The current EtherCAT master code is available at~\cite{etherlab} or can be
obtained from the EtherLab CD. The \textit{tar.bz2} file has to be unpacked
with the commands below (or similar):

\begin{lstlisting}[gobble=2]
  $ `\textbf{tar xjf ethercat-\masterversion.tar.bz2}`
  $ `\textbf{cd ethercat-\masterversion/}`
\end{lstlisting}

The tarball was created with GNU Autotools, so the build process
follows the commands below:

\begin{lstlisting}[gobble=2]
  $ `\textbf{./configure}`
  $ `\textbf{make}`
  $ `\textbf{make modules}`
\end{lstlisting}

Table~\ref{tab:config} lists important configuration switches and options.

\begin{table}
  \caption{Configuration options}
  \label{tab:config}
  \vspace{2mm}
  \begin{tabular}{l|p{.3\textwidth}|l}

\bf Option/Switch & \bf Description & \bf Default\\\hline

\lstinline+--prefix+ & Installation prefix & \textit{/opt/etherlab}\\

\lstinline+--with-linux-dir+ & Linux kernel sources & Use running kernel\\

\lstinline+--with-rtai-dir+ & RTAI path (only for RTAI example) & \\

\hline

\lstinline+--enable-eoe+ & Enable EoE support & yes\\

\lstinline+--enable-cycles+ & Use CPU timestamp counter. Enable this on Intel
architecture to get finer timing calculation. & no\\

\lstinline+--enable-debug-if+ & Create a debug interface for each master & no\\

\lstinline+--enable-debug-ring+ & Create a debug ring to record frames & no\\

\hline

\lstinline+--enable-8139too+ & Build the 8139too driver & yes\\

\lstinline+--with-8139too-kernel+ & 8139too kernel & $\dagger$\\

\lstinline+--enable-e100+ & Build the e100 driver & no\\

\lstinline+--with-e100-kernel+ & e100 kernel & $\dagger$\\

\lstinline+--enable-forcedeth+ & Enable forcedeth driver & no\\

\lstinline+--with-forcedeth-kernel+ & forcedeth kernel & $\dagger$\\

\lstinline+--enable-e1000+ & Enable e1000 driver & no\\

\lstinline+--with-e1000-kernel+ & e1000 kernel & $\dagger$\\

\lstinline+--enable-r8169+ & Enable r8169 driver & no\\

\lstinline+--with-r8169-kernel+ & r8169 kernel & $\dagger$\\

  \end{tabular}
  \vspace{2mm}

\begin{description}

\item[$\dagger$] If this option is not specified, the kernel version to use is
extracted from the Linux kernel sources.

\end{description}

\end{table}

\section{Building the Interface Documentation}
\label{sec:gendoc}

The source code is documented using Doxygen~\cite{doxygen}. To build the HTML
documentation, the Doxygen software has to be installed. The command below
will generate the documents in the subdirectory \textit{doxygen-output}:

\begin{lstlisting}
$ `\textbf{make doc}`
\end{lstlisting}

The interface documentation can be viewed by pointing a browser to
\textit{doxygen-output/html/index.html}.

\section{Installing the Software}

The commands below have to be entered as \textit{root}: The first one will
install the EtherCAT header, init script, sysconfig file and the user space
tool to the prefix path. The second one will install the kernel modules to the
kernel's modules directory. The final \lstinline+depmod+ call is necessary to
include the kernel modules in the \textit{modules.dep} file to make them
available to the \lstinline+modprobe+ command, used in the init script.

\begin{lstlisting}
# `\textbf{make install}`
# `\textbf{make modules\_install}`
# `\textbf{depmod}`
\end{lstlisting}

If the target kernel's modules directory is not under \textit{/lib/modules}, a
different destination directory can be specified with the \lstinline+DESTDIR+
make variable. For example:

\begin{lstlisting}
# `\textbf{make DESTDIR=/vol/nfs/root modules\_install}`
\end{lstlisting}

This command will install the compiled kernel modules to
\textit{/vol/nfs/root/lib/modules}, prepended by the kernel release.

If the EtherCAT master shall be run as a service\footnote{Even if the EtherCAT
master shall not be loaded on system startup, the use of the init script is
recommended for manual (un-)loading.} (see section~\ref{sec:system}), the init
script and the sysconfig file have to be copied (or linked) to the appropriate
locations. The below example is suitable for SUSE Linux. It may vary for other
distributions.

% FIXME relative ln -s?
\begin{lstlisting}
# `\textbf{cd /opt/etherlab}`
# `\textbf{cp etc/sysconfig/ethercat /etc/sysconfig/}`
# `\textbf{ln -s /opt/etherlab/etc/init.d/ethercat /etc/init.d/}`
# `\textbf{insserv ethercat}`
\end{lstlisting}

Now the sysconfig file \texttt{/etc/sysconfig/ethercat} (see
section~\ref{sec:sysconfig}) has to be customized. The minimal customization
is to set the \lstinline+MASTER0_DEVICE+ variable to the MAC address of the
Ethernet device to use (or \lstinline+ff:ff:ff:ff:ff:ff+ to use the first
device offered) and selecting the driver(s) to load via the
\lstinline+DEVICE_MODULES+ variable.

After the basic configuration is done, the master can be started with
the command below:

\begin{lstlisting}
# `\textbf{/etc/init.d/ethercat start}`
\end{lstlisting}

The operation of the master can be observed with the command
\lstinline+ethercat master+ or by viewing the Syslog\index{Syslog}
messages, which should look like the ones below. If EtherCAT slaves are
connected to the master's EtherCAT device, the activity indicators should
begin to flash.

\begin{lstlisting}[numbers=left]
EtherCAT: Master driver `\masterversion`
EtherCAT: 1 master waiting for devices.
EtherCAT Intel(R) PRO/1000 Network Driver - version 6.0.60-k2
Copyright (c) 1999-2005 Intel Corporation.
PCI: Found IRQ 12 for device 0000:01:01.0
PCI: Sharing IRQ 12 with 0000:00:1d.2
PCI: Sharing IRQ 12 with 0000:00:1f.1
EtherCAT: Accepting device 00:0E:0C:DA:A2:20 for master 0.
EtherCAT: Starting master thread.
ec_e1000: ec0: e1000_probe: Intel(R) PRO/1000 Network
          Connection
ec_e1000: ec0: e1000_watchdog_task: NIC Link is Up 100 Mbps
          Full Duplex
EtherCAT: Link state changed to UP.
EtherCAT: 7 slave(s) responding.
EtherCAT: Slave states: PREOP.
EtherCAT: Scanning bus.
EtherCAT: Bus scanning completed in 431 ms.
\end{lstlisting}

\begin{description}

\item[\linenum{1} -- \linenum{2}] The master module is loading, and one master
is initialized.

\item[\linenum{3} -- \linenum{8}] The EtherCAT-capable e1000 driver is
loading. The master accepts the device with the address
\lstinline+00:0E:0C:DA:A2:20+.

\item[\linenum{9} -- \linenum{18}] The master goes to idle phase, starts its
state machine and begins scanning the bus.

\end{description}

%------------------------------------------------------------------------------

\chapter{Application Examples}
\label{chapter:examples}

This chapter will give practical examples of how to use the EtherCAT master
via the realtime interface by writing an application module.

% FIXME remove examples?

%------------------------------------------------------------------------------

\section{Minimal Example}
\label{sec:mini}
\index{Examples!Minimal}

This section will explain the use of the EtherCAT master from a minimal kernel
module. The complete module code is obtainable as a part of the EtherCAT master
code release (see~\cite{etherlab}, file \textit{examples/mini/mini.c}).

The minimal example uses a kernel timer (software interrupt) to generate a
cyclic task. After the timer function is executed, it re-adds itself with a
delay of one \textit{jiffy}\index{jiffies}, which results in a timer frequency
of \textit{HZ}\nomenclature{HZ}{Kernel macro containing the timer interrupt
frequency}.

The module-global variables needed to operate the master can be seen
in listing~\ref{lst:minivar}.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={Minimal
    variables},label=lst:minivar]
  struct timer_list timer;

  ec_master_t *master = NULL;
  ec_domain_t *domain1 = NULL;

  void *r_dig_in, *r_ana_out;

  ec_pdo_reg_t domain1_pdos[] = {
          {"1", Beckhoff_EL1014_Inputs, &r_dig_in},
          {"2", Beckhoff_EL4132_Ouput1, &r_ana_out},
          {}
  };
\end{lstlisting}

\begin{description}
\item[\linenum{1}] A timer object is declared. It is needed to tell the
  kernel to install a timer and to execute a certain function when it
  expires. This is done with a variable of the \textit{timer\_list}
  structure.
\item[\linenum{3} -- \linenum{4}] A pointer is declared that will later
  point to the requested EtherCAT master. Additionally, a pointer to a
  domain object is needed, which will manage process data I/O.
\item[\linenum{6}] The pointers \textit{r\_*}
  will later point to the \underline{r}aw process data values inside
  the domain memory. The addresses they point to will be set during a
  call to \textit{ecrt\_\-master\_\-activate()}, which will create the
  domain memory and configure the mapped process data image.
\item[\linenum{8} -- \linenum{12}] The
  configuration of the mapping of certain Pdos in a domain can easily
  be done with the help of an initialization array of the
  \textit{ec\_pdo\_reg\_t} type, defined as part of the realtime
  interface. Each record must contain the ASCII bus address of the
  slave (see section~\ref{sec:addr}), the slave's vendor ID and
  product code, and the index and subindex of the Pdo to map (these
  four fields can be specified together by using one of the
  defines from the \textit{include/ecdb.h} header). The last field
  has to be the address of the process data pointer, so it can later
  be redirected appropriately. Attention: The initialization array
  must end with an empty record (\textit{\{\}})!
\end{description}

The initialization of the minimal application is done by the ``Minimal init
function'' in listing~\ref{lst:miniinit}.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={Minimal init
    function},label={lst:miniinit}]
  int __init init_mini_module(void)
  {
          if (!(master = ecrt_request_master(0))) {
                  goto out_return;
          }

          if (!(domain1 = ecrt_master_create_domain(master))) {
                  goto out_release_master;
          }

          if (ecrt_domain_register_pdo_list(domain1,
                                            domain1_pdos)) {
                  goto out_release_master;
          }

          if (ecrt_master_activate(master)) {
                  goto out_release_master;
          }

          ecrt_master_prepare(master);

          init_timer(&timer);
          timer.function = run;
          timer.expires = jiffies + 10;
          add_timer(&timer);

          return 0;

        out_release_master:
          ecrt_release_master(master);
        out_return:
          return -1;
  }
\end{lstlisting}

\begin{description}
\item[\linenum{3}] The first EtherCAT master (index 0) is requested. On
  success, the \textit{ecrt\_\-request\_\-master()} function returns a
  pointer to the reserved master, which is passed as an object to the
  following function calls. On failure, the function returns
  \textit{NULL}.
\item[\linenum{7}] In order to exchange process
  data, a domain object has to be created. The
  \textit{ecrt\_\-master\_\-create\_domain()} function also returns a
  pointer to the created domain, or \textit{NULL} in case of an error.
\item[\linenum{11}] Registering the domain
  Pdos with an initialization array takes a single function call.
  Alternatively, the data fields could be registered with individual
  calls to \textit{ecrt\_domain\_register\_pdo()}.
\item[\linenum{16}] After the configuration of the
  process data mapping, the master can be activated for cyclic
  operation. This will configure all slaves and bring them into
  OP state.
\item[\linenum{20}] This call is needed to avoid
  a case differentiation in cyclic operation: The first operation in
  cyclic mode is a receive call. Because there is nothing to receive
  during the first cycle, an \textit{if} statement would otherwise be
  needed to avoid a warning. A call to
  \textit{ecrt\_master\_prepare()} sends a first process data exchange
  datagram, so that the first receive call will not fail.
\item[\linenum{22} -- \linenum{25}] The
  master is now ready for cyclic operation. The kernel timer that
  cyclically executes the \textit{run()} function is initialized and
  started.
\end{description}

The code of a cleanup function for the minimal module can be seen in
listing~\ref{lst:miniclean}.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={Minimal cleanup
    function},label={lst:miniclean}]
  void __exit cleanup_mini_module(void)
  {
          del_timer_sync(&timer);
          ecrt_master_deactivate(master);
          ecrt_release_master(master);
  }
\end{lstlisting}

\begin{description}
\item[\linenum{3}] To clean up the module, it is
  necessary to stop the cyclic processing. This is done by a call to
  \textit{del\_timer\_sync()}, which safely removes a queued timer
  object. It is assured that no cyclic work will be done after this
  call returns.
\item[\linenum{4}] This call deactivates the
  master, which results in all slaves being brought to their INIT
  state again.
\item[\linenum{5}] This call releases the master,
  removes any existing configuration and silently starts the idle
  mode. The value of the master pointer is invalid after this call and
  the module can be safely unloaded.
\end{description}

The final part of the minimal module is that for the cyclic work. Its
coding can be seen in listing~\ref{lst:minirun}.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={Minimal cyclic
    function},label={lst:minirun}]
  void run(unsigned long data)
  {
          static uint8_t dig_in_0;

          ecrt_master_receive(master);
          ecrt_domain_process(domain1);

          dig_in_0 = EC_READ_BIT(r_dig_in, 0);
          EC_WRITE_S16(r_ana_out, dig_in_0 * 0x3FFF);

          ecrt_master_run(master);
          ecrt_master_send(master);

          timer.expires += 1; // frequency = HZ
          add_timer(&timer);
  }
\end{lstlisting}

\begin{description}

\item[\linenum{5}] The cyclic processing starts with receiving datagrams, that
were sent in the last cycle. The frames containing these datagrams have to be
received by the network interface card prior to this call.

\item[\linenum{6}] The process data of domain 1 has been automatically copied
into domain memory during datagram reception. This call checks the working
counter for changes and re-queues the domain's datagram for sending.

\item[\linenum{8}] This is an example for reading out a bit-oriented process
data value (i.~e. bit 0) via the \textit{EC\_READ\_BIT()} macro. See
section~\ref{sec:macros} for more information about those macros.

\item[\linenum{9}] This line shows how to write a signed, 16-bit process data
value. In this case, the slave is able to output voltages of
\unit{-10--+10}{\volt} with a resolution of \unit{16}{bit}.  This write command
outputs either \unit{0}{\volt} or \unit{+5}{\volt}, depending on the value of
\textit{dig\_in\_0}.

\item[\linenum{11}] This call runs the master's operation state machine (see
section~\ref{sec:fsm-op}). A single state is processed, and datagrams are
queued. Mainly, the bus is observed: The bus state is determined, and slaves
that lost their configuration are reconfigured.

\item[\linenum{12}] This method sends all queued datagrams, in this case the
domain's datagram and one from the master state machine. In the best case, all
datagrams fit into one frame.

\item[\linenum{14} -- \linenum{15}] Kernel timers are implemented as
``one-shot'' timers, so they have to be re-added after each execution. The time
of the next execution is specified in \textit{jiffies} and will happen at the
time of the next system timer interrupt. This results in the \textit{run()}
function being executed with a frequency of \textit{HZ}.

\end{description}
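
The two process data accesses of the cyclic function can be tried out in user
space with the following sketch. The macro \textit{DEMO\_READ\_BIT()} and the
function \textit{demo\_write\_s16()} are simplified, hypothetical stand-ins
for the macros of the realtime interface (see section~\ref{sec:macros}) and
assume little-endian byte order in the process data image. The conversion in
\textit{demo\_raw\_to\_volts()} assumes a linear mapping of the signed 16-bit
range to \unit{-10--+10}{\volt}, which is consistent with the description
above, but is not taken from a data sheet.

\begin{lstlisting}[gobble=2,language=C,caption={Sketch: process data access
    on plain buffers},label={lst:sketch-pdaccess}]
  #include <stdint.h>

  /* Read bit POS of the byte at DATA. */
  #define DEMO_READ_BIT(DATA, POS) \
          ((*((const uint8_t *) (DATA)) >> (POS)) & 0x01)

  /* Store a signed 16-bit value, low byte first. */
  static void demo_write_s16(void *data, int16_t value)
  {
          uint8_t *p = data;
          uint16_t u = (uint16_t) value;

          p[0] = (uint8_t) (u & 0xFF);
          p[1] = (uint8_t) (u >> 8);
  }

  /* Raw output value to volts; 0x7FFF is taken as +10 V. */
  double demo_raw_to_volts(int16_t raw)
  {
          return raw * 10.0 / 32767.0;
  }

  /* The logic of one cycle, applied to plain buffers. */
  void demo_cycle(const uint8_t *dig_in, uint8_t *ana_out)
  {
          uint8_t dig_in_0 = DEMO_READ_BIT(dig_in, 0);

          demo_write_s16(ana_out, (int16_t) (dig_in_0 * 0x3FFF));
  }
\end{lstlisting}

With bit 0 of the input byte set, the output buffer receives the bytes
\textit{0xFF} and \textit{0x3F}, and \textit{demo\_raw\_to\_volts(0x3FFF)}
evaluates to slightly less than \unit{5}{\volt}.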

%------------------------------------------------------------------------------

\section{RTAI Example}
\label{sec:rtai}
\index{Examples!RTAI}

The whole code can be seen in the EtherCAT master code release
(see~\cite{etherlab}, file \textit{examples/rtai/rtai\_sample.c}).

Listing~\ref{lst:rtaivar} shows the defines and global variables
needed for a minimal RTAI module with EtherCAT processing.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={RTAI task
    declaration},label={lst:rtaivar}]
  #define FREQUENCY 10000
  #define TIMERTICKS (1000000000 / FREQUENCY)

  RT_TASK task;
\end{lstlisting}

\begin{description}
\item[\linenum{1} -- \linenum{2}] RTAI
  takes the cycle period as nanoseconds, so the easiest way is to
  define a frequency and convert it to a cycle time in nanoseconds.
\item[\linenum{4}] The \textit{task} variable
  later contains information about the running RTAI task.
\end{description}

Listing~\ref{lst:rtaiinit} shows the module init function for the RTAI
module. Most lines are the same as in listing~\ref{lst:miniinit}; the
differences arise when starting the cyclic code.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={RTAI module init
    function},label={lst:rtaiinit}]
  int __init init_mod(void)
  {
          RTIME requested_ticks, tick_period, now;

          if (!(master = ecrt_request_master(0))) {
                  goto out_return;
          }

          if (!(domain1 = ecrt_master_create_domain(master))) {
                  goto out_release_master;
          }

          if (ecrt_domain_register_pdo_list(domain1,
                                            domain1_pdos)) {
                  goto out_release_master;
          }

          if (ecrt_master_activate(master)) {
                  goto out_release_master;
          }

          ecrt_master_prepare(master);

          requested_ticks = nano2count(TIMERTICKS);
          tick_period = start_rt_timer(requested_ticks);

          if (rt_task_init(&task, run, 0, 2000, 0, 1, NULL)) {
                  goto out_stop_timer;
          }

          now = rt_get_time();
          if (rt_task_make_periodic(&task, now + tick_period,
                                    tick_period)) {
                  goto out_stop_task;
          }

          return 0;

      out_stop_task:
          rt_task_delete(&task);
      out_stop_timer:
          stop_rt_timer();
      out_deactivate:
          ecrt_master_deactivate(master);
      out_release_master:
          ecrt_release_master(master);
      out_return:
          return -1;
  }
\end{lstlisting}

\begin{description}
\item[\linenum{24} -- \linenum{25}] The
  nanoseconds are converted to RTAI timer ticks and an RTAI timer is
  started.  \textit{tick\_period} will be the ``real'' number of ticks
  used for the timer period (which can differ from the requested
  one).
\item[\linenum{27}] The RTAI task is initialized
  by specifying the cyclic function, the parameter to hand over, the
  stack size, the priority, a flag that tells whether the function
  will use floating point operations, and a signal handler.
\item[\linenum{32}] The task is made periodic by
  specifying a start time and a period.
\end{description}
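
The error paths of the init function release resources in reverse order of
acquisition by jumping to chained labels. The following user-space sketch
shows that a failure at any step leaves nothing allocated. All
\textit{demo\_} names and the three mock resources are hypothetical.

\begin{lstlisting}[gobble=2,language=C,caption={Sketch: releasing resources
    in reverse order},label={lst:sketch-unwind}]
  /* Hypothetical resources, acquired in this order. */
  enum { DEMO_MASTER = 1, DEMO_TIMER = 2, DEMO_TASK = 4 };

  static unsigned int demo_held; /* mask of held resources */

  static int demo_get(unsigned int res, unsigned int fail_at)
  {
          if (res == fail_at)
                  return -1;
          demo_held |= res;
          return 0;
  }

  static void demo_put(unsigned int res)
  {
          demo_held &= ~res;
  }

  /* fail_at selects the step that fails (0 for none). */
  int demo_init(unsigned int fail_at)
  {
          if (demo_get(DEMO_MASTER, fail_at))
                  goto out_return;
          if (demo_get(DEMO_TIMER, fail_at))
                  goto out_release_master;
          if (demo_get(DEMO_TASK, fail_at))
                  goto out_stop_timer;
          return 0;

  out_stop_timer:
          demo_put(DEMO_TIMER);
  out_release_master:
          demo_put(DEMO_MASTER);
  out_return:
          return -1;
  }
\end{lstlisting}

Calling \textit{demo\_init(DEMO\_TASK)} fails at the third step and returns
with \textit{demo\_held} equal to zero, while \textit{demo\_init(0)} succeeds
and leaves all three resources held.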

The cleanup function of the RTAI module in listing~\ref{lst:rtaiclean}
is nearly as simple as that of the minimal module.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={RTAI module
    cleanup function},label={lst:rtaiclean}]
  void __exit cleanup_mod(void)
  {
          rt_task_delete(&task);
          stop_rt_timer();
          ecrt_master_deactivate(master);
          ecrt_release_master(master);
  }
\end{lstlisting}

\begin{description}
\item[\linenum{3}] The RTAI task will be stopped
  and deleted.
\item[\linenum{4}] After that, the RTAI timer can
  be stopped.
\end{description}

The rest is the same as for the minimal module.

It is worth mentioning that the cyclic function of the RTAI module
(listing~\ref{lst:rtairun}) has a slightly different architecture. The
function is not called once per cycle and run until it returns, but
contains an infinite loop that is placed in a waiting state for the
rest of each cycle.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={RTAI module cyclic
    function},label={lst:rtairun}]
  void run(long data)
  {
          while (1) {
                  ecrt_master_receive(master);
                  ecrt_domain_process(domain1);

                  k_pos = EC_READ_U32(r_ssi_input);

                  ecrt_master_run(master);
                  ecrt_master_send(master);

                  rt_task_wait_period();
          }
  }
\end{lstlisting}

\begin{description}
\item[\linenum{3}] The \textit{while (1)} loop
  executes for the lifetime of the RTAI task.
\item[\linenum{12}] The
  \textit{rt\_task\_wait\_period()} function sets the process into a
  sleeping state until the beginning of the next cycle. It also
  checks, if the cyclic function has to be terminated.
\end{description}
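
The structure of such a loop can be tried outside of RTAI with a user-space
analogue. The sketch below is not RTAI code: It uses the POSIX function
\textit{clock\_nanosleep()} with an absolute wake-up time in place of
\textit{rt\_task\_wait\_period()}, and the names \textit{demo\_add\_ns()} and
\textit{demo\_run\_cycles()} are made up for illustration. Advancing an
absolute deadline by one period per cycle keeps the cycle time independent of
the duration of the cyclic work.

\begin{lstlisting}[gobble=2,language=C,caption={Sketch: periodic loop with
    POSIX timers},label={lst:sketch-periodic}]
  #define _POSIX_C_SOURCE 200112L
  #include <time.h>

  #define DEMO_NSEC_PER_SEC 1000000000L

  /* Advance t by ns nanoseconds (0 <= ns < 1 s). */
  void demo_add_ns(struct timespec *t, long ns)
  {
          t->tv_nsec += ns;
          if (t->tv_nsec >= DEMO_NSEC_PER_SEC) {
                  t->tv_nsec -= DEMO_NSEC_PER_SEC;
                  t->tv_sec++;
          }
  }

  /* Run a fixed number of cycles; returns cycles executed. */
  int demo_run_cycles(int cycles, long period_ns)
  {
          struct timespec next;
          int done = 0;

          clock_gettime(CLOCK_MONOTONIC, &next);
          while (done < cycles) {
                  /* the cyclic work would go here */
                  done++;

                  /* sleep until the start of the next cycle */
                  demo_add_ns(&next, period_ns);
                  clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                  &next, NULL);
          }
          return done;
  }
\end{lstlisting}

Unlike the RTAI task, this loop ends after a fixed number of cycles, and an
ordinary user-space process does not get the timing guarantees of a realtime
task.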

%------------------------------------------------------------------------------

\section{Concurrency Example}
\label{sec:concurrency}
\index{Examples!Concurrency}

As mentioned before, there can be concurrent access to the EtherCAT master. The
application and an EoE\index{EoE} process can compete for master access, for
example. In this case, the module has to provide the locking mechanism, because
it depends on the module's architecture which lock has to be used. The module
makes this locking mechanism available to the master through the master's
locking callbacks.

In case of RTAI, the lock can be an RTAI semaphore, as shown in
listing~\ref{lst:convar}. A normal Linux semaphore would not be appropriate,
because it could not block the RTAI task due to RTAI running in a higher domain
than the Linux kernel (see~\cite{rtai}).

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={RTAI semaphore for
    concurrent access},label={lst:convar}]
  SEM master_sem;
\end{lstlisting}

The module has to implement the two callbacks for requesting and
releasing the master lock. Example code can be seen in
listing~\ref{lst:conlock}.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={RTAI locking
    callbacks for concurrent access},label={lst:conlock}]
  int request_lock(void *data)
  {
          rt_sem_wait(&master_sem);
          return 0;
  }

  void release_lock(void *data)
  {
          rt_sem_signal(&master_sem);
  }
\end{lstlisting}

\begin{description}
\item[\linenum{1}] The \textit{request\_lock()}
  function has a data parameter. The master always passes the value
  that was specified when registering the callback function. This can
  be used for passing the master pointer. Notice that it has an
  integer return value (see line 4).
\item[\linenum{3}] The call to
  \textit{rt\_sem\_wait()} either returns at once, if the semaphore
  was free, or blocks until the semaphore is freed again. In any case,
  the semaphore is finally reserved for the process calling the
  request function.
\item[\linenum{4}] When the lock was requested
  successfully, the function should return 0. The module can prohibit
  requesting the lock by returning non-zero (see paragraph ``Tuning
  the jitter'' below).
\item[\linenum{7}] The \textit{release\_lock()}
  function gets the same argument passed, but has a void return value,
  because it always succeeds.
\item[\linenum{9}] The \textit{rt\_sem\_signal()}
  function frees the semaphore that was previously reserved with
  \textit{rt\_sem\_wait()}.
\end{description}

In the module's init function, the semaphore must be initialized, and
the callbacks must be passed to the EtherCAT master:

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={Module init
    function for concurrent access},label={lst:coninit}]
  int __init init_mod(void)
  {
          RTIME tick_period, requested_ticks, now;

          rt_sem_init(&master_sem, 1);

          if (!(master = ecrt_request_master(0))) {
                  goto out_return;
          }

          ecrt_master_callbacks(master, request_lock,
                                release_lock, NULL);
          // ...
\end{lstlisting}

\begin{description}
\item[\linenum{5}] The call to
  \textit{rt\_sem\_init()} initializes the semaphore and sets its
  value to 1, meaning that only one process can reserve the semaphore
  without blocking.
\item[\linenum{11}] The callbacks are passed to
  the master with a call to \textit{ecrt\_master\_callbacks()}. The
  last parameter is the argument that the master should pass with
  each call to a callback function. Here it is not used and set to
  \textit{NULL}.
\end{description}
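
How the registered callbacks might be used can be pictured with the following
user-space sketch. All names prefixed with \textit{demo\_} are hypothetical
and do not belong to the realtime interface. A plain flag takes the place of
the RTAI semaphore, and instead of blocking, the example request callback
simply refuses a lock that is already taken. The sketch only shows that the
data argument given at registration is handed back to every callback, and
that a non-zero return value of the request callback makes the caller skip
its access.

\begin{lstlisting}[gobble=2,language=C,caption={Sketch: registering and
    calling locking callbacks},label={lst:sketch-callbacks}]
  typedef struct {
          int (*request_cb)(void *);
          void (*release_cb)(void *);
          void *cb_data;
          unsigned int accesses; /* successful accesses */
  } demo_master_t;

  /* Store the callbacks and their common argument. */
  void demo_master_callbacks(demo_master_t *m,
                             int (*request_cb)(void *),
                             void (*release_cb)(void *),
                             void *cb_data)
  {
          m->request_cb = request_cb;
          m->release_cb = release_cb;
          m->cb_data = cb_data;
  }

  /* A process of lower priority tries to access the master.
   * Returns 0 on success, -1 if the lock was denied. */
  int demo_foreign_access(demo_master_t *m)
  {
          if (m->request_cb(m->cb_data))
                  return -1; /* denied: try again next time */

          m->accesses++;     /* master access would go here */
          m->release_cb(m->cb_data);
          return 0;
  }

  /* Example callbacks: data points to a free/taken flag. */
  int demo_request(void *data)
  {
          int *taken = data;

          if (*taken)
                  return 1;
          *taken = 1;
          return 0;
  }

  void demo_release(void *data)
  {
          *(int *) data = 0;
  }
\end{lstlisting}

After registering \textit{demo\_request()} and \textit{demo\_release()} with
the address of a flag that is zero, a call to
\textit{demo\_foreign\_access()} succeeds and leaves the flag cleared. If the
flag is already set, the call returns $-1$ without touching the master.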

Because the cyclic function is only one of several competitors for
master access, it has to request the lock like any other process.
There is no need to use the callbacks (which are meant for processes
of lower priority), so it can access the semaphore directly:

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={RTAI cyclic
    function for concurrent access},label={lst:conrun}]
  void run(long data)
  {
          while (1) {
                  rt_sem_wait(&master_sem);

                  ecrt_master_receive(master);
                  ecrt_domain_process(domain1);

                  k_pos = EC_READ_U32(r_ssi_input);

                  ecrt_master_run(master);
                  ecrt_master_send(master);

                  rt_sem_signal(&master_sem);
                  rt_task_wait_period();
          }
  }
\end{lstlisting}

\begin{description}

\item[\linenum{4}] Every access to the master has to be preceded by a call to
\textit{rt\_sem\_wait()}, because another instance might currently access the
master.

\item[\linenum{14}] When cyclic processing has finished, the semaphore has to
be freed again, so that other processes have the possibility to access the
master.

\end{description}

A little change has to be made to the cleanup function in case of
concurrent master access.

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={RTAI module
    cleanup function for concurrent access},label={lst:conclean}]
  void __exit cleanup_mod(void)
  {
          rt_task_delete(&task);
          stop_rt_timer();
          ecrt_master_deactivate(master);
          ecrt_release_master(master);
          rt_sem_delete(&master_sem);
  }
\end{lstlisting}

\begin{description}
\item[\linenum{7}] Upon module cleanup, the
  semaphore has to be deleted, so that memory can be freed.
\end{description}

\paragraph{Tuning the Jitter}
\index{Jitter}

Concurrent access leads to higher jitter for the application task, because
there are situations in which the task has to wait for a process of lower
priority to finish accessing the master.  In most cases this is acceptable,
because a master access cycle (receive/process/send) only takes
\unit{10--20}{\micro\second} on recent systems, which would be the maximum
additional jitter. However, some applications demand minimal jitter. For this
reason, master access can be prohibited by the application: If another process
wants to access the master too close to the beginning of the next application
cycle, the module can refuse to let the lock be taken. In this case, the
request callback has to return a non-zero value, meaning that the lock has
not been taken. The foreign process must abort its master access and try again
next time.

This measure helps to significantly reduce the jitter produced by concurrent
master access. Below are excerpts of example code:

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={Variables for
    jitter reduction},label={lst:redvar}]
  #define FREQUENCY 10000 // RTAI task frequency in Hz
  // ...
  cycles_t t_last_cycle = 0;
  const cycles_t t_critical = cpu_khz * 1000 / FREQUENCY
                              - cpu_khz * 30 / 1000;
\end{lstlisting}

\begin{description}

\item[\linenum{3}] The variable \textit{t\_last\_cycle} holds the timer ticks
at the beginning of the last realtime cycle.

\item[\linenum{4}] \textit{t\_critical} contains the number of ticks, that may
have passed since the beginning of the last cycle, until there is no more
foreign access possible. It is calculated by subtracting the ticks for
\unit{30}{\micro\second} from the ticks for a complete cycle.

\end{description}
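
As a worked example of this calculation, the sketch below evaluates the same
expression in user space. The function name
\textit{demo\_critical\_ticks()} and the CPU clock of
\unit{2}{\giga\hertz} used in the text after the listing are assumptions for
illustration. 64-bit arithmetic is used, so that the intermediate products
cannot overflow.

\begin{lstlisting}[gobble=2,language=C,caption={Sketch: calculating the
    critical tick count},label={lst:sketch-critical}]
  #include <stdint.h>

  /* Ticks that may pass after the start of a cycle before
   * foreign access is refused.
   * cpu_khz:    CPU clock in kHz (ticks per millisecond)
   * frequency:  task frequency in Hz
   * inhibit_us: guard time before the next cycle in us */
  uint64_t demo_critical_ticks(uint64_t cpu_khz,
                               uint64_t frequency,
                               uint64_t inhibit_us)
  {
          uint64_t per_cycle = cpu_khz * 1000 / frequency;
          uint64_t inhibit = cpu_khz * inhibit_us / 1000;

          return per_cycle - inhibit;
  }
\end{lstlisting}

For \textit{cpu\_khz} $= 2000000$, a task frequency of 10000 and a guard time
of \unit{30}{\micro\second}, one cycle lasts 200000 ticks, the guard time
corresponds to 60000 ticks, and the function returns 140000.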

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={Cyclic function
    with reduced jitter},label={lst:redrun}]
  void run(long data)
  {
          while (1) {
                  t_last_cycle = get_cycles();
                  rt_sem_wait(&master_sem);
                  // ...
\end{lstlisting}

\begin{description}
\item[\linenum{4}] The ticks of the beginning of
  the current realtime cycle are taken before reserving the semaphore.
\end{description}

\begin{lstlisting}[gobble=2,language=C,numbers=left,caption={Request callback
    for reduced jitter},label={lst:redreq}]
  int request_lock(void *data)
  {
          // too close to the next RT cycle: deny access.
          if (get_cycles() - t_last_cycle > t_critical)
                  return -1;

          // allow access
          rt_sem_wait(&master_sem);
          return 0;
  }
\end{lstlisting}

\begin{description}

\item[\linenum{4}] If the time of request is too close to the next realtime
cycle (here: \unit{<30}{\micro\second} before the estimated beginning), the
locking is denied. The requesting process must abort its cycle.

\end{description}
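
The decision itself only depends on three tick counts and can be checked
without a realtime system. In the sketch below,
\textit{demo\_lock\_allowed()} is a hypothetical helper that mirrors the
comparison made in the request callback. The tick values mentioned after the
listing are made-up example numbers.

\begin{lstlisting}[gobble=2,language=C,caption={Sketch: deciding whether the
    lock may be taken},label={lst:sketch-allowed}]
  #include <stdint.h>

  /* Returns 1 if a foreign process may take the lock,
   * 0 if the request is too close to the next cycle. */
  int demo_lock_allowed(uint64_t now, uint64_t t_last_cycle,
                        uint64_t t_critical)
  {
          /* unsigned subtraction stays correct if the
           * counter wraps between the two readings */
          return now - t_last_cycle <= t_critical;
  }
\end{lstlisting}

With a critical tick count of 140000 and a cycle that started at tick
1000000, a request at tick 1100000 is allowed, while a request at tick
1150000 is denied.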

%------------------------------------------------------------------------------

\begin{thebibliography}{99}

\bibitem{etherlab} Ingenieurgemeinschaft IgH: EtherLab -- Open Source Toolkit
for rapid realtime code generation under Linux with Simulink/RTW and EtherCAT
technology. \url{http://etherlab.org/en}, 2008.

\bibitem{dlspec} IEC 61158-4-12: Data-link Protocol Specification.
International Electrotechnical Commission (IEC), 2005.

\bibitem{alspec} IEC 61158-6-12: Application Layer Protocol Specification.
International Electrotechnical Commission (IEC), 2005.

\bibitem{gpl} GNU General Public License, Version 2.
\url{http://www.gnu.org/licenses/gpl.txt}. August~9, 2006.

\bibitem{lsb} Linux Standard Base.
\url{http://www.linuxfoundation.org/en/LSB}.  August~9, 2006.

\bibitem{wireshark} Wireshark. \url{http://www.wireshark.org}. 2008.

\bibitem{automata} {\it Hopcroft, J.~E. / Ullman, J.~D.}: Introduction to
Automata Theory, Languages and Computation. Addison-Wesley, Reading,
Mass.~1979.

\bibitem{fsmmis} {\it Wagner, F. / Wolstenholme, P.}: State machine
misunderstandings. In: IEE journal ``Computing and Control Engineering'',
2004.

\bibitem{rtai} RTAI. The RealTime Application Interface for Linux from DIAPM.
\url{http://www.rtai.org}, 2006.

\bibitem{doxygen} Doxygen. Source code documentation generator tool.
\url{http://www.stack.nl/~dimitri/doxygen}, 2008.

\end{thebibliography}

\printnomenclature
\addcontentsline{toc}{chapter}{\nomname}
\markleft{\nomname}

\printindex
\markleft{Index}

%------------------------------------------------------------------------------

\end{document}

%------------------------------------------------------------------------------