documentation/ethercat_doc.tex
branchstable-1.4
changeset 1674 201b4ce689e5
parent 1673 c6f214c9986d
child 1686 e206f4485f60
1673:c6f214c9986d 1674:201b4ce689e5
  2254 Although EtherCAT's timing is highly deterministic and therefore timing issues
  2254 Although EtherCAT's timing is highly deterministic and therefore timing issues
  2255 are rare, there are a few aspects that can (and should) be dealt with.
  2255 are rare, there are a few aspects that can (and should) be dealt with.
  2256 
  2256 
  2257 %------------------------------------------------------------------------------
  2257 %------------------------------------------------------------------------------
  2258 
  2258 
  2259 \subsection{Application Interface Profiling}
  2259 \section{Application Interface Profiling}
  2260 \label{sec:timing-profile}
  2260 \label{sec:profiling}
  2261 \index{Profiling}
  2261 \index{Profiling}
  2262 % FIXME
       
  2263 
  2262 
  2264 One of the most important timing aspects is the execution time of the
  2263 One of the most important timing aspects is the execution time of the
  2265 application interface functions that are called in cyclic context. These
  2264 application interface functions that are called in cyclic context. These
  2266 functions make up an important part of the overall timing of the application.
  2265 functions make up an important part of the overall timing of the application.
  2267 To measure the timing of the functions, the following code was used:
  2266 To measure the timing of the functions, the cyclic code below was used:
  2268 
  2267 
  2269 \begin{lstlisting}[gobble=2,language=C]
  2268 \begin{lstlisting}[language=C]
  2270   c0 = get_cycles();
  2269 c0 = get_cycles();
  2271   ecrt_master_receive(master);
  2270 ecrt_master_receive(master);
  2272   c1 = get_cycles();
  2271 c1 = get_cycles();
  2273   ecrt_domain_process(domain1);
  2272 ecrt_domain_process(domain1);
  2274   c2 = get_cycles();
  2273 c2 = get_cycles();
  2275   ecrt_master_run(master);
  2274 ecrt_domain_queue(domain1);
  2276   c3 = get_cycles();
  2275 c3 = get_cycles();
  2277   ecrt_master_send(master);
  2276 ecrt_master_send(master);
  2278   c4 = get_cycles();
  2277 c4 = get_cycles();
  2279 \end{lstlisting}
  2278 \end{lstlisting}
  2280 
  2279 
  2281 Between each call of an interface function, the CPU timestamp counter is read.
  2280 Between each call of an interface function, the CPU timestamp counter is read.
  2282 The counter differences are converted to \micro\second\ with help of the
  2281 The counter differences are converted to \micro\second\ via the
  2283 \lstinline+cpu_khz+ variable, that contains the number of increments per
  2282 \lstinline+cpu_khz+ variable, that contains the number of counts per
  2284 \milli\second.
  2283 \milli\second\ for the IA32 architecture's timestamp counter.
  2285 
  2284 
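The conversion described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name \lstinline+cycles_to_us()+ and its parameter \lstinline+counts_per_ms+ are hypothetical and merely stand in for the role of \lstinline+cpu_khz+; they are not part of the master's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the EtherCAT master): converts the
 * difference of two timestamp counter readings to microseconds.
 * counts_per_ms plays the role of cpu_khz, i.e. the number of counter
 * increments per millisecond. */
double cycles_to_us(uint64_t c_start, uint64_t c_end, uint64_t counts_per_ms)
{
    assert(counts_per_ms > 0);
    /* counts / (counts per ms) yields ms; the factor 1000 yields us. */
    return (double) (c_end - c_start) * 1000.0 / (double) counts_per_ms;
}
```

With an assumed \lstinline+counts_per_ms+ of 2000000 (a \unit{2.0}{\giga\hertz} counter), a difference of 12260 counts corresponds to \unit{6.13}{\micro\second}. The sketch uses floating point for brevity and is therefore meant for evaluating recorded counter values outside of the realtime context.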
  2286 For the actual measuring, a system with a \unit{2.0}{\giga\hertz} CPU was used,
  2285 For the actual measurement, a system with a \unit{2.0}{\giga\hertz} CPU was
  2287 that ran the above code in an RTAI thread with a period of
  2286 used, that ran the above code in an RTAI thread with a period of
  2288 \unit{100}{\micro\second}. The measuring was repeated $n = 100$ times and the
  2287 \unit{1}{\milli\second}. The measurement was repeated $n = 10000$ times and
  2289 results were averaged. These can be seen in table~\ref{tab:profile}.
  2288 the results were averaged. These can be seen in table~\ref{tab:profile}.
  2290 
  2289 
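The text does not state how the averaging was implemented. One possible approach, shown here purely as a sketch, is to accumulate integer sums of the counter differences in the cyclic code and to evaluate mean and variance afterwards. All names (\lstinline+profile_acc_t+, \lstinline+profile_add()+, \lstinline+profile_mean()+, \lstinline+profile_variance()+) are hypothetical, and the variance uses the population form (division by $n$), which is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accumulator for n measured counter differences. */
typedef struct {
    uint64_t n;      /* number of samples */
    uint64_t sum;    /* sum of the counter differences */
    uint64_t sum_sq; /* sum of the squared counter differences */
} profile_acc_t;

/* Integer-only update, cheap enough to be called once per cycle. */
void profile_add(profile_acc_t *acc, uint64_t delta)
{
    acc->n++;
    acc->sum += delta;
    acc->sum_sq += delta * delta;
}

/* Mean of the recorded differences (evaluated after the measurement). */
double profile_mean(const profile_acc_t *acc)
{
    assert(acc->n > 0);
    return (double) acc->sum / (double) acc->n;
}

/* Population variance E[x^2] - E[x]^2; the standard deviation is its
 * square root. */
double profile_variance(const profile_acc_t *acc)
{
    double mean = profile_mean(acc);
    return (double) acc->sum_sq / (double) acc->n - mean * mean;
}
```

Keeping only integer arithmetic in \lstinline+profile_add()+ is a deliberate choice of this sketch, since floating point operations are usually avoided in kernel context. The results are in counter increments and still have to be converted to \micro\second.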
  2291 \begin{table}[htpb]
  2290 \begin{table}[htpb]
  2292   \centering
  2291   \centering
  2293   \caption{Profiling of an Application Cycle on a \unit{2.0}{\giga\hertz}
  2292   \caption{Application Cycle on a \unit{2.0}{\giga\hertz} Processor}
  2294   Processor}
       
  2295   \label{tab:profile}
  2293   \label{tab:profile}
  2296   \vspace{2mm}
  2294   \vspace{2mm}
  2297   \begin{tabular}{l|r|r}
  2295   \begin{tabular}{l|r|r}
  2298     Element & Mean Duration [\second] & Standard Deviancy [\micro\second] \\
  2296 
       
  2297     Function &
       
  2298     $\mu(\Delta t)$ [\micro\second] &
       
  2299     $\sigma(\Delta t)$ [\micro\second] \\
  2299     \hline
  2300     \hline
  2300     \textit{ecrt\_master\_receive()} & 8.04 & 0.48\\
  2301 
  2301     \textit{ecrt\_domain\_process()} & 0.14 & 0.03\\
  2302     \lstinline+ecrt_master_receive()+ & 6.13 & 1.11\\
  2302     \textit{ecrt\_master\_run()} & 0.29 & 0.12\\
  2303 
  2303     \textit{ecrt\_master\_send()} & 2.18 & 0.17\\ \hline
  2304     \lstinline+ecrt_domain_process()+ & $<$ 0.01 & 0.07\\
  2304     Complete Cycle & 10.65 & 0.69\\ \hline
  2305 
       
  2306     \lstinline+ecrt_domain_queue()+ & $<$ 0.01 & 0.17\\
       
  2307 
       
  2308     \lstinline+ecrt_master_send()+ & 1.15 & 0.65\\ \hline
       
  2309 
       
  2310     Complete Cycle & 7.28 & 1.31\\ \hline
       
  2311 
  2305   \end{tabular}
  2312   \end{tabular}
  2306 \end{table}
  2313 \end{table}
  2307 
  2314 
  2308 It is obvious that the functions accessing hardware make up the
  2315 It is obvious that the functions accessing hardware make up the lion's share.
  2309 lion's share. The \textit{ecrt\_master\_receive()} executes the ISR of
  2316 The \lstinline+ecrt_master_receive()+ executes the ISR of the Ethernet device
  2310 the Ethernet device, analyzes datagrams and copies their contents into
  2317 driver, dissects the received frame and copies the datagram contents into the
  2311 the memory of the datagram objects. The \textit{ecrt\_master\_send()}
  2318 memory of the corresponding datagram objects. The \lstinline+ecrt_master_send()+
  2312 assembles a frame out of different datagrams and copies it to the
  2319 function assembles a frame from different datagrams and copies it to the
  2313 hardware buffers. Interestingly, this makes up only a quarter of the
  2320 hardware buffers. The functions that only operate on the master's internal data
  2314 receiving time.
  2321 structures are very fast ($\Delta t < \unit{1}{\micro\second}$).
  2315 
  2322 
  2316 The functions that only operate on the master's internal data structures are
  2323 Since a realtime cycle takes about \unit{10}{\micro\second}, the resulting
  2317 very fast ($\Delta t < \unit{1}{\micro\second}$). Interestingly the runtime of
  2324 theoretical frequency could be up to $1 / \unit{10}{\micro\second} =
  2318 \textit{ec\_domain\_process()} has a small standard deviancy relative to the
  2325 \unit{100}{\kilo\hertz}$. For two reasons, this frequency remains
  2319 mean value, while this ratio is about twice as big for
  2326 theoretical:
  2320 \textit{ec\_master\_run()}: This probably results from the latter function
       
  2321 having to execute code depending on the current state and the different state
       
  2322 functions are more or less complex.
       
  2323 
       
  2324 Since a realtime cycle takes about \unit{10}{\micro\second}, the theoretical
       
  2325 frequency can be up to \unit{100}{\kilo\hertz}. For two reasons, this frequency
       
  2326 remains theoretical:
       
  2327 
  2327 
  2328 \begin{enumerate}
  2328 \begin{enumerate}
  2329 
  2329 
  2330 \item The processor must still be able to run the operating system between the
  2330 \item The processor must still be able to run the operating system between the
  2331 realtime cycles.
  2331 realtime cycles.
  2336 
  2336 
  2337 \end{enumerate}
  2337 \end{enumerate}
  2338 
  2338 
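The arithmetic behind the theoretical frequency, and the share of CPU time a cycle consumes at a given period, can be written down as a small sketch. The function names \lstinline+max_cycle_frequency_khz()+ and \lstinline+cpu_share()+ are hypothetical and only illustrate the calculation.

```c
#include <assert.h>

/* Upper bound of the cycle frequency in kHz, if a complete cycle takes
 * cycle_us microseconds and nothing else runs on the processor. */
double max_cycle_frequency_khz(double cycle_us)
{
    assert(cycle_us > 0.0);
    return 1000.0 / cycle_us; /* 1 / us equals 1000 kHz */
}

/* Fraction of the processor time spent in the cyclic code, if it is
 * executed once per period_us microseconds. */
double cpu_share(double cycle_us, double period_us)
{
    assert(period_us > 0.0);
    return cycle_us / period_us;
}
```

For a cycle of \unit{10}{\micro\second}, the first function yields the \unit{100}{\kilo\hertz} mentioned above. At the measured period of \unit{1}{\milli\second}, the same cycle occupies about one percent of the processor time, which leaves the remainder to the operating system.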
  2339 %------------------------------------------------------------------------------
  2339 %------------------------------------------------------------------------------
  2340 
  2340 
  2341 \subsection{Bus Cycle Measuring}
  2341 \section{Bus Cycle Measurement}
  2342 \label{sec:timing-bus}
  2342 \label{sec:timing-bus}
  2343 \index{Bus cycle}
  2343 \index{Bus cycle}
  2344 
  2344 
  2345 For measuring the time, a frame is ``on the wire'', two timestamps must be
  2345 To measure the time a frame is ``on the wire'', two timestamps must be
  2346 taken:
  2346 taken:
  2347 
  2347 
  2348 \begin{enumerate}
  2348 \begin{enumerate}
  2349 
  2349 
  2350 \item The time, the Ethernet hardware begins with physically sending the
  2350 \item The time, the Ethernet hardware begins with physically sending the
  2355 \end{enumerate}
  2355 \end{enumerate}
  2356 
  2356 
  2357 Both times are difficult to determine. The first reason is, that the
  2357 Both times are difficult to determine. The first reason is, that the
  2358 interrupts are disabled and the master is not notified, when a frame is sent
  2358 interrupts are disabled and the master is not notified, when a frame is sent
  2359 or received (polling would distort the results). The second reason is, that
  2359 or received (polling would distort the results). The second reason is, that
  2360 even with interrupts enabled, the time from the event to the notification is
  2360 even with interrupts enabled, the interrupt latency (i.\,e.\ the time from the
  2361 unknown. Therefore the only way to confidently determine the bus cycle time is
  2361 event to the notification) is unknown. Therefore the only way to confidently
  2362 an electrical measuring.
  2362 determine the bus cycle time is an electrical measurement.
  2363 
  2363 
  2364 Anyway, the bus cycle time is an important factor when designing realtime
  2364 Anyway, the bus cycle time is an important factor when designing realtime
  2365 code, because it limits the maximum frequency for the cyclic task of the
  2365 applications, because it limits the maximum frequency for the cyclic task. In
  2366 application.  In practice, these timing parameters are highly dependent on the
  2366 practice, these timing parameters are highly dependent on the hardware and
  2367 hardware and often a trial and error method must be used to determine the
  2367 often a trial and error method must be used to determine the limits of the
  2368 limits of the system.
  2368 system.
  2369 
  2369 
  2370 The central question is: What happens, if the cycle frequency is too high? The
  2370 An essential question is: What happens, if the cycle frequency is too high?
  2371 answer is, that the EtherCAT frames that have been sent at the end of the
  2371 The EtherCAT frames that have been sent at the end of the cycle may not
  2372 cycle are not yet received, when the next cycle starts.  First this is noticed
  2372 yet have been received when the next cycle starts. First this is noticed by the
  2373 by \textit{ecrt\_domain\_process()}, because the working counter of the
  2373 domain, because the working counters of the datagrams are zero. This can be
  2374 process data datagrams were not increased. The function will notify the user
  2374 queried in realtime context via the application interface and is output via
  2375 via Syslog\footnote{To limit Syslog output, a mechanism has been implemented,
  2375 Syslog\footnote{To limit Syslog output, a mechanism has been implemented, that
  2376 that outputs a summarized notification at maximum once a second.}. In this
  2376 outputs a summarized notification at maximum once a second.}. In this case,
  2377 case, the process data remains the same as in the last cycle, because it
  2377 the process data remains the same as in the last cycle, because it is not
  2378 is not erased by the domain. When the domain datagrams are queued again, the
  2378 erased by the domain. When the domain datagrams are queued again, the master
  2379 master notices, that they are already queued (and marked as sent). The master
  2379 notices, that they are already queued (and marked as sent). The master will
  2380 will mark them as unsent again and output a warning, that datagrams were
  2380 mark them as unsent again and output a warning, that datagrams were
  2381 ``skipped''.
  2381 ``skipped''.
  2382 
  2382 
  2383 On the mentioned \unit{2.0}{\giga\hertz} system, the possible cycle frequency
  2383 On the mentioned \unit{2.0}{\giga\hertz} system, the possible cycle frequency
  2384 can be up to \unit{25}{\kilo\hertz} without skipped frames. This value can
  2384 can be up to \unit{25}{\kilo\hertz} without skipped frames. This value is
  2385 surely be increased by choosing faster hardware. Especially the RealTek
  2385 highly dependent on the chosen hardware.
  2386 network hardware could be replaced by a faster one. Besides, implementing a
       
  2387 dedicated ISR for EtherCAT devices would also contribute to increasing the
       
  2388 latency. These are two points on the author's to-do list.
       
  2389 
  2386 
  2390 %------------------------------------------------------------------------------
  2387 %------------------------------------------------------------------------------
  2391 
  2388 
  2392 \chapter{Installation}
  2389 \chapter{Installation}
  2393 \label{sec:installation}
  2390 \label{sec:installation}