Communication Networks Basics

    Introduction

    Introduction: Classification and Disclaimer

    • In a nutshell, this article is about some of the underlying technology required for the internet as network of networks and other networks.
    • This is intended to be summarizing the lecture “Communication Networks” given by Prof. Petri Mähönen (winter semester 2016/2017).
    • Despite quality assuring efforts, I do not guarantee correctness, completeness, or currency. This may be helpful as a supplement.
    • I can heartily recommend the book “Computer Networks” by Andrew S. Tanenbaum. In fact, a lot of insights presented here are explained there in a lot more detail. When referring to chapters or pages I’m using the original 4th edition version in English.
    • Other than the book my main source are the non-public slides and exercises of the lecture. For copyright reasons, this is a thoroughly digested, summarized and altered version of the mentioned. The script “Kommunikationstechnik” by Prof. Peter Jax also touches some aspects of this lecture. Unfortunately, it also is not publically available and is written in German.
    • You can find a referral link to the book in the sidebar.

    Introduction: Abstract

    In this summary or introduction a lot of technologies, mostly protocols will be presented without going into too much detail. However, let me quickly draw a little picture of how come that are so many different protocols, standards, and devices. On the one hand, there are certain economic, personal and social influences, as well as scientific progress in numerous fields. On the other hand, there are a lot of different use cases in education, research, industry, commercial and private environments. This has lead to networks of different dimensions, topologies, hardware and environmental parameters. Lastly, incredibly scalable networking solutions and a lot of trickery have made possible the rise of a very special global network: the internet (short for internetwork). Let’s take a look.

    Due to their central importance: a protocol figuratively speaking refers to a set of common rules of how to conduct a conversation or generalized to go about a problem. A fruitful conversation can only be conducted if both parties agree to a protocol, which is why many protocols are part of industrial standards. In our cases, protocol rules may include definitions of data representation, maximum frame size, information encoding and much more. In later sections, they will be seen from a different perspective as the implementation of an OSI layer service.

    Introduction: Network Categorization

    The simplest network you could think of would be two connected computers. This would be a point-to-point link and due to the fact that there is exactly one sender and one receiver this sometimes is called unicasting. In these kinds of networks, comprising only of point-to-point links, it often times is necessary to get across several nodes to finally reach the destination. Almost self-evidently, this is due to the fact that there cannot exist point-to-point connections between each unique pair of computers in a large network. This will raise the issue of routing in a later section. Point-to-point connections are one of two kinds of transmission technology. The other kind would be broadcasting links, which basically means that multiple computers share a channel. Short pieces of information called packets are received by all computers in that kind of network. When sending only to a subset of the nodes we call it multicasting.

    Another way to differentiate between different networks is to look at their dimensions. Here’s a little table for that matter:

    Networks Categorized By Dimension
    Approximate dimensionAbbreviationName
    1mpersonal area network
    10m – 1kmLANlocal area network
    10kmMANmetropolitan area network
    100-1000kmWANwide area network
    10000kminternet

    Another way of differentiating between networks would be network topology. For brevity’s sake only a brief overview:

    Network Topologies

    Network Software: Layering

    Historically, the first networks were purely relying on hardware. However, for today’s needs, this is not a suitable approach anymore. Today a set of software helps tackle this task. Building a complete system of networking software all by yourself or even by a single group is an extremely hard if not insurmountable task. It is for this reason, that networks nowadays are organized as a stack of layers.

    This stack is often called network stack or just “the stack”. Such a stack consists of $n$ layers. To enable communication between level $n$ layers of two computers there are so-called protocols. Note that this kind of communication is virtual. This means, in reality, the communication does not take part directly between these two layers only some kind of “protocol API” acts if that would happen. What really happens is that the layers can only communicate with layers directly above or beneath them. So layer $n$ can only directly communicate with layer $n+1$ and $n-1$. Below the bottom layer, the physical layer is the physical medium through which the communication really takes place. Arriving as a current on the other side the packet travels up the whole stack again until the responsible layer is reached.

    Network Software: Design Goals

    Before going into further detail about all the layers, let me first present the most important design goals of networking software.

    Checklist: Design Goals for Network Software

    We need proper addressing so that the target computer is reliably reached even across large networks.

    We also need error control, because physical mediums never transmit without errors.

    Flow control takes care of restricting sending speed to prevent senders to recklessly push data into the link despite the receiver for whatever reason can not receive or process these data fast enough.

    For economical or convenience reasons one can decide to use a single connection for multiple unrelated conversations. The processes taking care of this are called (statistical) multiplexing and demultiplexing.

    In a network complex network topology, it is by no means trivial to find an “ideal” or “good” path. That subject is called routing.

    Network Software: Services

    To understand the purpose of different protocols of which are preferable over others in given scenarios we’ll first have to understand which types of services exist and about two fundamental types of data transmission in the physical layer: circuit switching versus packet switching.

    Circuit Switching versus Packet Switching

    Circuit switching: an example for circuit switching is the telephone system. Once the number has been dialed a physical connection to the other side is established. This has the advantage that no congestion can happen, due to the fact that a connection is reserved for the two parties connected. Packets usually arrive in the order they were sent in. However, setting up exclusive connections is potentially wasteful because not every sender is constantly pushing data through the line at all times.

    Packet switching: reminiscent of the postal system. No connection has to be established before sending and each packet has to find its own way to the receiver. This reduces the single point of failure vulnerability: if a non-critical switch dies the packet can be rerouted around the switch. Each packet has a tightly limited maximum size which allows buffering on the router main memory. Since no user can exclusively use the line for more than a few milliseconds, packet-switching networks are well-suited for interactive traffic.

    There’s a third kind of switching: message switching which resembles the outdated telegram system. The main differences compared to packet switching are two-fold: firstly, there is no limit to the packet size; secondly, a mechanism called store-and-forward is always employed. Store-and-forward means that a message is transmitted node by node and is checked for errors each time. Due to the unlimited block size, message switching is useless for modern interactive applications and therefore not used anymore and the reason why it won’t be discussed here in any more detail.

    Let’s draw a little comparison between the first two kinds of switching. A green background indicates a bad a good property, a yellow background indicates situational dependence, lastly a red background indicates a bad property.

    Circuit Switching verus Packet Switching
    PropertyCircuit SwitchedPacket Switched
    Setup neededrequiredno
    Dedicated physical pathyesno
    Each package follows the same routeyesno
    Packets arrive in orderyesno
    Bandwidth availablefixeddynamic
    Time of possible congestionat setupon every packet transmission
    Potential waste of bandwidthyesno
    Store-and-forward possiblenoyes
    Transparencyyesno
    Chargingper minuteper packet

    Connection-Oriented versus Connectionless Services

    The following paragraphs are key to understanding the technology underlying the Internet and other so-called packet-switched networks. Services provided by networks more precisely their protocols can be divided into two kinds. In case you’re confused later: This whole thing takes place in the transport layer.

    Connection-oriented service (COS): A number of preparatory steps, a negotiation, can be necessary before conducting a conversation. In some cases, sender, receiver, and subnet negotiate about parameters, such as maximum message size, quality of service and more. The characterizing aspect of the connection-oriented service is that the information gets pushed into the channel where one side takes out as much as the other side is pushing in. In most cases, the bit order is preserved.

    Connectionless service (CLS): resembles postal system. Each packet carries the full destination address and is routed independently through the system. Most of the time packet order is preserved, but not guaranteed due to delays.

    Reliable versus Unreliable Services

    Both COS and CLS can be attributed a quality of service (QOS). Reliable services never lose data. Reliable services are realized by having the receiver acknowledge the receipt of each package. Acknowledgment techniques will be discussed in greater detail in the data link layer section. Acknowledgments (ACKS) introduce overhead and delays which are annoying but worth it most of the time. However, for some purposes, these delays are unacceptable, e.g. voice over IP (VOIP) or video. For these purposes, little blurs, contortions, and other disturbances are fairly minor compared to delays. These unacknowledged, unreliable services are often called datagram services (referring to telegrams).

    It is for two reasons – unacceptable delays and possible unavailability of reliable means of communication – that both reliable and unreliable services coexist.

    Example: TCP stands for Transfer Control Protocol and is a connection-oriented, acknowledged protocol. UDP stands for User Datagram Protocol and is connectionless and unacknowledged.

    Note: It is important that connection-oriented are not always reliable, i.e. acknowledged and connectionless are not always unreliable. For example, it exists the so-called acknowledged datagram service. A use case for it would be registered email, where an acknowledgment is returned upon receipt.

    Network Software: Reference Models

    The OSI Reference Model versus the TCP/IP Reference Model

    Let’s make a shortcut and agree with the following:

    • OSI stands for Open Systems Interconnection, TCP stands for Transmission Control Protocol, IP stands for Internet Protocol
    • The OSI model shown below has turned out to be a very good model when talking about networks. However, the OSI protocols are flawed and practically never used.
    • It turned out that the presentation and session layer are not, too useful. As a result, we’ll strip these from the model in further considerations.
    • Quite the opposite is true for TCP/IP. Its model is very unclear about the host-to-network layer and cannot be applied to anything else than TCP/IP. However, its protocols are well implemented and very dominant.
    • Following the lecture and Tanenbaum we will use a hybrid solution: we’ll mostly look at the OSI model (minus presentation and session layer) and the TCP/IP protocols.
    • For an overview look at the section below.

    Layer 4-7 are end-to-end that means they’re really about communication of the source and destination machine, whereas layers 1-3 are chained, which means those are communicating with their immediate neighbors in the subnet.

    Where does the mentioned usefulness of the OSI model originate from?

    Firstly, it distinguishes very well between the following terms: service, interface, and protocol. This is not the case for the TCP/IP model.

    Important Definitions Considering Layers
    NameDescription
    InterfaceInput and output parameters of layer function
    ProtocolImplementation of layer function
    PDUProtocol Data Units provide protocol control information as well as the service data unit for the adjacent layer above.
    ServiceDescription of what a layer does (layer function) for the adjacent upper layer.
    SDUService Data Units provide data for the upper layer. Naturally, they are contained within the PDU of the layer providing the service.

    Secondly, it distinguishes between the physical and data link layer, which have completely distinct responsibilities.

    Last but not least: layering is very much in the spirit of object-oriented programming. It is possible to completely overhaul one layer’s implementation without touching any of the others. Let me shamelessly advertise for some of my articles: “C Basics” series, as well as “C++ Basic Concepts”. Links are provided in the sidebar.

    Talking about responsibilities, here’s a little overview of the layers:

    Layer Responsibilities
    LayerResponsibilites
    ApplicationUser application protocols: prominent examples include but are not limited to HTTP, SMTP, FTP and many more
    PresentationConcerned with syntax and semantics of transmitted information. Make communication of computers with different data representation possible.
    SessionToken management (preventing two parties from attempting critical operations simultaneously), dialog control (keeping track whose turn it is to transmit), synchronization (let transmissions recover from crash and continue from checkpoints).
    TransportationAccepts data from above, splits it into smaller chunks if need be. Responsible for efficient transport to other side and isolation of the above layers from hardware evoulution. Furthermore responsible for the choice of service.
    NetworkControls operations of subnet, responsible for routing, congestion control (more generally quality of service), resolving interconnectivity issues of heterogenous networks
    Data LinkResponsible for creating data frames, error control (ACKs), flow control and medium access control.
    PhysicalTransmitting raw bits over a faulty channel. Mostly deals with mechanical, electrical and timing interfaces.


    Summary

    Let’s slightly change the order of what we learned before in this summary to connect things a little bit better:

    Checklist: Introduction

    Networks can be categorized by dimension, topology, and uni/multi/broadcasting.

    To reduce implementation complexity we implement a layered stack.

    The design of this layers must include the following features: error and flow control, addressing, statistical (de)multiplexing and routing.

    In this introduction’s hybrid model we discuss the OSI model and TCP/IP protocols.

    In the physical layer there are two fundamental transmission types:
    Circuit switching creates a full-bandwidth dedicated path between two points for one session.
    Packet switching means that each data packet takes an independent route. This reduces bandwidth waste and risk of single point of failure and evokes bit rate volatility.
    Both types have fundamental up- and downsides.

    Be sure to understand WHY the following is true, if not reread the respective sections:
    Circuit switching is not the same as connection-oriented service.
    Connection-oriented packet switching is often called virtual circuit mode communication.
    Connection-oriented services are not always reliable.
    Packet switched networks often suffer from volatile data transmission rates.

    Reliability is usually realized through acknowledgment.
    Sometimes the delays from ACKs are unacceptable which keeps unreliable services alive.

    Physical Layer

    Apart from distinguishing packet switching versus circuit switching nothing is really mentioned about this layer in the lecture. Therefore, I will only quickly outline a few aspects of what the physical layer is all about:

    • theoretical (mathematical, physical, material) basis for data transmission
    • transmission media: e.g. guided (copper wire and fiber optics), wireless (terrestrial radio) and satellite.
    • electromagnetic waves: (dis)advantges of different wavelengths
    • practical data transmission problems: attenuation, distortion, noise
    • AC information coding: amplitude, frequency and phase modulation
    • data transmission direction: simplex, half duplex, full duplex
    • multiplexing: frequency/time division multiplexing (FDM/TDM) and code divsion multiple access (CDMA)
    • switching: circuit, message and packet switching
    • actual commuincation systems: SONET, DSL, GSM…

    For further information about this, I refer to chapter 2 of Tanenbaum, as well as the script of the RWTH lecture “Kommunikationstechnik” by Prof. Peter Jax.

    Data Link Layer

    The data link layer is a very thick layer, i.e. it has a lot of responsibilities. For this matter, it has been split into two sublayers: the logical link control (upper) sublayer (LLC) and the medium access control (lower) sublayer (MAC).

    LLC responsibilities:

    • mutliplexing algorithms to allow different network layer protocols to coexist
    • flow control
    • error control (ACKs, NAKs, ARQ, CRC)

    MAC responsibilities:

    • frame delimiting and recognition
    • channel access control mechanisms

    Note: The data link layer is discussed in a different order in the lecture, as well as in the Tanenbaum. There framing is discussed first, followed by flow and error control and only then medium access control is discussed. This order makes sense didactically. However, this summary shall be structured as the hybrid model introduced earlier.

    MAC Sublayer

    Framing

    What is framing about?

    • The physical layer service provides a bitstream to the MAC layer.
    • This bitstream is not error free: it can lack or have additional bits and some bits might be flipped.
    • Framing is about splitting a bitstream provided by the physical layer service into discrete frames.
    • The checksum of these frames is calculated and added to the frame.
    • Then it is compared to the recomputed checksum on arrival at the destination.
    • If the two checksums differ an error has occurred and the error can be handled (drop frame and possibly request new frame).
    • In short, we frame for two reasons: realizing the concept of packets and to provide a simple mean of error-detection and possibly even error-correction.

    Different approaches to framing

    Possible approaches to framing discussed here include:

    • Character counting
    • Timing
    • Starting and ending flags

    Character counting

    Idea: Each frame begins with the number of succeeding bits that belong to that frame.

    Advantage: very simple implementation

    Disadvantage: once a single bit of the bit counts is corrupted the error persists forever = no resynchronization

    Timing

    Idea: sender and receiver have synchronized clocks, frame delimiting is based on timestamps

    Advantage: no overhead

    Disadvantage: it is hard to guarantee clock synchronization

    Starting and ending flags

    Idea: Put bit patterns (flags) at the beginning and end of each frame

    Byte or character stuffing: the flag is often an eight-bit pattern, called flag byte. Should the flag byte appear in the payload it is simply escaped by an escape byte. Should the escape byte appear in the payload it is escaped by another escape byte. The only real disadvantage is that this approach is closely tied to an eight-bit character representation. Not all character codes use eight bits to encode characters, which gives rise to bit stuffing.

    Bit stuffing: Let each frame begin and end with the flag byte 01111110. Each time the sender’s link layer encounters a sequence of five 1s it stuffs one 0 behind it. The receiver reverses this process. This simple improvement now supports variable length character encoding.

    Advantage: reliable and easy to implement

    Disadvantage: stuffing required

    Bit stuffing example with 01111110 flags
    original data111000111110111111111111111111001001110111101
    sent data (added 0 after each sequence of five consecutive 1s)1110001111100111110111110111110111001001110111101
    recronstructed data (replaced each sequence of 111110 with 11111)111000111110111111111111111111001001110111101

    Medium Access Control

    Since physically only one, or less strictly speaking a limited amount of connections can be established through a channel at any given moment we have to control the access to the physical medium. If multiple senders push their data into the channel at the same time what we get is a collision, which leaves both parties unhappy. As in a car crash, no one will arrive their destination. As always, in the bleak real world, we also have to think about implementation complexity of our solution (which is rather decisive for what is left to pay our salary as engineers).

    We will look at the following strategies:

    • We don’t (really) care about this problem and only retransmit on collisions (simple, but quite inefficient).
    • We try to avoid collisions economically (collisions still occur “rarely”, usually ( = not always) economically optimal).
    • We make sure no collisions occur (inefficient or unstable or high implementation complexity).

    this corresponds to the following implementations (= protocols):

    • Pure ALOHA, Slotted ALOHA
    • CSMA family, Ethernet (fist modified implementation of CSMA)
    • TDMA, CDMA, FDMA, Binary Countdown

    Pure ALOHA

    ALOHA is a Hawaiian greeting. In fact, it was the first wireless MAC-Layer protocol provided by professor Abrahamson of the University of Hawaii. It is a simplistic protocol that works well when the channel load $G$ is low. In fact it reaches its peak efficiency at $G = 0.5$. First things first, though, we will first qualitatively introduce the ALOHA-algorithm and then properly define the channel load $G$ and discuss ALOHA’s efficiency for various values of $G$.

    ALOHA Pseudo Code

    It really is that simple. No second bottoms.

    void send(packet_t& packet)
    {
        transmit(packet);
        if(collision())
        {
            wait(random_time());
            send(packet);
        }
    }
    
    int main(void)
    {
        while(true)
        {
            packet_t& packet= packet_to_send();
            if(packet)
               send(packet);
        }
    }
    

    Basically, the protocol says: send whenever you want to, if there’s a collision just wait a random time before sending again. As you can imagine this protocol is perfectly fine if the channel load is very low: if there are only one or a few transmitters that very rarely send anything, then there won’t be any collisions. The bright side is the ease of implementation and no overhead. The bleak side obviously if the channel load is very high, there will be only collisions and nobody reaches their target.

    Define the channel load $G$ as the ratio of $\frac{\text{requests}}{\text{slot}}$, where slot is a time unit. Obviously, if there are only one or a few transmissions in a very long time then you will be perfectly fine.

    Let’s calculate the estimated throughput $S$, which is defined as $\frac{\text{successful transmissions}}{\text{slot}}$. $S$ naturally is anywhere between 0 and 1, if the $t_{\text{transmission}} = t_{\text{slot}}$, which we will assume here and in all later examples if not stated differently.

    First of all, we will assume that all packets are Poisson-distributed, which implies $P(\text{k packets generated in given time frame}) = \frac{G^k}{k!} \cdot e^{-G}$.

    Frames transmitted with ALOHA are vulnerable for two entire slots because they are can be sent off at any moment in one slot and no other frame must be sent off for another entire slot.

    So the probability $p_0$ that no other packet is generated for the duration of two slots is:

    $$p_0 = \frac{2G^0}{0!} \cdot e^{-2G} = e^{-2G} $$

    If we think about the definition of $S$ the following makes sense:

    $$ S_{\text{pure ALOHA}} = Gp_0 = Ge^{-2G} $$

    To maximize $S$ the condition $\frac{\partial{S}}{\partial{G}} = 0$ has to be met. Simple computation reveals $G$ has to be $\frac{1}{2}$, which we actually knew before, because ALOHA is vulnerable for two slots, so if we send a packet every two slots the maximum efficiency is realized.

    $$ S_{\text{pure ALOHA, max}} = \frac{1}{2} p_0 = \frac{1}{2}e^{-1} = \frac{1}{2e} \approx 0.18 $$

    Slotted ALOHA

    Slotted ALOHA Pseudo Code

    void send(packet_t& packet, time_t current_time)
    {
        if(slot == current_time)
            transmit(packet);
    
        if(collision())
        {
            wait(random_time());
            send(packet);
        }
    }
    
    int main(void)
    {
        while(true)
        {
            slot_t& slot = slot();
            packet_t& packet = packet_to_send();
            if(packet)
               send(packet, time(NOW), slot);
        }
    }
    

    To compute the slotted ALOHA maximum throughput we can do it exactly like for pure ALOHA. However, the vulnerable period is only one slot, because you can only send off a packet at the beginning of each slot, which introduces short delays.

    $$ S_{\text{slotted ALOHA, max}} = G_{\text{max}}p_0 = G_{\text{max}}e^{-G_{\text{max}}} = \frac{1}{e} \approx 0.36 $$

    You may wonder why nothing happened in the power of the exponential function. This is because $G_{\text{max}} = 1$ for slotted ALOHA, which totally makes sense right?

    CSMA

    CSMA stands for Carrier Sense Multiple Access. Using a protocol from this family of protocols is likely to increase the throughput compared to ALOHA. We will also discuss when this is not the case. The main problem with ALOHA was with collisions. Every time a collision occurs a lot of time is lost.

    What CSMA does better is that the nodes of the network listen to the medium or channel and only if it is free they will begin to send packets. However, CSMA is not collision free, as we will see later. As in ALOHA, no collision recovery has been added.

    The different flavors of CSMA

    CSMA in this lecture comes in three different flavors: 1-persistent, Non-persistent, and P-persistent CSMA.

    1-persistent CSMA

    1-persistent CSMA is quite greedy: It listens to the channel all the time and when it is free it immediately starts to transmit with the probability 1. When a collision is detected it waits for a random time and tries to transmit again. The reason for the random wait instead of a fixed period of waiting is to desynchronize with contending stations.

    1-persistent CSMA Pseudo Code:

    void send(packet_t& packet)
    {
        bool busy;
        
        do 
        {
            busy = carrier_sense(radio_on);
        }
        while(busy);
        transmit(packet);
        if (collision())
        {
            wait(random_time());
            send(packet);
        } 
    }
    

    Non-persistent CSMA

    Non-persistent CSMA is a lot less greedy than 1-persistent CSMA because it only listens to channel in random intervals, instead of persistently listening to it. This reduces the number of collisions but increases delays. Overall channel usage is quite likely to be improved.

    Non-persistent CSMA Pseudo Code:

    void send(packet_t& packet)
    {   
        if (carrier_sense(radio_on))
        {
            wait(random_time());
            send(packet);
        } 
        transmit(packet);
        if (collision())
        {
            wait(random_time);
            send(packet);
        } 
    }
    

    P-persistent CSMA

    P-persistent CSMA is a way in between of 1-persitent CSMA and Non-persistent CSMA. The difference between 1-persistent and P-persistent CSMA is that P-persistent CSMA sends the packet with the probability P instead of 1 when the channel is free. Thus, 1-persistent CSMA actually is a special case of P-persistent CSMA.

    P-persistent CSMA Pseudo Code:

    #include <stdlib.h>
    #include <time.h>
    
    // arbitrary P between 0 and 1; 
    // This would imply 0.5-persistent CSMA:
    #define P 0.5
    
    void send(packet_t& packet)
    {
        srand(time(NULL));
        
        // generate random number between 0 and 1
        double random = ( (double) rand() / RAND_MAX);
    
        bool busy;
        
        do 
        {
            busy = carrier_sense(radio_on);
        }
        while(busy);
    
        // with probability p transmit package;
        if(random < P) 
        {
            transmit(packet);
        }
        else
        {
            send(packet);
        }
    
        if (collision())
        {
            wait(random_time());
            send(packet);
        } 
    }
    

    CSMA/CD

    CSMA/CD stands for CSMA collision detection. CSMA/CD adds the following functionality to CSMA: whenever a collision is detected the “losing” station cancels their transmission and sends a short jamming signal. After that, it waits a random time (“backs off”) before trying to send again.

    This is a feature that can be (de facto is) applied to all flavors of CSMA. However, this comes at a price: as a consequence of propagation delay packets must have a minimum packet size. Alternatively, the maximum network distance has to be below a certain threshold. Qualitative computations and further consequences are discussed in the following paragraphs.

    So how can collisions still occur?

    Every physical link has a length and propagation speed, which is limited to the speed of light. Thus, if a station A starts to send at $t_0$, station B only notices this at $t_0 + t_{\text{prop}}$. In the worst case, B starts sending just before that time, immediately notices a collision and sends a jamming signal which reaches A at $t_0+2t_{\text{prop}}$.

    If now the packet size of A is smaller than $2\cdot t_{\text{prop}} \cdot R $, where $R$ is the transmission rate, then A will falsely assume that no collision occured.

    Therefore, we must abide by one of two equivalent conditions:

    • The packet size $L$ meets: $ L > 2\cdot t_{\text{prop}} \cdot R$.
    • The maximum network distance $d$ meets: $ d < \frac{Ls}{2t} $, where $s$ is the propagation speed, i.e. a fraction of the speed of light.

    Efficiency of CSMA/CD

    In order to assess a thing such as the efficiency of CSMA/CD, we will make some assumptions to simplify our calculation. We will disregard Non-persistent CSMA/CD, because we don’t want to make the concession of unnecessarily high delays. As a result, we will have to determine an optimal P for P-persistent CSMA/CD. Eventually, we will compute the efficiency for this optimal P.

    Assumptions:

    • time is slotted
    • packet-size is a constant
    • optimal probability P

    The efficiency is the ratio between time usefully spent and the total time: $\eta = \frac{t_{\text{frame}}}{t_{\text{total}}}$, where $t_{\text{total}} = t_{\text{frame}} + t_{\text{overhead}}$. Let’s determine the overhead then!

    Ultimately, we can identify the three states of the CSMA/CD protocols: the (frame) sending phase (useful, hooray!), the contention phase (boo!) and possibly idle phases if nobody has got anything to say.

    Say $p_1$ is the probability that exactly one out of $N$ nodes is trying to send in a given time slot. Then:

    $$ p_1 = \binom{N}{1} \cdot P^1 \cdot (1-P)^{N-1} $$

    The probability that it takes exactly $j$ slots to yields $\frac{1}{p_1}$:

    $$ \sum_{j=0}^{\inf} {p_1}^1 \cdot {p_1}^N-1 = 1/p_1 $$

    This result is computed by derivation and substitution.

    The overall efficiency computes to:

    $ \eta_{\text{CSMA/CD}} = \frac{1}{1+3a} $, where $ a := \frac{t_{\text{prop}}}{t_{\text{frame}}} $

    However, note that simulations with more exact models state that:

    $ \eta_{\text{CSMA/CD}} = \frac{1}{1+6.44a} $

    LLC Sublayer

    Multiplexing

    Multiplexing algorithms to make IP, IPX, Appletalk and more to coexist within the network are beyond the scope of the lecture and therefore ignored in this summary.

    Error Control

    Now that what know what framing works, the next problem arises. For reliable, connection-oriented services (reliable ≠ connection-oriented) we need all frames to arrive and in proper order. Things to keep in mind doing that:

    • Let the receiver give feedback about the arriving frames, be it negative or positive acknowledgment (NAKs/ACKs).
    • Should either a frame or an acknowledgment get lost during the transmission the frame just gets retransmitted.
    • Thinking about the process: what happens if an acknowledgment gets lost?
    • Since the receiver has no way to distinguish between a new and a retransmitted frame he would take the retransmission for a new frame.
    • The obvious solution to this problem is adding sequence numbers to the frames.
    • Another possible scenario: a frame gets lost during the original transmission due to a noise burst.
    • Solution: start a timer when sending frames. Cancel the timer when an acknowledgment arrives. Retransmit in case of NAK or timer overflow.

    We just talked about entire frames getting lost. However, fiber optics and especially wireless transmission are error-prone. There are two basic approaches to deal with it:

    • Error-correcting coding: add sufficient redundancy to the data so that the original data can be restored
    • Error-detecting coding: add sufficient redundancy to the data so that corrupted data is detected. Then, the frame is dropped and possibly an automatic repeat request (ARQ) is sent to try again.

    Which one is used is again an economical optimization problem: one the one hand we want to add as less redundancy as possible to the data, but on the other hand in case of an error, we don’t want to retransmit entire frames.

    In general, we can surmise that “reliable” channels rarely produce errors. For an occasional error, error-detection is good enough. However, for unreliable channels such as wireless transmission channels we have so many transmission errors that it is more efficient to use error-correcting codes with high redundancy.

    A code capable of doing both is the Hamming code. In fact, we can even have hybrid error-detection and error-correction, e.g. Hamming codes correcting one-bit errors and detecting up to three-bit errors (all per codeword). A detailed description can be found in chapter 6 of Kommunikationstechnik.

    Let’s continue as follows: let’s assume an error is detected but cannot, for whatever reason, be corrected. Then we will use Automatic Repeat Requests (ARQ) to request a retransmission.

    Note: the lecture and the exercise are not uniform when it comes to which sequence number is transmitted with the ACK. The lecture slides state it should be the sequence number of the next frame. The exercise states it should be the sequence number of the transmitted frame. Since I assume the guys for the exercise evaluate our exams I’ll go with the exercise notation. According to Julian Arnold both notations are fine:

    Automatic Repeat Requests (ARQs)

    Stop-and-Wait:

    The first ARQ presented here is Stop-and-Wait it is very simple, but has very limited efficiency.

    A few notes:

    • A frame is sent and if ACK doesn’t return before timer expires, then frame is sent again
    • At least one bit sequence numbers required.
    • Efficiency is limited by delay-bandwidth product.
    • The delay-bandwidth product measures the bit-capacity of the link.

    Here’s a diagram:

    {{ Stop-and-Wait ARQ }}

    Without derivation we believe:

    $$ \eta_\text{SW} = \frac{1-\overbrace{\frac{n_o}{n_f}}^{\text{frame overhead}} }{1+\underbrace{\frac{n_a}{n_f}}_{ACKs}+\underbrace{\frac{2(t_\text{prop} + t_\text{proc})R}{n_f}}_{\text{delay-bandwidth} \Pi}} \cdot \overbrace{(1-P_f)}^{\text{frame loss}} $$

    where $t_0 = 2 \cdot t_{\text{prop}} + 2 \cdot t_{\text{proc}} + t_{\text{ack}} + t_{\text{frame}} $ and $n_x$ is the number of bits in (the part of the) frame $x$.

    The dominant part, thus the main problem of this equation is the delay-bandwidth product. A simple improvement over this protocol is to only send a cumulative ACK for each $N$ frames.

    Go-Back-N:

    This leads to Go-Back-N ARQ. Cumulative ACKs to reduce redundant ACKs for more reliable channels:

    Again a few points:

    • In practice sender and receiver agree on $N$.
    • $N$ is called windows size $W_s$ of this so-called sliding window protocol.
    • The name derives from the fact that both sender and transmitter effectively have a window of frames with given sequence numbers.
    • Sequence numbers must be of the same dimension as the window size.

    {{ Go-Back-N ARQ }}

    Without derivation we believe:

    $$ \eta_\text{GBN} = \frac{1-\overbrace{\frac{n_o}{n_f}}^{\text{frame overhead}} }{1+\underbrace{(W_s-1) P_f}_{\text{GBN share}}} \cdot \overbrace{(1-P_f)}^{\text{frame loss}} $$

    This protocol is very efficient if the channel is error-free and $W_s$ is optimally chosen dependent on the delay-bandwidth product.
    However, the Achilles’ heel of this protocol is its susceptibility to error-prone channels. Already for bit error rates of $10^{-4}$ it has efficiencies in the one-digit percent range for ideal $W_s$.

    Let’s elaborate on this protocol. Where does its susceptibility to errors come from? The reason is that each time an error occurs, $N$ frames are retransmitted instead of just the erroneous one. It would be much more efficient in the case of faulty transmission to specify which frame to retransmit. This can be achieved by (selective) negative acknowledgment (NAKs).

    Selective Repeat:

    Effectively getting rid of the Go-Back-N share of this equation (by only retransmitting one frame):

    $$ \eta_\text{SR} = \left( 1-\overbrace{\frac{n_o}{n_f}}^{\text{frame overhead}} \right) \cdot \overbrace{(1-P_f)}^{\text{frame loss}} $$

    {{ Selective Repeat ARQ }}

    Piggybacking:

    Delaying cumulative ACKs by packing them on data frames is called piggybacking. This, of course, is assuming both parties have sufficient data to send to each other. One could ask (in the exercise perhaps) if this is a smart thing to do.

    The answer provided the above assumption is true, is not always.

    Cases in which piggybacking is not a smart thing to do:

    • Data frames have a much lower reliability compared to ACKs (data frames are large and bit error rate is high).
    • Sender window size is too small compared to the effective delay-bandwidth product (window size is small, delay-bandwidth product is high).

    Error-correcting coding: Hamming code

    The Hamming code is one possible and easy to understand code, for both error-detection and error-correction (not the only though).

    Let’s introduce the basis for the Hamming code: the Hamming distance $d$, which is the minimum (!) number of bit flips to get another valid codeword given a valid codeword across the whole code. In other words, at least $d$ single bit errors are required to get another valid codeword, given a valid codeword.

    Flow Control

    Network Layer

    Transport Layer

    Application Layer

    Conclusion

    $ \newcommand{\norm}[1]{\left\lVert#1\right\rVert} $