Wednesday, September 29, 2010

Lesson 15 - VLANs Overview

At this stage you should be familiar with the concepts related to TCP/IP traffic flow and switch operation. You should also feel confident about how to diagnose basic layer 2 connectivity issues. For the details please review my previous posts. In this one, I am going to extend your understanding of layer 2 technologies by introducing Virtual LANs (VLANs).

Before I introduce our main topic, let's first define the problem that VLANs address. This way, they are going to be easier to understand.

Problem With Switching
As you remember from previous lessons, each port of a switch creates its own collision domain (for details look at lesson 9 in this tutorial). In addition to that, a switch can use FULL DUPLEX connectivity when connecting other devices to its ports (computers, printers, switches, routers). That allows the ports to SEND and RECEIVE streams of bits at the SAME time. This is due to the special design of a switch. Thus, the efficiency of transmission is radically increased when compared to its older cousin, the hub, which uses half-duplex connections (sending or receiving but not both at the same time).

However, switches still maintain ONE BROADCAST DOMAIN. This means that in some situations they flood frames out of all active interfaces except the one on which the frame was received. Flooding occurs if any of these is true:
  1. The destination MAC address of the arriving frame is unknown.
  2. The destination MAC address of the arriving frame is broadcast.
  3. The destination MAC address of the arriving frame is multicast.
  4. The switch has reached its limit of learned MAC addresses. Additional MAC addresses can no longer be learned, so frames sent to them are flooded as unknown unicasts.
Pic. 1 - Switches maintain one broadcast domain (bottom left computer sends broadcast).
Icons designed by: Andrzej Szoblik - http://www.newo.pl
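
The first three flooding rules can be sketched in a few lines of Python. This is only a simplified model, not how a real switch is implemented: the function name egress_ports, the dictionary used as a MAC address table and the port numbers are my own assumptions, and multicast is flooded unconditionally (no multicast optimizations are modeled).

```python
BROADCAST = "ffff.ffff.ffff"


def is_group_address(mac: str) -> bool:
    """True if the lowest bit of the first octet is set (multicast or broadcast)."""
    first_octet = int(mac.replace(".", "").replace(":", "")[:2], 16)
    return bool(first_octet & 0x01)


def egress_ports(dst_mac, in_port, mac_table, active_ports):
    """Return the set of ports a frame is sent out of.

    mac_table maps a learned MAC address to the port it was learned on.
    """
    dst = dst_mac.lower()
    known_port = mac_table.get(dst)
    if dst == BROADCAST or is_group_address(dst) or known_port is None:
        # Flood: all active ports except the one the frame arrived on.
        return set(active_ports) - {in_port}
    if known_port == in_port:
        # Destination lives on the segment the frame came from: do not forward.
        return set()
    return {known_port}


# Hypothetical 4-port switch that has learned one address on port 1.
table = {"0050.bf9c.456a": 1}
print(egress_ports("ffff.ffff.ffff", 1, table, [1, 2, 3, 4]))
print(egress_ports("0050.bf9c.456a", 2, table, [1, 2, 3, 4]))
```

Notice that a single broadcast reaches every other active port, which is exactly the problem described in this section.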

In a flat network like the one depicted above (Pic. 1), imagine a thousand computers sending broadcast traffic (e.g. ARP requests). These frames will be propagated everywhere as per the rules described earlier. Imagine another situation in which a broken NIC (Network Interface Card, i.e. network adapter) sends thousands of broadcast frames per second. Those will be flooded to all hosts, interrupting them, because every host needs to process broadcast frames. In those situations we not only interrupt all hosts by sending frames to them, but also saturate links with garbage data unnecessarily. Why would my computer have to listen to broadcast traffic sent by the HR server if I work in the IT department? I do not use the HR server's resources at all. Exactly!

VLANs Are Broadcast Domains
Virtual LANs are a method of creating multiple, smaller broadcast domains in a switching infrastructure. They are a commonly used solution to the problems mentioned above. By configuring VLANs on the switches you create multiple broadcast domains which are treated as separate, isolated LANs which CANNOT communicate with one another by default. This allows us to contain the broadcast/multicast/unicast traffic WITHIN the boundary of a given VLAN.

Pic. 2 - VLANs Are Broadcast Domains
Icons designed by: Andrzej Szoblik - http://www.newo.pl

Consider the traffic in Pic. 2. When the computers in red transmit their bits onto the wire, the switches send those only to computers that are in the same VLAN, that is red in this case. For instance, if the bottom right red computer sends a layer 2 broadcast (destination MAC address = FFFF.FFFF.FFFF), only computers in the red VLAN are going to receive this transmission. Computers located in the turquoise VLAN will NOT receive those frames anymore. This way we can segment the traffic between different hosts based on criteria such as groups of interest (workgroups), type of traffic (e.g. VoIP), type of application used, user location, etc. So, the major benefits of using VLANs are:
  1. Broadcast/multicast traffic propagation is limited to a given VLAN (broadcast domain) where it originated.
  2. Security is increased, as hosts located in different VLANs CANNOT communicate at all. The only way for them to communicate is to allocate different network/subnet addresses for VLANs and use a layer 3 device (router) to move the packets between them. The routers offer some control as to who can transmit to whom (ACLs, firewalls etc.). How to accomplish routing between VLANs I will explain in my next post.
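
Here is a minimal sketch of the first benefit, assuming a static port-to-VLAN table. The port names and the VLAN numbers 10 and 20 (standing in for the red and turquoise VLANs) are hypothetical.

```python
def broadcast_recipients(in_port, port_vlan):
    """Ports that receive a broadcast arriving on in_port.

    port_vlan maps each access port to the single VLAN it belongs to.
    Only ports in the sender's VLAN, other than the sender's own port, qualify.
    """
    vlan = port_vlan[in_port]
    return {port for port, v in port_vlan.items() if v == vlan and port != in_port}


# Hypothetical assignment: VLAN 10 = red, VLAN 20 = turquoise.
port_vlan = {"Fa0/1": 10, "Fa0/2": 10, "Fa0/3": 20, "Fa0/4": 20, "Fa0/5": 10}

print(broadcast_recipients("Fa0/1", port_vlan))  # only the other VLAN 10 ports
print(broadcast_recipients("Fa0/3", port_vlan))  # only the other VLAN 20 port
```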
I hope the above description sheds enough light on what VLANs are used for. Now is the time to look at some details regarding their configuration.

VLAN Port Types
In order to segment the traffic, the hosts generating it must be assigned to the appropriate VLAN since all ports of the switch are members of VLAN 1 by default. The process of configuring that usually involves three major steps:
  1. Configuring the VLAN number in the switch database (optionally the name of the VLAN and/or other parameters).
  2. Assigning hosts to the VLANs defined in step 1. There are two ways of doing that: either a MAC address can be assigned to a VLAN (dynamic method), or a port of the switch can be assigned to a VLAN (manual method).
  3. Configuring VLAN trunk connections between the switches. Even though this step is optional, the majority of designs out there will need it.
The configuration steps mentioned above define two different port types VLANs can use:
  1. Access Port - this type of port can be a member of ONE VLAN ONLY. If a static port-to-vlan configuration is used, the port interprets all incoming frames as belonging to this specific VLAN. If a mac-address-to-vlan configuration is used, the port determines the VLAN number (ID) for a transmission based on the source MAC address, which is mapped to a specific VLAN.
  2. Trunk Port - which by default belongs to ALL VLANs (1-4094). In other words, this port is capable of sending and receiving traffic coming from different VLANs.
When is the trunk (multi VLAN) port required?

The picture below (Pic. 3) illustrates the need for it.

Pic. 3 - VLAN Port Types
Icons designed by: Andrzej Szoblik - http://www.newo.pl

The grey rectangles symbolize two switches. The colors represent ports assigned to different VLANs. Of course, VLANs in practice use numbers, not colors, to distinguish between themselves. When any bottom computer sends a broadcast (or a unicast towards another computer in the same VLAN/color connected to the upper switch), the port connecting the two switches must be a trunk (multi-VLAN port). In such a situation we must allow all VLAN members to communicate with their peers in the same VLAN, irrespective of where they are located. Both switches have yellow, red and blue members here! And according to the rules, red computers must be able to talk to all red computers located on the same and all other switches (the same goes for yellow-to-yellow and blue-to-blue). They are members of the same Virtual LAN after all.

In such a design, in which members of the same logical network (VLAN) or broadcast domain are connected to different physical switches, the connection between them must be a trunk. Trunk ports run a special protocol called IEEE 802.1q (Cisco also has its own protocol called ISL, details of which are beyond the scope of this tutorial). This protocol is responsible for 'tagging' the frames (injecting extra information into their headers) while sending them out the trunk port.

Why?

Let me explain. Look carefully at Pic. 3 and imagine that a computer connected to the yellow VLAN is sending a broadcast towards all computers that are in the same, yellow, VLAN. The port between the switches is a trunk, and as such allows ALL VLANs in and out. But the problem is that the receiving port on the upper switch, which gets the Ethernet frame, works as a trunk as well. So, this port is also a MULTI-VLAN port! How does this upper, receiving, switch know which VLAN the frame is coming from? Well, on its own it does NOT know whether the VLAN sending this broadcast was yellow, red or blue. This is why the sending (bottom) switch, using the trunk as the outbound port, injects an extra 4 bytes into the Ethernet frame while transmitting it out. This tag contains the VLAN ID (number) of the sender. This way, the broadcast frame carries extra information allowing the receiving switch (the upper one) to recognize which VLAN it is coming from and forward this broadcast to ALL computers in the same VLAN (here the yellow VLAN).


NOTICE!
The TAG is stripped off on the outbound ports configured as ACCESS ones. The tag is useful only on trunk ports.


Before we finish this VLAN overview lesson, let me show you what information this TAG contains.

Pic. 4 - 802.1q TAG

The 802.1q tag is injected between the source MAC address and the type field in the Ethernet II header (Pic. 4). It consists of two fields of two bytes each:
  1. The first two-byte field contains the signature of the 802.1q protocol, the value 0x8100.
  2. The second two-byte field contains:
  • PRI - Class of Service, 3 bits used by QoS,
  • Canonical bit (1 bit) for Token Ring support,
  • VLAN ID, which takes up the 12 least significant bits of the tag.
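
To make the layout of the tag more tangible, here is a small Python sketch that inserts and removes the 4 bytes described above. The function names add_tag and strip_tag are my own, the sample frame is made up, and the frame check sequence (which a real switch recalculates) is ignored.

```python
import struct

TPID_8021Q = 0x8100  # signature of the 802.1q protocol


def add_tag(frame: bytes, vlan_id: int, priority: int = 0, canonical: int = 0) -> bytes:
    """Insert a 4-byte 802.1q tag after the destination and source MAC (12 bytes)."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    # Second field: 3 bits PRI, 1 canonical bit, 12 bits VLAN ID.
    tci = ((priority & 0x7) << 13) | ((canonical & 0x1) << 12) | vlan_id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def strip_tag(frame: bytes):
    """Return (vlan_id, untagged_frame); vlan_id is None if the frame has no tag."""
    if len(frame) < 16:
        return None, frame
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None, frame
    return tci & 0x0FFF, frame[:12] + frame[16:]


# Made-up broadcast frame: destination MAC, source MAC, type 0x0806 (ARP), padding.
untagged = bytes.fromhex("ffffffffffff0050bf9c456a0806") + bytes(46)
tagged = add_tag(untagged, vlan_id=10, priority=5)
print(tagged[12:16].hex())  # the tag itself
print(strip_tag(tagged)[0])  # VLAN ID recovered by the receiving switch
```

The sending trunk port would do something like add_tag, and the receiving switch something like strip_tag before delivering the frame out of an access port.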
    802.1q Native VLAN
    There is one more thing I need to touch upon that is related to the 802.1q trunk port. That is the concept of the native VLAN. The designers of the protocol decided to send frames coming from the so-called 'native VLAN' out the trunk UNTAGGED. In other words, such a frame does not have any tag inserted into the Ethernet header. So, a frame coming from the 'native VLAN' is a regular Ethernet frame. As long as the switches on both ends of the trunk link agree on which VLAN is their 'native VLAN' for this trunk, a frame arriving on the trunk port without a tag is assumed to come from the same native VLAN the sender was transmitting in. The default 'native VLAN' is VLAN 1, since this one cannot be removed from the switch. Probably the reason VLAN 1 is the 'native VLAN' by default is because switches originate frames such as CDP, VTP and STP from this VLAN, and there is no need to tag them as they are switch-to-switch communication only.


    NOTICE!
    As of the time of writing this tutorial, all ports of Cisco switches belong to VLAN 1 by default, which is also the (untagged) 'native VLAN'. Frames from that VLAN are not tagged on trunk-to-trunk connections.


    I am sure you realize what can happen if the two ports connecting the switches use different VLAN IDs for their 'native VLAN'. Yes, that can cause frames to leak between VLANs. And this is a serious security issue. So keep the same 'native VLAN' on the paired trunk ports between switches.
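
The leak is easy to see in a toy model of a trunk link. This is a sketch only; the function names and the VLAN numbers 10 and 20 are assumptions made for illustration.

```python
def send_on_trunk(vlan, native_vlan):
    """Tag carried on the wire: None (untagged) for the sender's native VLAN."""
    return None if vlan == native_vlan else vlan


def receive_on_trunk(tag, native_vlan):
    """VLAN the receiver places the frame in: its own native VLAN if untagged."""
    return native_vlan if tag is None else tag


def cross_trunk(vlan, sender_native, receiver_native):
    """VLAN a frame ends up in after crossing the trunk."""
    return receive_on_trunk(send_on_trunk(vlan, sender_native), receiver_native)


print(cross_trunk(10, sender_native=1, receiver_native=1))    # tagged, stays in VLAN 10
print(cross_trunk(1, sender_native=1, receiver_native=1))     # untagged, stays in VLAN 1
print(cross_trunk(10, sender_native=10, receiver_native=20))  # mismatch: lands in VLAN 20
```

In the last call the sender treats VLAN 10 as native and sends the frame untagged, while the receiver puts every untagged frame into VLAN 20. The frame has leaked from one VLAN to another.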

    In my next post we will look at the same concepts from the command line perspective. I will also introduce VTP protocol as well as Inter-VLAN routing.

    Sunday, September 19, 2010

    Lesson 14 - NTP and Syslog Services

    My previous three posts were a humble attempt to show you some real life networking issues and how to go about them using the skills described so far.

    In this lesson I would like to present two services that are extremely important in the management of your switches and routers: Network Time Protocol and Syslog services. Even though you will not find them in the CCNA curriculum, it is a good idea to know what their role is and how to quickly configure them on your devices.

    System Messages
    If you work as a network admin, it is critical that you collect and analyze system messages sent by switches and routers. By default, IOS sends those important messages to the console port (console 0). You can store them in the switch's or router's memory, but they will be purged if you have a power outage or reboot your device. Also, memory can store only so many of them before the oldest ones begin to be overwritten. We need to redirect them to an external server. One of the popular services used to collect system messages is called a Syslog server. If you are a Windows user you will probably have to pay for such server software (the KIWI server used to be freeware, but I don't know if it still is). Unix and Linux have this service installed by default. All you have to do is set it up correctly, so that it accepts messages from external clients.

    If you want to check how to do it using Ubuntu Linux distribution, please refer to my small Ubuntu notepad at: http://ubuntu-garage.blogspot.com/2010/09/ubuntu-syslog-server.html


    System messages have different levels of severity, as shown below.

    0 - Emergency - System-unusable messages
    1 - Alert - Take immediate action
    2 - Critical - Critical condition
    3 - Error - error message
    4 - Warning - warning message
    5 - Notice - normal but significant condition
    6 - Informational - information message
    7 - Debug - debug messages and log FTP commands and WWW URLs

    As you see, the lower the number, the higher the severity. I'm sure I don't have to tell you that levels 0-3 will need your special attention, do I?

    A system logging message takes the following format:

    timestamp: %<facility>-<severity>-<mnemonic>: <message-text>

    Take a look at such a message as sent by IOS (Pic. 1).

    Pic. 1 - IOS Syslog Message Example.
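
If you collect such messages in a file, the format above is easy to take apart with a regular expression. The Python sketch below is just one way of doing it; the function name is my own and the sample line is an illustrative message typed in by hand, not taken from Pic. 1.

```python
import re

# %<facility>-<severity>-<mnemonic>: <message-text>
IOS_MESSAGE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+): (?P<text>.*)"
)

SEVERITY_NAMES = [
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
]


def parse_ios_message(line):
    """Split one logged line into its fields; return None if it does not match."""
    match = IOS_MESSAGE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    fields["severity_name"] = SEVERITY_NAMES[fields["severity"]]
    return fields


# Illustrative sample line (an assumption, not captured from a device).
sample = "*Sep 19 14:03:18.590: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up"
print(parse_ios_message(sample))
```

A filter such as "show me everything with severity 3 or lower" then becomes a one-line comparison on the severity field.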

    Network Time Protocol (NTP)
    All messages should carry a time stamp. The time of an event allows an administrator to see when things went wrong and to correlate it with other events that might follow. The problem is that low-end Cisco devices do not keep the date and time like computers do. In order for them to keep track of time you must either set the clock manually with the 'clock' command or synchronize their time with some external source. The first method is not recommended because a router or switch loses its time after a reboot. That is why the second method, using NTP, is recommended.

    It is not my intention to give you an in-depth description of NTP and syslog services. Instead, I would like to draw your attention to those services and show you how to set them up quickly.

    NTP server information:

    NTP Server IP = 10.1.1.1
    NTP Password = S3cr3t!!!
    NTP MD5 Key = 1

    Step 1
    Create MD5 key 1 to authenticate with the NTP server.

    R1(config)#ntp authentication-key 1 md5 S3cr3t!!!

    Step 2
    Enable authentication for NTP.

    R1(config)#ntp authenticate

    Step 3
    Tell the router which key it trusts (we have only one but may use more in the future). We do not want to accidentally synchronize the time with some 'fake' server.

    R1(config)#ntp trusted-key 1

    Step 4
    Finally, configure the IP address of the NTP server and specify which key to use for authentication.

    R1(config)#ntp server 10.1.1.1 key 1

    In case you did not use authentication (not recommended), you would be typing in the step 4 line without the 'key 1' argument.


    Verification

    Notice!
    It is recommended that you initially set the clock manually before you allow NTP synchronization. A big time gap between the clocks of your router and the NTP server will make synchronization an extremely long process.


    Step 1 - Check the status of NTP

    R1#show ntp status

    Clock is synchronized, stratum 5, reference is 10.1.1.1
    nominal freq is 250.0000 Hz, actual freq is 250.0001 Hz, precision is 2**18
    reference time is D04096A6.9715EE2B (14:03:18.590 UTC Sun Sep 19 2010)
    clock offset is -7.9613 msec, root delay is 3.83 msec
    root dispersion is 14.74 msec, peer dispersion is 6.74 msec
    R1#
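
If you are curious what the hexadecimal 'reference time' in the output above means, it is an NTP timestamp: the part before the dot counts seconds since January 1, 1900 (UTC), and the part after the dot is a 32-bit binary fraction of a second. A short Python sketch (the function name is my own, and it assumes the current NTP era, which lasts until 2036) decodes it:

```python
from datetime import datetime, timedelta, timezone

NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def decode_ntp_timestamp(hex_timestamp: str) -> datetime:
    """Convert 'SSSSSSSS.FFFFFFFF' (hex seconds, hex fraction) to a UTC datetime."""
    seconds_hex, fraction_hex = hex_timestamp.split(".")
    seconds = int(seconds_hex, 16)
    # The fraction is in units of 1/2**32 second; convert it to microseconds.
    microseconds = (int(fraction_hex, 16) * 1_000_000) >> 32
    return NTP_EPOCH + timedelta(seconds=seconds, microseconds=microseconds)


print(decode_ntp_timestamp("D04096A6.9715EE2B"))
```

The result agrees with the human-readable time IOS prints in parentheses next to the hexadecimal value (14:03:18.590 UTC Sun Sep 19 2010).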

    Step 2 (optional) - Check NTP association.

    R1#show ntp association

          address         ref clock     st  when  poll reach  delay  offset    disp
    *~10.1.1.1         127.127.7.1       4     9    64  377     5.6    4.22    13.4
     * master (synced), # master (unsynced), + selected, - candidate, ~ configured
    R1#

    Step 3 (optional) - Check NTP association details.

    R1#show ntp association detail

    10.1.1.1 configured, authenticated, our_master, sane, valid, stratum 4
    ref ID 127.127.7.1, time D04097CC.0209500C (14:08:12.007 UTC Sun Sep 19 2010)
    our mode client, peer mode server, our poll intvl 64, peer poll intvl 64
    root delay 0.00 msec, root disp 0.03, reach 377, sync dist 10.239
    delay 7.72 msec, offset 5.1799 msec, dispersion 6.35
    precision 2**24, version 3
    org time D04097E6.97A012CA (14:08:38.592 UTC Sun Sep 19 2010)
    rcv time D04097E6.98D73524 (14:08:38.597 UTC Sun Sep 19 2010)
    xmt time D04097E6.905799B4 (14:08:38.563 UTC Sun Sep 19 2010)
    filtdelay =    33.02    7.72   15.73   22.32    5.65   27.62   23.62   15.66
    filtoffset =   11.77    5.18   13.89   23.08    4.22    7.94   12.65    3.32
    filterror =     0.02    0.99    1.97    2.94    3.92    4.90    5.87    6.85

    Syslog Server Configuration


    Syslog Server Information:
    IP address = 192.168.1.2
    Facility = Local7

    R1 Configuration:

    R1(config)#logging host 192.168.1.2 
    R1(config)#logging facility local7


    From now on, all system messages are going to be sent to syslog server with ip address 192.168.1.2.
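
On the server side, the facility and the severity travel together in a single number at the very beginning of each syslog packet, the so-called PRI value, calculated as facility code times 8 plus severity. The sketch below shows the arithmetic, assuming the standard syslog facility codes (local0 = 16 through local7 = 23); the function names are my own.

```python
# Standard syslog facility codes for the 'local' facilities.
FACILITY_CODES = {f"local{n}": 16 + n for n in range(8)}


def syslog_pri(facility: str, severity: int) -> int:
    """PRI value placed in angle brackets at the start of a syslog packet."""
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return FACILITY_CODES[facility] * 8 + severity


def split_pri(pri: int):
    """Inverse operation: return (facility_code, severity)."""
    return divmod(pri, 8)


print(syslog_pri("local7", 5))  # a Notice message sent with facility local7
print(split_pri(189))
```

This is why the facility configured on the router must match what your syslog server expects: the server typically uses it to decide which file a message goes to.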

    In my next lesson, I'm going to introduce another layer 2 technology: Virtual LANs (VLANs).

    Saturday, September 18, 2010

    Lesson 13 - Layer 2 Connectivity Troubleshooting Part 3

    This lesson is the last one in the series on how to troubleshoot connectivity issues at the layer 2 of OSI model. Bear in mind, that these do not involve layer 2 technologies such as VLANs or Spanning-Tree Protocol, since we have not talked about those yet.

    NOTICE
    The steps presented in this lesson are not ALL the possible diagnostics you can do. And they do not have to be done in this specific order. I am merely listing some logical steps which might be useful in order to 'nail down' the root cause of the problem.


    Trouble Ticket 3
    A new installation (Pic. 1) shows a lack of connectivity between the two computers. The initial diagnostics revealed the following facts:
    • Switches are connected via a fibre optic cable and the ports show proper status (interface is up, line protocol is up).
    • Computers have proper addresses and subnet masks assigned.
    • Cables connecting computers with switches have been tested and proved to be working correctly.
    • Computers reply to echo request packets (firewalls disabled).
    The technician who set up this new network calls you for help.

    Pic. 1 - New design with connectivity problem
    Icons designed by: Andrzej Szoblik - http://www.newo.pl


    In dealing with this trouble ticket we are going to use the tools collected in the previous lessons to resolve the issue.

    Step 1
    First let's try to 'divide and conquer' (a concept mentioned in lesson 11) by sending ping packets from PC1 to PC2. Before we do that, though, we need to purge the existing ARP cache on PC1 and PC2 so that we work with fresh information. We do that by opening a command line window and typing: arp -d host-address (Linux), or arp -d (MS Windows).
    Test results:
    • The pings timed out. No reply from PC2.
    Step 2
    Since ARP cache entries age out relatively quickly (depending on which operating system you use), we need to check quickly what the caches contain. Alternatively, we can send a long series of ping packets.
    Test results:
    • PC1 does NOT contain the expected mac-to-ip mapping. We expected to see
      192.168.1.2 at 00:1e:4f:b0:b2:fc. It is not there, though.
    • PC2 DOES contain the right mac-to-ip mapping. It shows the following:
      192.168.1.1 at 00:50:bf:9c:45:6a
    These are very interesting results, don't you think? Before we take the next step let's gather what we know so far.

    Since the ping was initiated by PC1, it sent its ARP request broadcast message and that query must have been delivered to PC2. We can conclude that, based on the fact that we have cleared PC2's ARP cache in step 1, and it has the proper mapping now. PC2 did receive ARP request from PC1 and learned what MAC and IP address it uses. 

    Step 3
    We could omit that step, but we're curious if PC2 replies to the ARP request from PC1. We launch our 'wireshark' tool on PC2, ping again from PC1 to PC2 and capture all packets on PC2. What we discover in this packet trace is that PC2 has replied to ARP request with proper ARP reply unicast message back to PC1.

    Step 4
    Clearly, something between the computers (the switches) does not work properly. It seems that we have some sort of unidirectional communication. What we need to establish is where this unidirectional communication is taking place. We log in to SW1 and SW2 and issue the following commands ('x' here stands for the switch number in Pic. 1):

    SWx#show mac address-table interface f0/1
    SWx#show mac address-table interface f0/24

    Test results:
    • SW1 learns source MAC address of PC1 (00:50:bf:9c:45:6a) on its Fa0/1 interface. This is expected.
    • SW1 does NOT learn MAC address of PC2 (00:1e:4f:b0:b2:fc) on its Fa0/24 interface. This is unexpected. It should learn it from the ARP reply sent by PC2.
    • SW2 learns source MAC address of PC2 (00:1e:4f:b0:b2:fc) on its Fa0/1 interface. This is expected.
    • SW2 learns source MAC address of PC1 (00:50:bf:9c:45:6a) on its Fa0/24 interface. This is expected.
    This way, we have discovered that SW1 has a unidirectional link towards SW2 (SW2 sends frames towards SW1 but the latter does not seem to receive them). Probably, the fiber optic connection does not work properly (grease, dirt, a broken strand etc.).
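
The reasoning used in step 4 can be written down as a tiny Python check. It is only a sketch of the logic, with the test results above typed in by hand as dictionaries; the function name is my own. A switch learns a remote host's MAC address on its uplink only if frames from the far side actually arrive.

```python
PC1 = "00:50:bf:9c:45:6a"
PC2 = "00:1e:4f:b0:b2:fc"

# MAC addresses each switch learned per interface (from the test results above).
sw1_table = {"Fa0/1": {PC1}, "Fa0/24": set()}
sw2_table = {"Fa0/1": {PC2}, "Fa0/24": {PC1}}


def receives_from_far_side(mac_table, uplink, remote_host_mac):
    """True if the switch learned the remote host's MAC on its uplink port."""
    return remote_host_mac in mac_table.get(uplink, set())


print("SW1 -> SW2 works:", receives_from_far_side(sw2_table, "Fa0/24", PC1))
print("SW2 -> SW1 works:", receives_from_far_side(sw1_table, "Fa0/24", PC2))
```

The first check succeeds and the second fails, which points at the direction from SW2 to SW1 as the broken one.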

    One more time, this lesson illustrates how useful the commands and knowledge described in the previous posts can be in real life scenarios.

    In my next post, I will show you how to log system messages so they can be analyzed later. System messages are invaluable pieces of information in the process of troubleshooting networking issues.

    Lesson 12 - Layer 2 Connectivity Troubleshooting Part 2

    In the previous lesson I attempted to show you how to go about the performance problems we might experience due to a duplex mismatch. It was our first trouble ticket in the series on layer 2 problems.

    In this lesson I am going to create another issue to show you how your current skills can be practically useful in diagnosing network connectivity problems.


    NOTICE
    The steps presented in this lesson are not ALL the possible diagnostics you can do. And they do not have to be done in this specific order. I am merely listing some logical steps which might be useful in order to 'nail down' the root cause of the problem.


    Trouble Ticket 2
    Some client computers have problems accessing FTP server located in the same network 192.168.1.0/24. This behavior is very intermittent.

    We need to collect a bit more data related to this problem. A good starting point might be to take down the following pieces of information:


    FTP Server
    IP address  = 192.168.1.4
    MAC address = 00:10:5a:d3:e4:e0

    FTP Client1 (with no connectivity to FTP server)
    IP address = 192.168.1.2
    MAC address = 00:10:4f:b0:b2:fc

    Here is one way of doing basic diagnostics.

    Step 1
    We want to make sure we have layer 1-3 connectivity between the Client1 and the server first (remember 'divide and conquer' method from the previous lesson?).

    jr@mandala:~$ ping 192.168.1.4
    PING 192.168.1.4 (192.168.1.4) 56(84) bytes of data.
    64 bytes from 192.168.1.4: icmp_seq=1 ttl=64 time=0.372 ms
    64 bytes from 192.168.1.4: icmp_seq=2 ttl=64 time=0.349 ms
    64 bytes from 192.168.1.4: icmp_seq=3 ttl=64 time=0.371 ms
    64 bytes from 192.168.1.4: icmp_seq=4 ttl=64 time=0.374 ms

    It seems we have layer 3 connectivity working just fine.

    Here's an interesting question for you. How do you know that you have received the replies from the FTP server in question (192.168.1.4)?

    Now, you may have a very confused expression on your face, as in 'what do you mean?'. Have I not just gotten the reply from it?

    Well, let me show you something to address this question.

    Step 2
    In order to be absolutely sure I got the reply from the FTP server (192.168.1.4) I need to verify my computer's ARP cache (Client1). Recall that every time a host sends packets, it encapsulates them in layer 2 frames (here Ethernet ones). In order to do that, the client must have a valid destination MAC address (00:10:5a:d3:e4:e0) mapped to the server's IP address (192.168.1.4). If this mapping entry is not found in the ARP cache, Client1 will send an ARP request to learn the MAC address of 192.168.1.4. Knowing that, let's check the ARP cache after we have sent the ping packets. Here's what we find (Pic. 1):

    Pic. 1 - ARP Cache Entries
    Pay attention to the highlighted entry. Is the MAC address mapped to our FTP server's IP address correct? 

    NO! 

    This MAC address does not belong to the server. FTP server's MAC address is: 00:10:5a:d3:e4:e0.

    That test proved that receiving echo replies to our echo requests (ping) is not enough. The result must also be verified in the sender's ARP cache.
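
This verification can be automated with a few lines of Python once you have the ARP cache in hand. In the sketch below the cache is typed in by hand as a dictionary using the values from this ticket; reading it from the operating system is left out, and the function names are my own.

```python
def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and drop ':', '-' and '.' separators."""
    return "".join(ch for ch in mac.lower() if ch not in ":-.")


def check_arp_entry(arp_cache, ip, expected_mac):
    """Compare the cached MAC for an IP with the one we expect.

    Returns 'ok', 'mismatch' or 'missing'.
    """
    observed = arp_cache.get(ip)
    if observed is None:
        return "missing"
    if normalize_mac(observed) == normalize_mac(expected_mac):
        return "ok"
    return "mismatch"


# Client1's ARP cache as observed after the ping (entry from this ticket).
client1_cache = {"192.168.1.4": "00:10:0f:a3:3b:e6"}

print(check_arp_entry(client1_cache, "192.168.1.4", "00:10:5a:d3:e4:e0"))
```

A 'mismatch' result means the echo replies came from a device other than the one you intended to test.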

    So, which device does this MAC address 00:10:0f:a3:3b:e6 belong to? And how on earth did it end up in our Client's ARP cache?

    Well, to answer the second question first: the reasons for this wrong mapping might be different. There could be a device with a duplicate IP address (the same as the FTP server's). Then, when the clients send an ARP request for the MAC address of the server, this 'rogue' device (not the legitimate FTP server) also responds to the query. And if its answer arrives later than the one from the legitimate server, it overrides the legitimate MAC address mapping in the ARP cache. If some other clients get the reply to their ARP query from the 'rogue' device first and then from the FTP server, they create the mapping correctly. That would explain why some computers can still FTP to the server and others can't. Another reason for these wrong mac-to-ip mappings might be an ARP poisoning attack in the network (e.g. using the Ettercap tool).
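
The 'last reply wins' behavior described above can be modeled in a few lines of Python. This sketch assumes the client simply overwrites its cache entry with every ARP reply it receives; real operating systems may behave differently, and the function name is my own.

```python
SERVER_MAC = "00:10:5a:d3:e4:e0"  # legitimate FTP server
ROGUE_MAC = "00:10:0f:a3:3b:e6"   # device with the duplicate IP address


def arp_cache_after(replies):
    """Build an ARP cache from (ip, mac) replies in arrival order.

    Each reply overwrites the previous entry for the same IP address.
    """
    cache = {}
    for ip, mac in replies:
        cache[ip] = mac
    return cache


ip = "192.168.1.4"
unlucky_client = arp_cache_after([(ip, SERVER_MAC), (ip, ROGUE_MAC)])
lucky_client = arp_cache_after([(ip, ROGUE_MAC), (ip, SERVER_MAC)])
print(unlucky_client[ip] == SERVER_MAC)  # False: this client cannot reach the server
print(lucky_client[ip] == SERVER_MAC)    # True: this client works fine
```

The arrival order depends on things like distance and load, which is why the problem looks so intermittent.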

    As for the answer to the first question, if this is not an ARP poisoning attack, you can use your knowledge from lesson 9 about switching, which helps you understand how the CAM table is created. Then, send a continuous ping towards the 'rogue' device and trace its MAC address from switch to switch to find out where the device with the duplicate IP address is located. The MAC address table should help you find it relatively quickly.

    There might be one more question I would like to clarify. Why was the duplicate address not detected by the 'rogue' device? The answer to that question may surprise you. Duplicate address detection uses the so-called 'gratuitous ARP'. This is an unsolicited advertisement of the MAC address upon computer startup. If another computer already uses the same MAC or IP address, the newly started computer cannot use its IP in the network. Unfortunately, some operating systems may not use this mechanism. They 'trust' what the operator does. If she or he wants this address, there will be no protest on their part.

    I hope you are now beginning to see how much you already know!

    In my next lesson you are going to analyze the third trouble ticket for layer 2 connectivity. This is going to be the last troubleshooting lesson. The upcoming lessons will describe other tools and interesting technologies such as VLANs, Spanning-Tree Protocol. Once we finish with foundations related to layer 2, we'll start discussing layer 3 technologies and concepts such as IP addressing, subnetting, routing etc.

    Friday, September 17, 2010

    Lesson 11 - Layer 2 Connectivity Troubleshooting Part 1

    In the last two lessons I tried to explain the foundations related to layer 2 operation. I discussed the very important switching process and the CDP protocol, which comes in handy at times. If you have also watched the video in lesson 10, you got a glimpse of a few commands explained in the theory earlier on.

    In this lesson, we'll focus on practical application of the layer 1 and layer 2 concepts which could be helpful in troubleshooting networking issues.

    Even though we're capable of creating lots of great things we're still human beings. And 'to err is human' adage is as conspicuous in the networking field as anywhere else. This means, that occasionally things won't work as expected. In such situations we need to be able to isolate and fix the problems quickly.

    Reactive troubleshooting almost always uses the following work flow:
    1. A problem is reported.
    2. Data and facts are collected.
    3. Data analysis must be performed.
    4. Potential causes are eliminated.
    5. Hypothesis is drawn.
    6. Hypothesis is verified. 
    In some situations the steps 3 and 4 could be omitted. But it depends on the nature of the problem reported, severity of issue, skills of the technician etc.

    As we have yet to learn layer 2 technologies such as VLANs and Spanning-Tree Protocol, which add complexity to the troubleshooting process, let us, for now, focus on simple cases. This way we're going to build our troubleshooting skills step by step, given the knowledge we possess.

    Recall the process of moving data from one computer to another. Everything sent from the application layer goes down to the physical layer. That teaches us one important thing: if the layer 1 is not operational, nothing else will work!

    So, we could start diagnosing networking problems by checking the layer 1 connectivity first. Then, we could move on to layer 2 and work our way up till we isolate the problem. This makes perfect sense. However, a lot of technicians use another method known as 'divide and conquer'. In networking diagnostics that might mean checking the layers in the middle first. For instance, using the 'ping' utility we can check the layer 1 through layer 3 status as a starting point. Of course, you already know how 'ping' works, don't you? 'Ping' uses the ICMP protocol, which is a layer 3 messaging mechanism encapsulated directly in IP packets.


    NOTICE
    The ping utility is a layer 3 reachability checking mechanism. It sends a number of ICMP echo messages to the destination host. If the destination host receives the echo messages, it sends ICMP echo reply messages back to the sender. Assuming that there are no mechanisms implemented along the path that would filter those messages (like a local firewall for instance), the sender should receive those replies, which effectively checks layer 1 through layer 3 reachability. Thus, the path is verified in BOTH directions.


    If you aim at the layer 3 reachability using 'ping' utility, you may get one of the two results:
    1. You get the reply from the destination. This leads to a conclusion that all layers 1 through 3 are working properly. And the connectivity problems might be related to upper layers (layer 4 upward).
    2. You do not get the reply from the destination. In that case, this step is not enough to determine the nature of the issue and you must perform some additional diagnostics.
    Now, I am going to show you a few of such steps.

    NOTICE
    The steps presented in this lesson are not ALL the possible diagnostics you can do. And they do not have to be done in this specific order. I am merely listing some logical steps which might be useful in order to 'nail down' the root cause of the problem.


    Trouble Ticket 1
    Data transfer from PC1 toward PC2 is very slow (refer to Pic 1).


    Pic. 1 - Simple Topology
    Icons designed by: Andrzej Szoblik - http://www.newo.pl


    As you see, without the topology diagram, it is much more challenging to do diagnostics. If we know the topology and the path between the source and destination devices in question, we can focus on all components that participate in the data transmission and isolate the issue.

    First, let us determine what we know.


    The transmission succeeds, but is slow. Verify that yourself. Do not assume it's true or properly tested by the person who reported that. Most users cannot properly describe the nature of the problems.

    So, you have done the tests and the transfer proves to be slow indeed.


    Some questions you might ask:
    • Has this problem occurred recently?
    • Is this a new client computer or destination server?
    • Has any configuration change, update, cable replacement or other change been made on any device in the path before the problem manifested itself?
    • Are other computers experiencing the same problem, or is this an individual case?
    • etc.
    Further data and fact collection may depend on the answers to those questions. For argument's sake, let's assume that the server is a brand new computer installed the day before the problem occurred. All clients sending files to the server suffer from slow data transfer. When you connect the same client to the server directly (computer-to-server through a crossover cable), the transfer is very fast.


    Given those facts, we're going to take a quick look at the status of the interface where our server computer is connected. The command that is very useful to check both the layer 1 and layer 2 status is:

    SW2#show int f0/2 

    Pic. 2 - show interface command

    This output deserves some explanation.
    FastEthernet is up,
    This is the status of the layer 1 connectivity (your cable seems to be attached right?).

    line protocol is up
    This is the status of layer 2. It looks like the keepalive packets sent every 10 seconds are working back and forth (do not trust it entirely though; look at trouble ticket 2 in the next lesson).

    half-duplex, 100Mb/s
    The negotiated duplex is HALF. Most network adapters used in computers use AUTO negotiation. This means that the NIC (Network Interface Card, aka network adapter) sends special signals to the port of the switch, trying to negotiate FULL duplex and the highest speed both sides support. Unfortunately, some NIC manufacturers do not follow the specification regarding this signaling. This causes some "misunderstanding" between the port of the switch and the NIC. The switch typically falls back from FULL to HALF duplex. That causes the switch port to enable the Carrier Sense Multiple Access with Collision Detection mechanism (CSMA/CD), which is used on SHARED, not dedicated, connections (for details look at lesson 8).
    What we end up with is the NIC working in FULL duplex while the port of the switch runs HALF duplex. The port 'thinks' it can either send or receive data but not do both at the same time. When the server begins to 'push' data across the network, the port of the switch cannot send anything out towards the server. When it finally 'thinks' it can (medium free) and sends data towards the server, the latter begins to send data towards the port, as it is allowed to on FULL duplex connections. The switch must stop immediately, as it treats this as a collision. At least that is how the switch works under the circumstances. Then it waits until the carrier is free again (no data from the server arriving on the port). This problem is known as a DUPLEX MISMATCH. It results in a great number of collisions and late collisions recorded on the port of the switch, as shown in the output above.
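If you collect this output from many ports, the symptom described above (half duplex together with late collisions) can be checked automatically. Below is a minimal Python sketch of the idea. The sample text is hand-written in the style of 'show interface' output and is NOT the output from Pic. 2. The function names are made up for this example.

```python
import re

# Hand-written fragment in the style of 'show interface' output (illustrative only).
SAMPLE_OUTPUT = """\
FastEthernet0/2 is up, line protocol is up (connected)
  Half-duplex, 100Mb/s, media type is 10/100BaseTX
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 output errors, 1520 collisions, 2 interface resets
     0 babbles, 310 late collision, 0 deferred
"""

def parse_interface(text):
    """Extract the duplex setting and the collision counters from the output text."""
    duplex = re.search(r"(half|full)-duplex", text, re.IGNORECASE)
    collisions = re.search(r"(\d+) collisions", text)
    late = re.search(r"(\d+) late collision", text)
    return {
        "duplex": duplex.group(1).lower() if duplex else None,
        "collisions": int(collisions.group(1)) if collisions else 0,
        "late_collisions": int(late.group(1)) if late else 0,
    }

def suspect_duplex_mismatch(stats):
    """Half duplex plus late collisions is the symptom described in the text."""
    return stats["duplex"] == "half" and stats["late_collisions"] > 0

stats = parse_interface(SAMPLE_OUTPUT)
print(stats, suspect_duplex_mismatch(stats))
```

A positive result is only a hint. The port at the other end of the link still has to be checked before you conclude that the duplex settings differ.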

    The solution to that problem could be the following actions:
    1. Try to upgrade the NIC driver using your server/computer vendor's web site.
    2. If the problem persists, you may try to hard code speed and duplex. You have to do this on both ends of the connection. This disables the AUTO NEGOTIATION feature (do not listen to people who say you should do this on one end of the connection only).
    3. Sometimes, though very unlikely in our situation, the cable can cause that sort of behavior. Replacing it with one that is proven to be good might help. Then again, with a bad cable we would typically see some other layer 1 errors (CRC errors, carrier loss).
    4. Replace the NIC in the server with one that you are sure is working well.
    Hard Coding Speed and Duplex

    SW2#configure terminal
    SW2(config)#interface fastethernet0/2
    SW2(config-if)#speed 100
    SW2(config-if)#duplex full

    As for the computer, you have to refer to the manual of your operating system to find out how to set speed and duplex on the NIC manually. Perhaps google this. Google is the best!

    In the next lessons, we will resolve two more connectivity issues given the skills we obtained in previous lessons.

    Thursday, September 9, 2010

    Lesson 10 - Cisco Discovery Protocol

    In the previous lesson we explored how switches build their MAC address table (aka Content Addressable Memory). It is critical to understand those concepts in order to troubleshoot connectivity issues.

    In this lesson we'll continue studying layer 2 technologies. Today's theme is Cisco Discovery Protocol. This protocol comes in handy in many situations (trust boundary for Cisco IP Phones, auto qos and others).

    What is Cisco Discovery Protocol?
    CDP is a Cisco proprietary layer 2 protocol. It is enabled by default on the majority of Cisco devices, including IP phones. It can work on any connection supporting SNAP (such as LANs, but also ATM and Frame Relay). The only time you'll see CDP turned off by default is on interfaces configured for Frame Relay.

    What does CDP do?

    Every Cisco device using this protocol reports information about itself by advertising special packets out of all its active interfaces. The important pieces of information it advertises include its:
    • Hostname
    • Platform
    • Ports where CDP packets are advertised
    • IOS version
    • IP address
    CDP can help an administrator discover the connected Cisco devices and create a topology diagram or prepare an inventory of the gear used. It can also be an additional tool for troubleshooting problems in the network. Working as a support technician, I have found it useful numerous times.

    Let's take a quick tour through the CLI (command line interface) and see what major commands CDP allows us to use and what they show.

    I'm connected to my Cisco switch SW1 and, in privileged mode, type in the following command:

    Pic. 1

    Using our best friend '?' we can see the CDP options. The last line, '<cr>', stands for 'carriage return', good old-fashioned terminal lingo for 'press enter'. Let's try this first.

    Pic. 2

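    Based on the output (Pic. 2) we see that CDP version 2 is enabled and that the information packets (though technically they should be called frames) are sent every 60 seconds. We also learn that SW1 advertises a holdtime of 180 seconds. This value tells its neighboring Cisco devices how long to keep SW1's CDP information if no fresh packet arrives. With default settings on both sides, SW1 keeps its neighbors' information for the same 180 seconds. Let's look at another CDP command: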
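To see how the 60-second timer and the 180-second holdtime work together, here is a small Python sketch of a neighbor table that ages entries out. It is a simplified model and not how IOS implements it. The class and method names are invented for this example, and time is passed in as plain seconds so the behavior is easy to follow.

```python
DEFAULT_HOLDTIME = 180  # seconds, the default holdtime mentioned above

class NeighborTable:
    """Toy model of a CDP neighbor table with holdtime-based aging."""

    def __init__(self):
        self._expires = {}  # device name -> time (in seconds) when the entry expires

    def advertisement_received(self, device, now, holdtime=DEFAULT_HOLDTIME):
        # The holdtime travels inside the advertisement, so the sender decides
        # how long the receiver keeps the entry. Each new packet restarts the clock.
        self._expires[device] = now + holdtime

    def neighbors(self, now):
        # Only entries whose holdtime has not run out are still considered valid.
        return sorted(d for d, expiry in self._expires.items() if expiry > now)

table = NeighborTable()
table.advertisement_received("R1", now=0)
table.advertisement_received("R1", now=60)   # refreshed one timer interval later
print(table.neighbors(now=239))              # entry still valid
print(table.neighbors(now=240))              # 180 s after the last packet: aged out
```

Because the holdtime is three times the advertisement interval, a neighbor has to miss several packets in a row before it disappears from the table.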

    Pic. 3

    In Pic. 3 we can see traffic statistics such as CDP packets sent and received, any CDP encapsulation problems etc.

    The command below (output in Pic. 4) will tell you which interfaces CDP is running on.
    SW1#show cdp interface  
     
    You can disable CDP on a specific interface or group of interfaces. For instance, if you do not want to run CDP on Fas0/1 interface, you could use the following command:
    SW1#configure terminal
    SW1(config)#interface Fas0/1
    SW1(config-if)#no cdp enable

    If you want to disable CDP on a group of interfaces you can use 'interface range' command. For instance, disabling CDP on Fas0/1, Fas0/2, Fas0/5 and Fas0/8 would look like this:
    SW1#configure terminal
    SW1(config)#interface range fas0/1 - 2 , fas0/5 , fas0/8
    SW1(config-if-range)#no cdp enable

    NOTICE
    If you use the 'interface range' command, consecutive ports can be specified with '-', but make sure you put a space before and after the '-' (fas0/1 - 2). The same applies to non-consecutive ports (fas0/5 , fas0/8): there is a space before and after the comma ','.
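If you want to double-check which ports a range actually covers before you paste it into the switch, the syntax can be expanded with a short Python helper. This is a sketch that only understands the 'name module/port' style used in this lesson. The function name is made up and it is not part of any Cisco tool.

```python
import re

def expand_interface_range(spec):
    """Expand e.g. 'fas0/1 - 2 , fas0/5' into a list of single interface names."""
    interfaces = []
    for part in spec.split(","):
        part = part.strip()
        # prefix such as 'fas0/', first port number, optional '- last' port number
        match = re.fullmatch(r"([A-Za-z]+\d+/)(\d+)(?:\s*-\s*(\d+))?", part)
        if match is None:
            raise ValueError(f"unrecognised interface range part: {part!r}")
        prefix = match.group(1)
        first = int(match.group(2))
        last = int(match.group(3)) if match.group(3) is not None else first
        if last < first:
            raise ValueError(f"range runs backwards: {part!r}")
        interfaces.extend(f"{prefix}{port}" for port in range(first, last + 1))
    return interfaces

print(expand_interface_range("fas0/1 - 2 , fas0/5 , fas0/8"))
```

The helper is more forgiving about spaces than the switch is, so keep typing the spaces on the CLI as described in the notice.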


    As you will see later, CDP discloses some vital information (e.g. the IOS version), so for security reasons you may decide to turn off CDP altogether. Be careful before you do that though, as some applications may rely on this protocol. Disabling CDP can cause cascading problems in your network. The command which disables CDP completely (on all interfaces) is:
    SW1#configure terminal
    SW1(config)#no cdp run

    Pic. 4

    Now, let's see what neighboring devices SW1 discovered by listening to their CDP packets (Pic. 5).

    Pic. 5 

    Dissecting the Pic. 5 output we learn the following:
    • SW1 received CDP packets from the device named 'R1' (hostname).
    • This CDP packet was sent from R1's Fas0/0 interface (the last column 'Port ID').
    • SW1 received this CDP information packet on its Fas0/1 local interface ('Local Intrfce').
    • This leads us to a conclusion that R1's Fas0/0 interface is directly connected to SW1's Fas0/1 interface.
    • The neighbor R1 is a router, as the capability list shows 'R S I' (R=router, S=switching capability, I=IGMP support).
    • R1 is a 2611XM platform.
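The same dissection can be done by a script, which is handy when you build an inventory from many devices. Below is a minimal Python sketch that splits one row of 'show cdp neighbors' output into its columns. The sample row is hand-written from the values discussed above. The holdtime of 162 is an assumed value and the function name is made up.

```python
import re

# Capability codes mentioned in this lesson (the real legend is longer).
CAPABILITIES = {"R": "router", "S": "switching capability", "I": "IGMP support"}

# Hand-written row modeled on the Pic. 5 discussion; the holdtime value is assumed.
SAMPLE_ROW = "R1               Fas 0/1            162        R S I      2611XM    Fas 0/0"

ROW_PATTERN = re.compile(
    r"^(\S+)\s+"                    # Device ID
    r"(\S+ \S+)\s+"                 # Local Intrfce, e.g. 'Fas 0/1'
    r"(\d+)\s+"                     # Holdtme in seconds
    r"((?:[A-Za-z] )*[A-Za-z])\s+"  # Capability letters separated by single spaces
    r"(\S+)\s+"                     # Platform
    r"(\S+ \S+)$"                   # Port ID on the neighbor
)

def parse_neighbor_row(row):
    """Split one neighbor row into named fields."""
    match = ROW_PATTERN.match(row.strip())
    if match is None:
        raise ValueError(f"row does not look like a CDP neighbor entry: {row!r}")
    device, local_if, holdtime, caps, platform, port_id = match.groups()
    return {
        "device_id": device,
        "local_interface": local_if,
        "holdtime": int(holdtime),
        "capabilities": [CAPABILITIES.get(code, code) for code in caps.split()],
        "platform": platform,
        "port_id": port_id,
    }

print(parse_neighbor_row(SAMPLE_ROW))
```

Real output can wrap long device names onto a second line, which this sketch does not handle.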
    That's not all by any means. There is another command we can use to obtain more information about R1. Take a look at Pic. 6.

    Pic. 6

    It shows you information about the IOS version running on R1, as well as the IP address 192.168.10.254 configured on its Fas0/0 interface. Now you understand why you might consider turning off CDP on some interfaces. You do not want to show such details to a third party company (like your service provider) that connects to your devices.

    Instead of using 'show cdp entry R1', you can also use the following command, which displays detailed output (similar to Pic. 6) about ALL discovered neighbors:
    SW1#show cdp neighbor detail

    In lesson 11, you will learn the commands related to the switch mac-address-table, which so far has been covered in theory only (lesson 9). Also, we'll hone all the skills we have obtained so far. Things will begin to fall into place. At least that's my hope.

    A practical application of the last two lessons below.