Tuesday, December 25, 2018

RSTP/MSTP Part V: How Does Switching Take Place in RSTP and MSTP?



 Dear friends of the Telecom Fraternity,

I wrote a series on RSTP in 2013, and there is a lot to catch up on. I would like to draw your attention to the blog post that was the fourth part of that series.


It is from that post that we continue our journey into this vast topic called RSTP. Generally, people believe that RSTP is for switching, but as I clarified before, it is a loop avoidance mechanism. What RSTP does in case of a failure is change the active path, and that path transition is what makes the switching of services noticeable.

In this section, we will concentrate on a simple architecture: the RSTP ring. As explained before, we optimize the ring by selecting the optimal blocking port for the services.
Figure-1: RSTP ring topology example


Now, in this ring, let us assume there is a service from the Root Bridge to N-2 with SVLAN 200 and another service from the Root Bridge to N-3 with SVLAN 300. This is shown in the figure below.
Figure-2: Service configuration in the RSTP domain


To understand the concept of service switching, let us look at a failure scenario. In our case we imagine that the link between N-1 and N-2 has failed. The service will definitely be routed from the other direction, but we will go through this step by step. Please remember that service switching in RSTP is not as simple as it looks, because there are no predefined main and protection paths like you have in TDM or MPLS. Here the entire switching of the service from one direction to the other works broadly on two principles.
  1. RSTP re-convergence
  2. MAC learning renewal at all switching points

We will see all of this happening, but you have to remember that it all happens very quickly. Typically the re-routing time of the services in the event of a failure in RSTP is 200ms. Note that this is not 50ms, and that is why it is not recommended to run voice services, or any real-time services involving voice, in an RSTP network. This is the reason why RSTP is regarded as a non-carrier-grade method. However, for a normal HTTP or HTTPS service it does not matter, because TCP retransmits whatever is lost, and so RSTP works very well.

Let us look at the figure below to understand the failure scenario.

Figure-3: Failure occurrence in the link



After this failure has occurred, the first thing that happens is that N-2 stops getting BPDU packets from N-1 (its designated bridge). So a root port transition takes place, and the link between N-2 and N-3 becomes a forwarding link. One special thing to remember here is that on the N-2 to N-3 link, which is the blocking link, only one of the two ports will be the discarding port. RSTP will not have two discarding ports on this link. So we have two cases here.

1. If the discarding port is on N-2 and the root port on N-2 fails, the Topology Change handling of RSTP immediately comes into action and the discarding port on N-2 turns to forwarding.

2. If the discarding port is on N-3, the TC message is communicated from N-2 to N-3, and N-3 changes its port from discarding to forwarding.
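
The two cases above can be written down as a small decision sketch. This is only an illustration in Python using the node names from the figures; the function name `port_to_unblock` and the shape of what it returns are my own assumptions and are not part of any RSTP implementation.

```python
def port_to_unblock(discarding_on, failed_root_port_on="N-2", ring_neighbor="N-3"):
    """Illustrative only: which bridge opens its discarding port after the
    root port on `failed_root_port_on` fails, and whether a TC message has
    to cross the N-2/N-3 link before that can happen."""
    if discarding_on == failed_root_port_on:
        # Case 1: the bridge that lost its root port also owns the discarding port.
        return {"bridge": failed_root_port_on, "needs_tc_to_neighbor": False}
    if discarding_on == ring_neighbor:
        # Case 2: the neighbor owns the discarding port and acts on the TC message.
        return {"bridge": ring_neighbor, "needs_tc_to_neighbor": True}
    raise ValueError("the discarding port must be on one end of the blocked link")


print(port_to_unblock("N-2"))
print(port_to_unblock("N-3"))
```

In both cases exactly one port changes state, which is the point of having only one discarding port on the blocked link.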

Here the critical part is the topology change notification (TCN) message that is carried by the BPDU, and this always happens after a minimum hold-off time, which is 200ms. This is where STP and RSTP differ. In STP there is a wait of three hello intervals, which delays the initiation of the TCN and results in a switching time of more than 3 seconds. In RSTP (Rapid Spanning Tree Protocol), however, TCNs are triggered by port transitions in any switch. Therefore N-2 and N-1 will both see port transitions and will initiate a TCN immediately after the hold-off timer expires.

After the TCN is communicated, the new state of RSTP will be as in the figure below. Please note that we have not yet considered how the service is re-routed; we are still looking at the first part of the switching, which is RSTP re-convergence.

Figure-4: RSTP topology change

Now the topology change has occurred, but what still remains is the re-routing of the service. As I said earlier, RSTP does not have predefined main and protection paths, so the service re-routing happens purely on the basis of MAC learning. RSTP is a scheme that is used in Provider Bridge networks. To understand what a Provider Bridge network is, please refer to my earlier blog post at the permalink given below.


In that blog post you will find a clear description of how traffic moves in provider bridge networks. Since this is a provider bridge, for the affected service, which is the service with SVLAN = 200, the MAC learning has been done along the path Root Bridge - N-1 - N-2. Now the link between N-1 and N-2 has failed, and there has to be some sort of notification to the Root Bridge to send the traffic via the other path.

The self-healing way out of such a scenario is that the traffic stops and we wait for the aging time of the MAC table to expire. The aging time of the MAC table is a user-configurable parameter; however, the minimum value is 10 seconds. So technically, if such a failure has occurred, the service re-routing would take place only after 10 seconds (the aging time).
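
To see why waiting for aging is so slow, here is a minimal sketch of a MAC table with aging, written in Python. The class `Fdb`, its method names, the MAC label and the port label are hypothetical and only for illustration; the 10-second aging time is the minimum value mentioned above. After the failure, no frames from N-2 arrive via N-1 any more, so the entry is never refreshed and simply sits there until it ages out.

```python
AGING_TIME_S = 10.0  # minimum aging time quoted in the text


class Fdb:
    """Toy MAC table: (svlan, mac) -> (egress port, time last seen)."""

    def __init__(self, aging_time_s=AGING_TIME_S):
        self.aging_time_s = aging_time_s
        self.entries = {}

    def learn(self, svlan, mac, port, now_s):
        # Learning or refreshing an entry restarts its aging clock.
        self.entries[(svlan, mac)] = (port, now_s)

    def lookup(self, svlan, mac, now_s):
        entry = self.entries.get((svlan, mac))
        if entry is None:
            return None  # unknown destination: the bridge would flood
        port, last_seen_s = entry
        if now_s - last_seen_s >= self.aging_time_s:
            del self.entries[(svlan, mac)]  # aged out
            return None
        return port


root_bridge = Fdb()
root_bridge.learn(200, "mac-behind-N-2", "to-N-1", now_s=0.0)

# The N-1/N-2 link fails just after t=0, so the entry is not refreshed.
stale = root_bridge.lookup(200, "mac-behind-N-2", now_s=5.0)
aged = root_bridge.lookup(200, "mac-behind-N-2", now_s=10.0)
print(stale, aged)
```

At t=5 the Root Bridge still sends SVLAN 200 traffic towards N-1, where it is lost; only once the entry has aged out does the lookup fail and relearning become possible.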

Phew!!!!!! This is long. So the developers of RSTP thought of another approach, which was to flush the MAC table of every bridge in the RSTP domain. Therefore, the TCN also carries a command to flush the MAC tables of all the bridges in that particular RSTP domain.

Something like the figure below. 

Figure-5: MAC flush happening on all the nodes involved in RSTP

Here we see that all the points of the RSTP domain are flushed. 
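
As a rough sketch, the flush can be pictured like this in Python. The table contents, MAC labels and port labels are made-up examples for the ring in the figures, and `flush_domain` is a hypothetical helper, not a real switch command.

```python
def flush_domain(fdbs):
    """Empty the MAC table of every bridge; return how many entries were dropped."""
    dropped = sum(len(table) for table in fdbs.values())
    for table in fdbs.values():
        table.clear()
    return dropped


# Hypothetical tables before the flush: (SVLAN, MAC) -> egress port.
fdbs = {
    "RB": {(200, "mac-behind-N-2"): "to-N-1", (300, "mac-behind-N-3"): "to-N-4"},
    "N-1": {(200, "mac-behind-N-2"): "to-N-2"},
    "N-2": {},
    "N-3": {},
    "N-4": {(300, "mac-behind-N-3"): "to-N-3"},
}

dropped = flush_domain(fdbs)
print(dropped, fdbs["RB"])
```

Note that the entries for SVLAN 300 are dropped as well, even though that service was not affected by the failure. The flush applies to the whole domain, not to a single service.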

Now anybody can guess what happens after the FDB is flushed: the MAC addresses for the services are relearned. The service with SVLAN 300 will relearn along the same path as before, but the service with SVLAN 200 will not. N-2, which is the destination point, will now learn the MAC via N-3 and not N-1; this makes N-3 the designated bridge for N-2, and the service is re-routed.
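
The relearning can be checked with a small Python sketch of the ring. It assumes the ring order RB, N-1, N-2, N-3, N-4 and the blocked N-2/N-3 link described in this post; the function names `active_links` and `learned_path` are hypothetical. Since the forwarding topology is loop-free, flooding and learning can only settle on the single path that exists, so a plain breadth-first search over the forwarding links gives the learned path.

```python
from collections import deque

RING = [("RB", "N-1"), ("N-1", "N-2"), ("N-2", "N-3"), ("N-3", "N-4"), ("N-4", "RB")]
BLOCKED_WHEN_HEALTHY = ("N-2", "N-3")


def active_links(failed=None):
    """Forwarding links: the N-2/N-3 link is blocked while the ring is
    healthy and is opened when another link fails."""
    if failed is None:
        return [link for link in RING if link != BLOCKED_WHEN_HEALTHY]
    return [link for link in RING if link != failed]


def learned_path(links, src, dst):
    """Breadth-first search over the forwarding links from src to dst."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


healthy = active_links()
after_failure = active_links(failed=("N-1", "N-2"))

svlan200_before = learned_path(healthy, "RB", "N-2")
svlan200_after = learned_path(after_failure, "RB", "N-2")
svlan300_before = learned_path(healthy, "RB", "N-3")
svlan300_after = learned_path(after_failure, "RB", "N-3")
print(svlan200_before, svlan200_after)
print(svlan300_before, svlan300_after)
```

Calling `active_links()` again with no failed link gives back the original forwarding topology, which is what happens when the link is repaired.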

Figure-6: Final re-routing of the service

So here we see a complete step-by-step process of re-routing of the services. Tough but not so tough to understand. 

Please note that the bandwidth distribution in the ring is now no longer optimized, and there can be congestion on the links from RB to N-4 and from N-4 to N-3. In such a scenario QoS comes into play, and the RSTP domain has to be properly traffic engineered.
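
A quick way to see the congestion risk is to add up the service bandwidth per link before and after the re-route. In the Python sketch below, the paths are the ones described in this post, but the bandwidth figures (60 and 70 Mbps) and the 100 Mbps link capacity are invented numbers purely for illustration, and `link_load` is a hypothetical helper.

```python
def link_load(paths, demand_mbps):
    """Sum the demand of every service over each link its path crosses."""
    load = {}
    for svlan, path in paths.items():
        for a, b in zip(path, path[1:]):
            link = tuple(sorted((a, b)))  # direction does not matter here
            load[link] = load.get(link, 0) + demand_mbps[svlan]
    return load


# Assumed numbers, not taken from any real network.
demand_mbps = {200: 60, 300: 70}
link_capacity_mbps = 100

paths_before = {200: ["RB", "N-1", "N-2"], 300: ["RB", "N-4", "N-3"]}
paths_after = {200: ["RB", "N-4", "N-3", "N-2"], 300: ["RB", "N-4", "N-3"]}

load_before = link_load(paths_before, demand_mbps)
load_after = link_load(paths_after, demand_mbps)

congested = sorted(link for link, mbps in load_after.items() if mbps > link_capacity_mbps)
print(load_before)
print(load_after)
print(congested)
```

With these assumed numbers no link is overloaded before the failure, while after the re-route both services share RB to N-4 and N-4 to N-3, and those two links exceed the assumed capacity.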

What happens when the link restores?

We have seen what happens on failure. The restoration of the link is also treated like a separate failure in this case. RSTP recognizes only topology changes, and with the link repaired there is another topology change. A similar TCN will pass through the ring, there will be re-convergence, and the blocked port will be the same as before the failure. The TCN will flush the MAC tables of all the bridges, and this will lead to service re-routing again.

So friends, this was a pretty long blog post, but it cannot be helped. To understand the switching part there has to be more description, which I have tried to bring in. But we have just touched the tip of the iceberg. There are a lot more things happening beneath the surface of the water, and to dissect them threadbare would need another 50 blog posts. We will also see the operational aspects of RSTP in multiple topology scenarios and in dual homing cases.

Till then 

See you.

Regards, 

Kalyan 

Keep thinking!!!! Keep Reading!!!! Keep Evolving!!!!


