Topic: SMLT recovery time?

Offline ijdod

  • Rookie
  • **
  • Posts: 20
SMLT recovery time?
« on: July 10, 2015, 05:19:07 AM »
When testing some scenarios in a lab setting, I noticed SMLT 'failback' recovery times were a bit longer than I expected.

Setup is fairly simple: two clients, each connected to their own 5520 switch. Each 5520 has an MLT to a VSP7000 SMLT cluster (running v10.3.3). Tests are done by disconnecting the links between one of the 5520s and the VSP7000. Downtime is measured with iperf pushing ~1000 UDP packets per second.
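
As a rough sketch of how a downtime figure falls out of this kind of test: at a steady ~1000 UDP packets per second, each lost datagram corresponds to about 1 ms of outage. The function name and the sample loss counts below are made up for illustration, and the estimate assumes the loss happened in one contiguous burst at a constant send rate.

```python
def outage_ms(lost_packets: int, packets_per_second: float = 1000.0) -> float:
    """Estimate outage duration in ms from the number of lost UDP datagrams.

    Assumes a constant send rate and a single contiguous loss burst.
    """
    if packets_per_second <= 0:
        raise ValueError("packets_per_second must be positive")
    if lost_packets < 0:
        raise ValueError("lost_packets must not be negative")
    return lost_packets * 1000.0 / packets_per_second


if __name__ == "__main__":
    # Hypothetical loss counts, as read from an iperf server report.
    for lost in (2, 100, 250, 1000):
        print(f"{lost:>5} datagrams lost -> ~{outage_ms(lost):.0f} ms outage")
```

The resolution of this method is one packet interval, so at 1000 packets per second nothing shorter than about 1 ms can be distinguished.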

The initial disconnect failover is very fast, between 2 and 100 msec, depending on the link. However, when I reconnect the link, traffic is interrupted for about 250 msec (which is okay) to 1 second, again depending on the link. With the whole sub-second failover sales-pitch in mind, and also the logic behind this, the whole second surprised me. I would have expected the initial failover to take longer than the recovery.

VLACP and L2 vs L3 do not seem to have any effect on the test results. Recovery times seemed to improve a bit (to ~800 msec) when both clients were on the same switch, in different VLANs (L3).

Are these normal numbers for this technology?

Edit: Some additional info: when I disconnect the (test-traffic carrying) link on a 5500 with v6.3.3, the recovery time is about 1 second. If I disconnect the link on the other 5500 (which is running v6.1.2), the recovery time is 500 ms. That's going from 'we'll be fine' to 'users will notice' when you update O.o.

The exact same behaviour was seen with a 5650 instead of a 5500. 10 GbE vs 1 GbE didn't matter (tested with a 5632).



« Last Edit: July 10, 2015, 08:10:42 AM by ijdod »


Offline Dominik

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1564
    • Networkautobahn
Re: SMLT recovery time?
« Reply #1 on: July 10, 2015, 09:18:49 AM »
Hi,

What you see is the normal behaviour in my experience. Most of the time the loss of a device is an unexpected event, while bringing it back online is a more controlled situation. So always be aware that adding a missing member back to an IST switch cluster brings a small outage, with a failover time of up to nearly 1 second.

Cheers
It's always the networks fault!
networkautobahn.com

Offline ijdod

  • Rookie
  • **
  • Posts: 20
Re: SMLT recovery time?
« Reply #2 on: July 10, 2015, 09:58:15 AM »
Thanks. What mainly triggered my surprise is that these are roughly similar values to RSTP. I actually just tested that, too. RSTP is a bit quicker on the first failure, but slower in the recovery. I'm guessing that in my mind's eye I chucked all STP versions together, and interpreted Avaya's claims as 'faster than STP' to include RSTP as well. Teaches me to assume things :D.

Does this apply to the 8600/8800 platform as well?


Offline Dominik

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1564
    • Networkautobahn
Re: SMLT recovery time?
« Reply #3 on: July 13, 2015, 04:16:28 AM »
With RSTP vs SMLT you also have to consider that in an SMLT/RSMLT setup ~50% of the connected hosts will not see the outage of one of the switch cluster members. You have active/active load balancing, so if one of the IST cluster members fails, only the hosts that are connected to that box will see the outage. That is a big advantage over RSTP in my opinion.

If you are looking for the fastest failover times, I would recommend taking a look at SPB.

Cheers

Offline ijdod

  • Rookie
  • **
  • Posts: 20
Re: SMLT recovery time?
« Reply #4 on: July 13, 2015, 07:47:44 AM »
I can see advantages and disadvantages for both SMLT and RSTP. If I were to implement an STP variant, it would be MST, which negates part of the advantage SMLT has. Having said that, SMLT is the default for us, so that's what we'll use unless there's a reason not to. In this case, we only looked at RSTP because it appeared to have similar failover and recovery times (which we didn't expect). In itself the recovery is fast enough for most of our purposes, but some of our layer 8 is very keen on sub-second failover. SPB seems very nice. It's on our roadmap, but we're playing the wait-and-see game for now. We've been burned by Avaya software QA a few times too many. :P

Does anyone know if the SMLT failover/recovery times for the 8800 are similar in this fairly straightforward scenario? Or does that platform manage a faster recovery?


Offline Dominik

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1564
    • Networkautobahn
Re: SMLT recovery time?
« Reply #5 on: July 13, 2015, 09:40:15 AM »
In my experience the ERS8k shows the same SMLT failover times as what you have seen on the VSP7K and ERS5K.

The worst case I have seen in an SMLT design is 1 second. For RSTP the worst case is 3 times 2 seconds.
Of course, it all depends on your needs.
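
To make the arithmetic concrete, here is a small sketch. It assumes the "3 times 2 seconds" figure refers to RSTP aging out a neighbour after three missed hello BPDUs at the default 2-second hello time; that reading, the function name, and the constant are assumptions for illustration only.

```python
def rstp_worst_case_s(hello_time_s: float = 2.0, missed_hellos: int = 3) -> float:
    """Worst-case RSTP detection time, assuming a neighbour is declared
    lost after `missed_hellos` consecutive hello BPDUs go missing."""
    return hello_time_s * missed_hellos


# Worst-case SMLT recovery quoted in this thread (an observation, not a spec value).
SMLT_WORST_CASE_S = 1.0

if __name__ == "__main__":
    rstp = rstp_worst_case_s()
    print(f"RSTP worst case: {rstp:.0f} s")
    print(f"SMLT worst case: {SMLT_WORST_CASE_S:.0f} s")
    print(f"Ratio: {rstp / SMLT_WORST_CASE_S:.0f}x")
```

Lowering the hello time shrinks the RSTP figure proportionally, which is why the comparison depends on how the spanning tree is tuned.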

I think it is fair to say that SPB is now at the point where it is available from small to big boxes and has been deployed from small to large networks and has proven to be reliable and stable.

I agree with you that after being burned by software bugs it is hard to trust a new technology.
To be fair, I have seen software bugs from every vendor I have worked with.
Good for all network admins: that saves our jobs...


