• May 22, 2012, 09:23:15 PM

Author Topic: Windows NLB over 5520's - config assistance needed, Avaya India don't understand  (Read 737 times)


Offline Phil Crawford

  • Rookie
  • **
  • Posts: 9
All,
I posted a while back about an issue where we experience network outages at our DR site: all network devices drop off monitoring and all connections in and out are subsequently disrupted. Broadcast symptoms.

So after packet sniffing the VLAN generating all this traffic, a pattern emerged. Traffic meant for a virtual NLB address (Unicast) was being broadcast. The outages coincided with elevated levels of traffic being sent to the NLB environment.
The reason for the broadcast is that the lookup in the ARP table does not find the virtual MAC, so the switch floods the traffic to all hosts on the switch, thus wrecking the entire subnet and most switches on the LAN.
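
The flooding described above can be illustrated with a toy learning switch. This is only a sketch of generic layer 2 behaviour, not of the ERS 5520 firmware: the class name, port numbers and MAC addresses are all made up for illustration, and it assumes the cluster members never send frames with the virtual MAC as source, so the switch never gets a chance to learn it.

```python
class LearningSwitch:
    """Toy layer 2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # forwarding table: MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.fdb[src_mac] = in_port  # learning happens on the SOURCE address only
        if dst_mac in self.fdb:
            return [self.fdb[dst_mac]]  # known destination: one port
        return [p for p in self.ports if p != in_port]  # unknown: flood


# Hypothetical addresses, chosen only for this example.
CLUSTER_MAC = "02-bf-0a-00-00-64"  # virtual MAC shared by the cluster
MEMBER_MAC = "02-01-0a-00-00-64"   # source MAC a member actually sends with
CLIENT_MAC = "00-11-22-33-44-55"

switch = LearningSwitch(ports=range(1, 9))

# A cluster member on port 1 talks to the client. The switch learns
# MEMBER_MAC on port 1, but never sees CLUSTER_MAC as a source.
switch.receive(1, MEMBER_MAC, CLIENT_MAC)

# The client on port 5 replies to the virtual MAC: no table entry, so flood.
out = switch.receive(5, CLIENT_MAC, CLUSTER_MAC)
print(out)  # [1, 2, 3, 4, 6, 7, 8]
```

Every frame addressed to the virtual MAC leaves on every port except the one it arrived on, which matches the pattern seen in the captures.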

We dissolved the NLB cluster so all 3 members ran separately, and this has resolved the outages.
Now the problem is we need this NLB cluster in place.

I have been speaking to a team at Avaya in India; they don't seem to understand what is happening despite me holding their hand and leading them up the path to the cause. Dissolving the NLB cluster now proves what I was saying all along.

Now I need to know how to configure this correctly. I've seen this before on the Extreme Summit X series, where there is a workaround: looping a cable back in to a non-routable private subnet housing the cluster members. This is quite primitive, and I think Extreme have since resolved it with a SW release.

I have seen an Avaya white paper on this issue but not specifically for 5520's running at layer 2.
We have 2 x Cisco 3560's running HSRP as the Layer 3 engine, below that a stack of 2 x 5520's distributing to 6 x single 5520's housing all hosts (servers). The cluster in question has 3 members running Unicast mode and all reside on the same 5520 switch.

I have been digging around but cannot find anything specific to our issue.
We need to concentrate on how the switch handles traffic to a virtual address which is not present in its ARP table, hence the flooding out of all interfaces.
I have been thinking of a couple of ideas around this but don't want to lead anyone up those paths as yet.
I would appreciate hearing from any of you who have seen this before or know how to get round this.
Cheers
Phil


Offline Dominik

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 662
Hi Phil,

First, I would recommend this document:

http://support.avaya.com/css/P8/documents/100123894

Here you can find a very good technical guide on how MS NLB and Avaya switches work together.

If you still have the problem with the Avaya-recommended NLB settings, you probably have a SW bug.
Which SW release do you run on your ERS5520?

Cheers
« Last Edit: January 03, 2012, 11:35:55 AM by Dominik »
It's always the network...

Offline Phil Crawford

  • Rookie
  • **
  • Posts: 9
Thank you for your reply Dominik,
I do have this document already. Avaya have come back to me today saying this is a known bug which is resolved in 6.3.0.0.
This SW is not due for release until at least March and could be much later, which doesn't really help in the interim.
Does anyone know a temporary workaround?

Cheers

Offline Flintstone

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 584
Hi Phil,

I would have thought you would be able to statically add the virtual MAC address to the FDB as a work around?
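
A minimal sketch of what a static FDB entry would change, using a toy lookup function rather than any real switch CLI. The function name, port numbers and MAC address are hypothetical, and the model assumes a static entry maps the MAC to a single port.

```python
def egress_ports(fdb, ports, in_port, dst_mac):
    """Known destination MAC goes to its port, unknown MAC is flooded."""
    if dst_mac in fdb:
        return [fdb[dst_mac]]
    return [p for p in ports if p != in_port]


ports = range(1, 9)
CLUSTER_MAC = "02-bf-0a-00-00-64"  # hypothetical virtual MAC
fdb = {}

# Without an entry, a frame arriving on port 5 is flooded.
before = egress_ports(fdb, ports, 5, CLUSTER_MAC)

# Static entry (hypothetical): pin the virtual MAC to port 1.
fdb[CLUSTER_MAC] = 1
after = egress_ports(fdb, ports, 5, CLUSTER_MAC)

print(before)  # [1, 2, 3, 4, 6, 7, 8]
print(after)   # [1]
```

In this model the flooding stops, but only the device on port 1 receives the frames, so the other cluster members would see nothing unless something behind that port fans the traffic out to them. Whether the 5520 accepts one static MAC on several ports is not established in this thread.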

CheerZ

Offline sarahtanembaum

  • Rookie
  • **
  • Posts: 9
Flintstone, do we have to add the virtual MAC on the router instead of the 5520, since the 5520 is running in L2 mode?

Offline Flintstone

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 584
Hi Phil,

You can also statically add the virtual MAC/IP address to the router, if that prevents the broadcast symptoms you are experiencing.

CheerZ

Offline Dominik

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 662
Hi Phil,

What do you want to achieve with NLB?

If you want to accomplish link aggregation over 2 NICs on the server, you can try, as a workaround,
link aggregation with the NIC drivers instead of MS NLB.

This is a possible workaround depending on what you want to do with NLB.

Cheers

Offline Phil Crawford

  • Rookie
  • **
  • Posts: 9
First, I would need to enable routing on the 5520 to add a static ARP/MAC entry, and I would also need to assign an IP interface to the specific VLAN. This may or may not work.

We have the NLB environment to load balance across 3 physical servers.
They are currently dissolved and all working independently, which is OK at the moment.

Avaya assure me that the SW upgrade will resolve this incompatibility issue. I find it hard to believe this has not come up before with a sound workaround.

I've had another email today saying the release will be available by the end of February. If my client needs to push before then I may look at the static entries, but I'm not 100% sure this will work. Ideally I would need to set this up in a lab first. Avaya said they were doing this but never came back with anything apart from a SW upgrade.

Cheers

Offline Michael McNamara

  • Administrator
  • Hero Member
  • *****
  • Posts: 2517
    • Michael McNamara
Hi Phil,

I'm just curious about your problem... when you mention ARP table are you referring to the MAC/FDB table? As you mention above the ERS 5520s (Layer 2 only) won't have an ARP table only a MAC/FDB table since they are a Layer 2 device. It's the Cisco 3560 (Layer 3 device) routers that will have the ARP table.

Are you saying that the ERS 5520s are not learning the Multicast MAC address from the source port?

As mentioned you could create the MAC/FDB table entry... any ARP table entries though would need to come from the Cisco 3560.

I've played with NLB quite a few years ago and we almost immediately abandoned it for a real Layer 7 switch (Cisco ACE, Radware Alteon). What traffic are you looking to load balance? Web servers?

Good Luck!
We've been helping network engineers, system administrators and technology professionals since June 2009.
If you've found this site useful or helpful, please help me spread the word. Link to us in your blog or homepage - Thanks!

Offline Phil Crawford

  • Rookie
  • **
  • Posts: 9
Hi Michael,
In my first post I referred to ARP when I meant FDB entry. I think the confusion comes from the fact that when adding an ARP entry you add the MAC too.
I'm aware I would need to enable routing on the 5520s to add ARP entries etc.

The switch does not learn the virtual MAC, which is where I think the problem is with the software, and Avaya assure me this will be resolved in 6.3.
I can see that all traffic passing between servers is being flooded. I ran a sniffer on a machine as a host on the server subnet and captured for a period of around a week. During the outages we were getting around 400 - 500MB per minute (flooded). I was capping the pcap files to 100MB each; my 1TB drive was filling up nicely.
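
To put the capture figures above in perspective, here is a quick back-of-the-envelope conversion. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^6 MB) and treats the per-minute figures as sustained averages on the sniffer's port, which the thread does not confirm.

```python
def mb_per_min_to_mbit_per_s(mb_per_min):
    """Convert megabytes per minute to megabits per second."""
    return mb_per_min * 8 / 60


low_rate = mb_per_min_to_mbit_per_s(400)   # lower bound quoted in the post
high_rate = mb_per_min_to_mbit_per_s(500)  # upper bound quoted in the post

# 100 MB pcap files written per minute at each rate.
files_low = 400 / 100
files_high = 500 / 100

# Hours to fill a 1 TB drive if the peak rate were sustained.
hours_to_fill = (1_000_000 / 500) / 60

print(round(low_rate, 1), round(high_rate, 1))  # 53.3 66.7
print(files_low, files_high)                    # 4.0 5.0
print(round(hours_to_fill, 1))                  # 33.3
```

So the flooded traffic works out to roughly 53 to 67 Mbit/s arriving on a host port that should not have been receiving it at all, and every other port in the VLAN would see the same load.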

The traffic being accessed on these servers is destined for multinational clients of an international economic magazine; this data center is their primary source of information. The traffic varies for database access. Clients access a terminal server (1 of many), which in turn accesses the NLB cluster on its virtual address. The number of clients fluctuates depending on the timezone and who in the world is awake and working.

I think we need to sit tight for the 6.3 release, test, then make a decision on moving forward. If we still have instability I will suggest we move away from NLB completely.

Thanks again.
Phil

Offline Michael McNamara

  • Administrator
  • Hero Member
  • *****
  • Posts: 2517
    • Michael McNamara
Quote from: Phil Crawford
I can see that all traffic passing between servers is being flooded. I ran a sniffer on a machine as a host on the server subnet and captured for a period of around a week. During the outages we were getting around 400 - 500MB per minute (flooded). I was capping the pcap files to 100MB each; my 1TB drive was filling up nicely.

It's been a while so forgive me if I'm wrong, but isn't that how NLB works? It essentially floods the traffic to all the servers, and all the servers except one ignore the traffic.
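
The "everyone receives, all but one ignore" idea can be sketched as below. This is a deliberately simplified illustration, not Microsoft's actual filtering algorithm: the host names, the use of a CRC32 hash and the client addresses are all assumptions made for the example.

```python
import zlib

HOSTS = ["node1", "node2", "node3"]  # hypothetical cluster members


def owner(client_ip, hosts=HOSTS):
    """Pick the one host responsible for a client, from a hash of its IP."""
    return hosts[zlib.crc32(client_ip.encode()) % len(hosts)]


def accepts(host, client_ip):
    """Each host runs the same rule locally and drops what it does not own."""
    return owner(client_ip) == host


# Every host sees every frame, but exactly one accepts each client.
clients = ["10.0.0.%d" % i for i in range(1, 21)]
accept_counts = [sum(accepts(h, c) for h in HOSTS) for c in clients]
print(set(accept_counts))  # {1}
```

The point of the sketch is that the cluster only needs the frames to reach the cluster members' ports. Delivery to every other port in the subnet is a side effect of how the switch handles the destination MAC.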

I do recall that there was a Multicast option that helped to cut down on this traffic.

What mode are you running NLB in?

Good Luck!

Offline Phil Crawford

  • Rookie
  • **
  • Posts: 9
Correct for the servers, but our problem is that traffic destined for the virtual MAC hit the FDB, where there was no entry, as the switch has issues associating 1 MAC to multiple destinations. The switch in turn floods the packets out of all interfaces, flooding the whole subnet (which spans multiple switches) and subsequently causing our outages given the large amount of traffic.
We are running Unicast mode. One thought was to go to Multicast mode and try IGMP snooping. Avaya steered away from this.
With Extreme (Summit X series) a couple of years ago I had this issue. We created a static FDB entry pointing to 1 interface; this interface was then looped physically back in to the same switch to a private VLAN (no IP interface) containing all cluster members. It worked very well, and there were no problems creating a LAG for increased bandwidth back in to this VLAN. My client wishes not to go this route, hence speaking to Avaya and the impending SW release 6.3.

There has always been an issue with NLB across multiple vendors given the MAC forwarding issue: no problems on older hubs, but in a switched environment with forwarding decisions it creates problems. I think Extreme have resolved this now in a later XOS release, and Cisco have a solution too.
Perhaps this has not come up in our specific scenario with Avaya/Nortel before.

The whitepaper for NLB and Avaya is not specific to our setup.
Thanks for all input and comments; it's reassuring bouncing ideas and issues around on here.

Without it getting too messy, I think we should wait and see what 6.3 has to offer.

Cheers

Offline bwilliams2

  • Full Member
  • ***
  • Posts: 50
Phil,

Have you received an update from Avaya?