
Author Topic: Looking for info to configure LAG / link aggregation to VMWare stack <-> ERS4500  (Read 4110 times)


Offline dblagbro

  • Rookie
  • Posts: 4
    • Devin Blagbrough
So I have a good deal of experience with non-Avaya(Blue)/Nortel hardware, but despite working for a major Avaya Business Partner I have not worked with the Avaya(Blue), formerly Nortel, hardware.   Every few years I update my lab: I pick up refurb hardware online and build a new lab for testing customers' scenarios as well as for self-training.   I'm about 4 months into building the next iteration of this lab, and I keep getting tripped up on something that I'm now figuring involves the way the core switch is set up.

I found an ERS-4524GT (24 x 1Gb ports; the last 4 are combo RJ45/SFP) at a reasonable price. It was supposedly never pulled from its box and still had its original seals, so I'm hoping this isn't a defective-hardware issue. I'm not using any SFPs, and I'm running the latest firmware (5.7.1.021).   I am not using the default VLAN; my data VLAN ID matches the 3rd octet of its subnet, and I set up VLAN 99 as my VoIP VLAN, which currently has only one host (an SBC interface) but will become my Avaya VoIP subnet some day.

I have 2 HP DL360s running VMware ESXi 5.5 with vCenter; each server has 4 built-in NICs and 2 add-in NICs (all 1Gb ports).   I've reserved the 4th NIC on each server for iSCSI, and those NICs, along with the NAS/SAN iSCSI target, are currently on a different 1Gb switch.

That other 1Gb switch is connected via a dot1Q trunk, because connecting it through a regular access port doesn't let any of the other ports on it communicate...  I tried all the options ("untagAll", "untagPVIDonly", etc.), so I'm running the uplink as "tagAll" with both switches agreeing on VLANs (manually configured to match).

I also have an Avaya G350 with the 24-port PoE LAN module and a 1Gb uplink. With a dot1Q trunk between the ERS and the G350, all ports on the G350 can communicate with ports on the other 1Gb switch, but not when the G350 is connected to the ERS as an access port.

So, trunking works fine...   however, for the last month or more, the results I've had are:

When I connect a PC or server directly to the ERS with "untagPVIDonly" (as I would expect to use for a PC and/or an Avaya IP phone with a PC behind it), I get intermittent pings... nothing consistent.  It can also take up to a couple of minutes for DHCP to work.

The only devices I seem to be able to connect to the ERS reliably are those that support dot1Q trunking...  anything else works intermittently.    My VMware servers have been working fine for the VM network (the guests on it plus the management IP), but the iSCSI interfaces, with a single IP and no dot1Q tagging, were not working consistently (which is why, as mentioned above, they and the SAN/NAS are on the other 1Gb switch).

I had been working like this for the last couple of weeks; I only touch this VMware system when I have time, and there are no production devices on it.  I still have my old server and lab, and that one runs the phones in my house / at my home office desk, so I may not have noticed this until now, but I've now also noticed issues with the VM network: the new domain I'm building for the lab can't consistently communicate with the outside world or the rest of my home network.

When it does work, sometimes I can reach the VMware host but not the guests on it, yet those guests can ping the VMware host. The guests can also only intermittently ping the SAN/NAS, yet their iSCSI connection has been stable since it was moved to that 'other 1Gb switch' I keep mentioning.

That other 1Gb switch is a cheap 8-port Netgear that supports LACP, trunking, etc.   It trunks to the ERS fine, and the G350 trunks to the ERS fine... but now I've found that even the VMware VM network, which is set up for load balancing and dot1Q trunking to the ERS4500, is occasionally losing its connectivity too.

TL;DR:   I'm thinking this may have to do with the 3 NICs on each VMware host tied to ERS ports and how all of those ports are configured...  I have tried many combinations of settings but have seen no improvement.  I can't find any docs saying how this should be connected when using both multiple NICs and dot1Q trunks.   Does anyone have suggestions or know of any best practices for the settings on each device (ERS and VMware hosts)?

Thanks in advance!

-d


« Last Edit: February 26, 2015, 02:04:18 PM by dblagbro »


Offline CptnBlues63

  • Sr. Member
  • Posts: 100
Quote from: dblagbro

So, trunking works fine...   however, for the last month or more, the results I've had are:

When I connect a PC or server directly to the ERS with "untagPVIDonly" (as I would expect to use for a PC and/or an Avaya IP phone with a PC behind it), I get intermittent pings... nothing consistent.  It can also take up to a couple of minutes for DHCP to work.


I am by no means the most knowledgeable person on this site but I'll take a swing at this one.

When interconnecting Avaya switches, you should make both sides trunk ports.  For example's sake, we'll say VLAN 1 is your base (default) VLAN and you also have VLANs 2-5 on your switches for data, VoIP, etc.

Here's how the trunk ports on both switches should look:

PVID = 1
Allowed VLANs = 1-5
Tagging = tagAll(trunk)

Client ports should be as follows:
PVID = 2
Allowed VLANs = 2
Tagging = untagAll(access)

Two clients within the same subnet (ex: 192.168.2.0/24) plugged into access ports with the same VLAN (ex: 2) should be able to communicate, regardless of whether they're on the same switch or one is plugged into each of your two switches.  If they're not communicating, something isn't configured correctly.

The client ports should not be "untagPVIDonly".

As per the Avaya documentation:

untagPVIDonly = sets the port as a trunk port, but adds the 802.1Q header to frames of every VLAN except the PVID VLAN as they egress (exit) the port

untagAll = sets the port as an access port, stripping all 802.1Q headers from frames as they egress (exit) the port
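
If it helps to see those settings in command form, here's roughly how they translate to the ERS ACLI. This is a sketch only: the port numbers and VLAN IDs are examples, and the exact keywords can vary a little between ERS software releases, so verify with "vlan ports ?" on yours.

    ! Trunk/uplink port (example: port 24 carrying VLANs 1-5, PVID 1)
    vlan ports 24 tagging tagAll
    ! add the port to each tagged VLAN (repeat for VLANs 3-5)
    vlan members add 2 24
    vlan ports 24 pvid 1

    ! Client/access port (example: port 10 in VLAN 2 only)
    vlan ports 10 tagging untagAll
    vlan members add 2 10
    vlan ports 10 pvid 2

One gotcha: depending on your "vlan configcontrol" mode (strict is the factory default), the switch may make you remove a port from VLAN 1 before it will accept it into another VLAN.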


Quote from: dblagbro

TL;DR:   I'm thinking this may have to do with the 3 NICs on each VMware host tied to ERS ports and how all of those ports are configured...  I have tried many combinations of settings but have seen no improvement.  I can't find any docs saying how this should be connected when using both multiple NICs and dot1Q trunks.   Does anyone have suggestions or know of any best practices for the settings on each device (ERS and VMware hosts)?

I can't really comment on the VMware stuff (someone else administers those boxes) except to say that we use the internal VMware virtual switch on the dom 0s and trunk ports from the VMware servers to the switches, typically in an MLT/SMLT configuration for redundancy.

I think you need to start with the basics: get your switches' trunk ports and client access ports working first, then move on to the VMware issues.
« Last Edit: March 04, 2015, 04:21:31 PM by CptnBlues63 »

Offline dblagbro

  • Rookie
  • Posts: 4
    • Devin Blagbrough
Thank you for taking a stab at it, CptnBlues63!    Fish much?  ;-)

Anyway, I hadn't had time to reply, but two nights ago I stumbled across this link, which led me to settings I wasn't accustomed to using on Extreme/Cisco/HP, etc. switches:   www<dot>datarave<dot>net/zfh/2013/09/19/avaya-ethernet-routing-switch-untagged-and-unregistered-frame-filtering/#sthash.nTNHgS7C.Z0FVLQlT.dpbs


The "best practices" I found here and suggestion's about the unregistered / untagged frames have had an effect....  though I'm not saying solved it, there still seems to be an issue with LACP settings when I try to use more than 1 NIC on the VMWare servers for extra bandwidth / redundancy.   In other words, I was facing compound issues and seem to have fixed one of those issues - I don't know if it's 1 left or more than 1 at this point.

With the filter-unregistered-frames setting applied where appropriate, things are now working much better: over 40K pings with only 1-2 drops...  well within an acceptable range for me, for now, in a lab situation.

If anyone reading this is having issues with ARP / inconsistent pings on ERS switches, look into those unregistered/untagged frame settings.   They're in the middle of the VLAN tab, and not much is said about them in most of the online help I've found; in fact, the link above is the only reference I turned up in several weeks of searching.
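
For anyone working from the CLI rather than the VLAN tab, the equivalents of those two check-boxes appear to be along these lines (a sketch based on what I pieced together; confirm the exact keywords on your release with "vlan ports ?"):

    ! example for port 10
    ! "filter-untagged-frame"      = DiscardUntaggedFrames
    ! "filter-unregistered-frames" = FilterUnregisteredFrames
    vlan ports 10 filter-untagged-frame enable
    vlan ports 10 filter-unregistered-frames enable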


... also, if anyone knows the LACP settings I should try on both sides (the VMware load-balancing options and what they correspond to on the ERS ports), please let me know.   At this point I think the only issue remaining is getting a LAG set up and all 3 NICs working at once.

Thanks again!

Offline CptnBlues63

  • Sr. Member
  • ****
  • Posts: 100
I had a chat with our VMware admins and confirmed that they use the virtual switch within VMware to create "tagAll" trunk ports, which are then connected to our switches (VSP7000s).  They told me the NICs in the boxes themselves had to be capable of tagging as well.

The switches themselves are in a redundant cluster with an IST.  Each VMware box has no fewer than two (redundant) connections to the cluster, one to each side, so if a switch fails they're still communicating properly through the other side.

If we were doing this with a single switch, I would set up an MLT (Multi-Link Trunk) from my switch to the physical box.  (More likely it would be a stack, with the uplinks going to separate units within the stack for as much redundancy as possible.)  This provides redundancy as well as aggregate bandwidth.

Again, I would configure the switch ports as tagAll(trunk) ports and then create the MLT.  In your situation, with a single switch, you could still make an MLT, and that would be my preference over using LACP to create a LAG (link aggregation group).

Our present connections to the VMware boxes are set up as trunk ports on the Avaya switches, all configured as "tagAll(trunk)" with the following two options checked:
- DiscardUntaggedFrames
- FilterUnregisteredFrames
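
In case it's useful, here's roughly what that looks like from the ACLI on a 4500. Again, a sketch only: the port numbers, MLT ID, and VLAN ID are examples, and syntax can differ slightly between releases.

    ! Trunk the ESX-facing ports (example: ports 5-7)
    vlan ports 5-7 tagging tagAll
    ! add the ports to each VLAN the hosts carry (repeat per VLAN)
    vlan members add 2 5-7
    vlan ports 5-7 filter-untagged-frame enable
    vlan ports 5-7 filter-unregistered-frames enable

    ! Bundle the ports into a static MLT (no LACP involved)
    mlt 1 name "ESX1" member 5-7
    mlt 1 enable

On the VMware side, a static MLT pairs with the "Route based on IP hash" load-balancing policy.  From what our admins tell me, the standard vSwitch doesn't speak LACP at all (that needs a distributed switch), which is another reason I'd go with a static MLT here.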


I hope this helps.

Offline dblagbro

  • Rookie
  • Posts: 4
    • Devin Blagbrough
Thank you!   I will try this out this evening and report back once I have results (it sometimes takes a while before issues show themselves).

Thanks again!

Offline tbigby

  • Full Member
  • Posts: 60
One thing that catches us most times we do VMware MLTs is needing to go into the VMware NIC teaming settings and mark all of the network ports in the MLT as 'active'. By default, VMware has the first network link as 'active' and the rest as 'standby', which means that when the switch sends traffic or pings down a link that VMware sees as standby, that traffic is dropped.

From memory, we need to do this on both the management (vmkernel?) vSwitch and the data vSwitch. Your setup may differ from ours, but we have to do it in both places. Usually we forget one and then wonder why we're having trouble with more than one link active.

I don't remember off the top of my head the exact location of the 'active'/'standby' link selection, but let me know if you can't find it and I'll look it up.
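
In case it saves you the lookup: in the vSphere Client it's under the vSwitch (or port group) Properties > NIC Teaming tab, where you drag adapters between Active and Standby. From the shell on ESXi 5.x, something like the following should do it (a sketch; the vSwitch and vmnic names are examples from a typical setup, not necessarily yours):

    # make all three uplinks active and use IP-hash load balancing
    # (IP hash is the policy that matches a static MLT/EtherChannel
    # on the switch side)
    esxcli network vswitch standard policy failover set \
        --vswitch-name=vSwitch0 \
        --active-uplinks=vmnic0,vmnic1,vmnic2 \
        --load-balancing=iphash

    # verify the result
    esxcli network vswitch standard policy failover get \
        --vswitch-name=vSwitch0

Keep in mind that port groups can override the vSwitch-level policy, so check the vmkernel and VM network port groups too; "esxcli network vswitch standard portgroup policy failover set" takes the same options plus --portgroup-name.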
Tony Bigby

Offline dblagbro

  • Rookie
  • Posts: 4
    • Devin Blagbrough
Thank you too, tbigby...

My fiancée works from home and had a project cutting over last night, so I didn't get to try anything yet.   I look forward to more testing with these additional details, and I will report back for others in my situation.

-d