Author Topic: ERS 3500 5.1 memory leak  (Read 2593 times)

Johan Witters
« on: June 21, 2013, 07:29:10 PM »
Hi guys,

Has anybody noticed a memory leak in the 5.1.0 release for the ERS 3500 series switches?

At a customer site we installed 5 stacks and a few standalone switches, which the customer is not yet using operationally. Today we organised a training session and noticed that all stacks had failed after an uptime of 2 weeks. The standalone switches had all rebooted.

What we noticed:
- all stacks had the base LED blinking fast
- unit 1 was unreachable; accessing the unit through the CLI would not let you get past the login banner
- the other units were accessible through the CLI, but all reported themselves as standalone units
- after a reboot, the stack operated normally again

Upon further investigation we stumbled on the following entry in the log: NVR Memory on unit 1 is under 20 MBytes. Because all stacks were initially powered up at about the same time, this log entry appeared at about the same time on all stacks, after around 7 days of runtime. A week later there was an entry stating that unit 1 had left the stack.

Worried about this, we also checked the standalone units. On all units we found the following entries:
C    2013-06-07 02:43:14 GMT+01:00 15   NVR Memory on unit 1 is under 20 MBytes.
C    2013-06-14 00:07:43 GMT+01:00 16   NVR Memory on unit 1 is under 5 MBytes.
C    2013-06-14 11:12:14 GMT+01:00 17   NVR Memory on unit 1 is under 4 MBytes.
C    2013-06-14 22:08:11 GMT+01:00 18   NVR Memory on unit 1 is under 3 MBytes.
C    2013-06-15 20:11:44 GMT+01:00 19   NVR Memory on unit 1 is under 1 MBytes.

The next entry would be a software exception or "cold boot trap".
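To put a rough number on how fast the value drops, the threshold entries can be turned into a rate. This is only a sketch: the regular expression is an assumption based on the five lines quoted above, the function names are made up for illustration, and each "under N MBytes" entry is treated as the moment the value crossed N, which is an approximation.

```python
import re
from datetime import datetime

# The five log entries quoted above, copied verbatim.
LOG = """\
C    2013-06-07 02:43:14 GMT+01:00 15   NVR Memory on unit 1 is under 20 MBytes.
C    2013-06-14 00:07:43 GMT+01:00 16   NVR Memory on unit 1 is under 5 MBytes.
C    2013-06-14 11:12:14 GMT+01:00 17   NVR Memory on unit 1 is under 4 MBytes.
C    2013-06-14 22:08:11 GMT+01:00 18   NVR Memory on unit 1 is under 3 MBytes.
C    2013-06-15 20:11:44 GMT+01:00 19   NVR Memory on unit 1 is under 1 MBytes.
"""

# Assumed line layout: timestamp, timezone, sequence number, message.
PATTERN = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+\s+\d+\s+"
    r"NVR Memory on unit (\d+) is under (\d+) MBytes"
)


def parse_thresholds(text):
    """Return (timestamp, unit, threshold_mb) for each matching log line."""
    events = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((stamp, int(match.group(2)), int(match.group(3))))
    return events


def drop_rates(events):
    """MB per day lost between each pair of consecutive threshold entries."""
    rates = []
    for (t0, _, mb0), (t1, _, mb1) in zip(events, events[1:]):
        days = (t1 - t0).total_seconds() / 86400
        rates.append((mb0 - mb1) / days)
    return rates


if __name__ == "__main__":
    for rate in drop_rates(parse_thresholds(LOG)):
        print(f"{rate:.2f} MB/day")
```

On the entries above, every interval comes out at roughly 2.2 MB per day, which looks more like a steady leak than a sudden allocation.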

All standalone switches rebooted automatically and came back online, but running show memory shows me something like:
------------------------------------------
          Memory Utilization
------------------------------------------
Unit  Total      Used          Free
------------------------------------------
1  128Mbytes     100Mbytes     28 Mbytes

Free memory is between 22 and 28 Mbytes on all switches and on the base units of the stacks, depending on the uptime of the unit. Normally free RAM should be around 40 Mbytes.
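For anyone who wants to check a larger number of switches, here is a minimal sketch that picks the unit rows out of captured show memory output and flags units with little free memory. It assumes the output looks like the sample above. The function names and the 30 Mbytes cut-off are hypothetical, chosen only because the affected units showed 22 to 28 Mbytes free while healthy ones sit nearer 40.

```python
import re

# Sample output as pasted above; real output may differ in spacing.
SAMPLE = """\
------------------------------------------
          Memory Utilization
------------------------------------------
Unit  Total      Used          Free
------------------------------------------
1  128Mbytes     100Mbytes     28 Mbytes
"""

# Assumed row layout: unit number, then total, used and free in Mbytes.
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s*Mbytes\s+(\d+)\s*Mbytes\s+(\d+)\s*Mbytes",
    re.IGNORECASE,
)


def parse_show_memory(text):
    """Return one dict per unit row found in the captured output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            unit, total, used, free = map(int, match.groups())
            rows.append({"unit": unit, "total": total, "used": used, "free": free})
    return rows


def low_memory_units(rows, min_free_mb=30):
    """Unit numbers whose free memory is below min_free_mb (hypothetical cut-off)."""
    return [row["unit"] for row in rows if row["free"] < min_free_mb]


if __name__ == "__main__":
    print(low_memory_units(parse_show_memory(SAMPLE)))
```

Collecting the output itself (SSH, telnet or console capture) is left out on purpose, since that depends on how the switches are managed.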

I compared this with a different installation I did about 2 weeks ago: those switches have a current uptime of about 6 days and all show around 36 Mbytes free on single units and on the base units of the stacks.

The issue seems to affect at least all GT-PWR+ models running 5.1.0 with the latest diag and boot. The config is not too complicated: 5 VLANs, RSTP with admin-edge, VLACP on the uplinks, and a 2-port MLT to the core 5520 stack.

I disabled VLACP on one stack and MLT 1 on a different one, as I found a stack trace for the VLACP task on 2 of the crashed switches. Let's hope this makes a difference...

I'm currently waiting for our distributor to get the support contracts activated. As soon as they are, I'll open a ticket.

Kind regards,

Johan Witters

Network Engineer
BKM NV