We were having issues with our backup jobs failing on CIFS share backups using Symantec NetBackup. The jobs died with a “status 24”, which means the job was losing communication with the source. Our backup administrator provided me with the exact dates and times of the failures, and I noticed that immediately preceding each failure this error appeared in the server log on the control station:
2012-08-05 07:09:37: KERNEL: 4: 10: Dynamic allocation pool limit has been reached. Limit=0x30000 Current=0x50920 Max=0x0
A quick Google search came up with this description of the error: “The maximum amount of memory (number of 8K pages) allowed for dynamic memory allocation has almost been reached. This indicates that a possible memory leak is in progress and the Data Mover may soon panic. If Max=0(zero) then the system forced panic option is disabled. If Max is not zero then the system will force a panic if dynamic memory allocation reaches this level.”
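To line these messages up against the backup failure times, it can help to pull the fields out of each log line programmatically. The following is a minimal Python sketch, not an EMC tool: the regular expression is modeled only on the single log line quoted above, and the function name `parse_pool_limit_line` and the dictionary keys are made up for illustration. Other code levels may format the message differently.

```python
import re

# Pattern modeled on the one log line quoted in this post (assumption:
# other Data Mover code levels may word or order the fields differently).
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): KERNEL: .*"
    r"Dynamic allocation pool limit has been reached\. "
    r"Limit=(?P<limit>0x[0-9a-fA-F]+) "
    r"Current=(?P<current>0x[0-9a-fA-F]+) "
    r"Max=(?P<max>0x[0-9a-fA-F]+)"
)


def parse_pool_limit_line(line):
    """Return the fields of a pool-limit log line as a dict, or None if no match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    max_pages = int(match.group("max"), 16)
    return {
        "timestamp": match.group("timestamp"),
        "limit_pages": int(match.group("limit"), 16),
        "current_pages": int(match.group("current"), 16),
        "max_pages": max_pages,
        # Per the quoted description, Max=0 means the forced panic option is disabled.
        "forced_panic_enabled": max_pages != 0,
    }


sample = (
    "2012-08-05 07:09:37: KERNEL: 4: 10: Dynamic allocation pool limit "
    "has been reached. Limit=0x30000 Current=0x50920 Max=0x0"
)
print(parse_pool_limit_line(sample))
```

The values are kept as page counts, since the description says the limit is expressed in 8K pages.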
Because the error shows up right before each backup failure, the two looked correlated. To fix it, you’ll need to raise the heap limit from the default of 0x00030000 to a larger size. Here are the commands to do that:
.server_config server_2 -v "param kernel mallocHeapLimit=0x40000" (to change the value)
.server_config server_2 -v "param kernel" (will list the kernel parameters)
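For a sense of scale, the error description says the limit is counted in 8K pages, so the hex values translate directly into memory sizes. A small Python sketch of that arithmetic follows; it assumes “8K” means 8192 bytes, and the helper name `pages_to_mib` is just for illustration.

```python
PAGE_SIZE_BYTES = 8 * 1024  # assumption: "8K pages" means 8192-byte pages


def pages_to_mib(pages):
    """Convert a count of 8K pages to mebibytes."""
    return pages * PAGE_SIZE_BYTES / (1024 * 1024)


default_limit = 0x30000  # default mallocHeapLimit
new_limit = 0x40000      # value set by the command above

print(f"default: {default_limit} pages = {pages_to_mib(default_limit):.0f} MiB")
print(f"new:     {new_limit} pages = {pages_to_mib(new_limit):.0f} MiB")
# default: 196608 pages = 1536 MiB
# new:     262144 pages = 2048 MiB
```

Under that assumption, the change raises the limit from 1.5 GiB to 2 GiB of dynamically allocated memory.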
Below is a list of all the kernel parameters:
Name Location Current Default
---- ---------- ---------- ----------
kernel.AutoconfigDriverFirst 0x0003b52d30 0x00000000 0x00000000
kernel.BufferCacheHitRatio 0x0002093108 0x00000050 0x00000050
kernel.MSIXdebug 0x0002094714 0x00000001 0x00000001
kernel.MSIXenable 0x000209471c 0x00000001 0x00000001
kernel.MSI_NoStop 0x0002094710 0x00000001 0x00000001
kernel.MSIenable 0x0002094718 0x00000001 0x00000001
kernel.MsiRouting 0x0002094724 0x00000001 0x00000001
kernel.WatchDog 0x0003aeb4e0 0x00000001 0x00000001
kernel.autoreboot 0x0003a0aefc 0x00000258 0x00000258
kernel.bcmTimeoutFix 0x0002179920 0x00000002 0x00000002
kernel.buffersWatermarkPercentage 0x0003ae964c 0x00000021 0x00000021
kernel.bufreclaim 0x0003ae9640 0x00000001 0x00000001
kernel.canRunRT 0x000208f7a0 0xffffffff 0xffffffff
kernel.dumpcompress 0x000208f794 0x00000001 0x00000001
kernel.enableFCFastInit 0x00022c29d4 0x00000001 0x00000001
kernel.enableWarmReboot 0x000217ee68 0x00000001 0x00000001
kernel.forceWholeTLBflush 0x00039d0900 0x00000000 0x00000000
kernel.heapHighWater 0x00020930c8 0x00004000 0x00004000
kernel.heapLowWater 0x00020930c4 0x00000080 0x00000080
kernel.heapReserve 0x00020930c0 0x00022e98 0x00022e98
kernel.highwatermakpercentdirty 0x00020930e0 0x00000064 0x00000064
kernel.lockstats 0x0002093128 0x00000001 0x00000001
kernel.longLivedChunkSize 0x0003a23ed0 0x00002710 0x00002710
kernel.lowwatermakpercentdirty 0x0003ae9654 0x00000000 0x00000000
kernel.mallocHeapLimit 0x0003b5558c 0x00040000 0x00030000 (This is the parameter I changed)
kernel.mallocHeapMaxSize 0x0003b55588 0x00000000 0x00000000
kernel.maskFcProc 0x0002094728 0x00000004 0x00000004
kernel.maxSizeToTryEMM 0x0003a23f50 0x00000008 0x00000008
kernel.maxStrToBeProc 0x0003b00f14 0x00000080 0x00000080
kernel.memSearchUsecs 0x000208fa28 0x000186a0 0x000186a0
kernel.memThrottleMonitor 0x0002091340 0x00000001 0x00000001
kernel.outerLoop 0x0003a0b508 0x00000001 0x00000001
kernel.panicOnClockStall 0x0003a0cf30 0x00000000 0x00000000
kernel.pciePollingDefault 0x00020948a0 0x00000001 0x00000001
kernel.percentOfFreeBufsToFreePerIter 0x00020930cc 0x0000000a 0x0000000a
kernel.periodicSyncInterval 0x00020930e4 0x00000005 0x00000005
kernel.phTimeQuantum 0x0003b86e18 0x000003e8 0x000003e8
kernel.priBufCache.ReclaimPolicy 0x00020930f4 0x00000001 0x00000001
kernel.priBufCache.UsageThreshold 0x00020930f0 0x00000032 0x00000032
kernel.protect_zero 0x0003aeb4e8 0x00000001 0x00000001
kernel.remapChunkSize 0x0003a23fd0 0x00000080 0x00000080
kernel.remapConfig 0x000208fe40 0x00000002 0x00000002
kernel.retryTLBflushIPI 0x00020885b0 0x00000001 0x00000001
kernel.roundRobbin 0x0003a0b504 0x00000001 0x00000001
kernel.setMSRs 0x0002088610 0x00000001 0x00000001
kernel.shutdownWdInterval 0x0002093238 0x0000000f 0x0000000f
kernel.startAP 0x0003aeb4e4 0x00000001 0x00000001
kernel.startIdleTime 0x0003aeb570 0x00000001 0x00000001
kernel.stream.assert 0x0003b00060 0x00000000 0x00000000
kernel.switchStackOnPanic 0x000208f8e0 0x00000001 0x00000001
kernel.threads.alertOptions 0x0003a22bf4 0x00000000 0x00000000
kernel.threads.maxBlockedTime 0x000208f948 0x00000168 0x00000168
kernel.threads.minimumAlertBlockedTime 0x000208f94c 0x000000b4 0x000000b4
kernel.threads.panicIfHung 0x0003a22bf0 0x00000000 0x00000000
kernel.timerCallbackHistory 0x000208f780 0x00000001 0x00000001
kernel.timerCallbackTimeLimitMSec 0x000208f784 0x00000003 0x00000003
kernel.trackIntrStats 0x000209021c 0x00000001 0x00000001
kernel.usePhyDevName 0x0002094720 0x00000001 0x00000001
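With this many parameters, it is easy to miss the one that differs from its default. As a rough sketch, the listing can be saved to text and filtered with a few lines of Python. It assumes the four-column layout shown above (Name, Location, Current, Default) and that every data row starts with `kernel.`; the function name `changed_params` is hypothetical.

```python
def changed_params(listing):
    """Yield (name, current, default) for rows whose Current differs from Default."""
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with "kernel." and have at least four columns;
        # this skips the header, the dashed rule, and blank lines.
        if len(fields) < 4 or not fields[0].startswith("kernel."):
            continue
        # Anything after the fourth column (such as a trailing note) is ignored.
        name, _location, current, default = fields[:4]
        if int(current, 16) != int(default, 16):
            yield name, current, default


sample = """\
Name Location Current Default
---- ---------- ---------- ----------
kernel.heapReserve 0x00020930c0 0x00022e98 0x00022e98
kernel.mallocHeapLimit 0x0003b5558c 0x00040000 0x00030000
kernel.mallocHeapMaxSize 0x0003b55588 0x00000000 0x00000000
"""

for name, current, default in changed_params(sample):
    print(f"{name}: current={current} default={default}")
# kernel.mallocHeapLimit: current=0x00040000 default=0x00030000
```

Run against the full listing above, only the mallocHeapLimit row should be reported, since it is the only one where Current and Default differ.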