• Server lock ups, any ideas?

    By Doug Finner 4 years ago

    If Audit Manager is running on a server and the backup system (Arcserve for us) attempts to back up auditmanager.nsf, there is file lock contention and the server starts throwing semaphore errors. We dropped this file from the backup and the problem went away. We also excluded ntrigger.dll and notes.ini just to be totally safe.

    Time passes... We migrated the server to a VM environment. We are now using Networker as the backup tool, and I've been assured that these files are excluded from the backup schema. We have also verified that shadow copies are not being created.

    We're on AM 2.0.0 and have about 180 audits configured (some as standalone DBs, some with the audits stored in the source DB). The server runs Windows Server 2008 R2.

    The semaphore errors are now back, occurring once or twice a month. The Windows logs don't provide much guidance. The backup guy swears he's excluding the required files.

    Is there anything within the AM environment that could be causing file contention on AuditManager.nsf? Have we exceeded the capacity of AM, so that it's just killing itself?

    Any ideas would be greatly appreciated. This tool has been a HUGE help with our regulatory compliance and with debugging workflow issues. Thanks in advance for any tips on how to proceed.

    • By Doug Finner 4 years ago

      Follow up - I re-read the thread from 4 years back.

      The auditmanager.nsf DB is FTIndexed, but I don't see the indexer doing any routine updates, and it has never been linked to a crash. I have deleted the FTI just in case.

      There don't seem to be any major agent runs associated with the crashes. The log file shows a variety of events just prior to each lockup, and none of the events seem to be linked to AM.nsf.

      Updall runs at 2 am; lockups seem to happen between 3 and 8 am most often.
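
      One way to test the "between 3 and 8 am" impression is to tally console-log timestamps by hour. The following is only a minimal sketch: `hour_histogram` and `SAMPLE` are hypothetical names, the timestamp format is assumed to match the console log in this post, and in practice you would feed it one timestamped line per lockup rather than the two sample lines used here.

```python
import re
from collections import Counter
from datetime import datetime

# Console timestamps in this post look like "08/21/2015 03:25:36 AM".
STAMP_RE = re.compile(r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M")


def hour_histogram(lines):
    """Count how many of the given log lines fall into each hour of the day (0-23)."""
    hours = Counter()
    for line in lines:
        match = STAMP_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(0), "%m/%d/%Y %I:%M:%S %p")
            hours[stamp.hour] += 1
    return hours


# Two lines excerpted from the console log in this post.
SAMPLE = [
    r"[09D4:0005-106C] 08/21/2015 03:25:36 AM Domain Indexer is indexing Database 'Dev\CPCRRelRevAE.nsf'",
    r"[0FB8:0057-0CA8] 08/21/2015 03:26:11 AM Opened session for George User/Kollsman (Release 8.5.3)",
]

if __name__ == "__main__":
    for hour, count in sorted(hour_histogram(SAMPLE).items()):
        print(f"{hour:02d}:00  {count}")
```

      If the lockups cluster in the hour or two after the 2 am Updall run, that would point at overnight maintenance tasks rather than at user activity.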

      Here is the console log output for the most recent lockup.

      [09D4:0005-106C] 08/21/2015 03:25:36 AM Domain Indexer is indexing Database 'Dev\CPCRRelRevAE.nsf'

      [09D4:0005-106C] 08/21/2015 03:25:36 AM Domain Indexer finished indexing Database 'Dev\CPCRRelRevAE.nsf'
      [09D4:0005-106C] 08/21/2015 03:25:36 AM Domain Indexer is indexing Database 'Kmc2\Cascade\Processes\Archives\CascadeCRDn.NSF'
      [09D4:0005-106C] 08/21/2015 03:25:36 AM Domain Indexer finished indexing Database 'Kmc2\Cascade\Processes\Archives\CascadeCRDn.NSF'
      [09D4:0005-106C] 08/21/2015 03:25:36 AM Domain Indexer is indexing Database 'Dev\CMArc.nsf'
      [09D4:0005-106C] 08/21/2015 03:25:36 AM Domain Indexer finished indexing Database 'Dev\CMArc.nsf'
      [0FB8:0057-0CA8] 08/21/2015 03:26:11 AM Opened session for George User/Kollsman (Release 8.5.3)
      ti=“0028E7DE-85257EA8” sq=“00032267” THREAD [0FB8:0057-16CC] WAITING FOR WRITE LOCK ON FRWSEM 0x02B0 Zero-user database cache (@0BFE209C) (R=0,W=1,WRITER=2C7C:2CF0,1STREADER=0000:0000) FOR 30000 ms
      ti=“0028E7E3-85257EA8” sq=“00032268” THREAD [0FB8:0043-1688] WAITING FOR WRITE LOCK ON FRWSEM 0x02B0 Zero-user database cache (@0BFE209C) (R=0,W=1,WRITER=2C7C:2CF0,1STREADER=0000:0000) FOR 30000 ms
      ti=“0028E7E9-85257EA8” sq=“00032269” THREAD [1368:0002-1364] WAITING FOR READ LOCK ON FRWSEM 0x4245 open database queue semaphore (@0C380348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms
      ti=“0028E825-85257EA8” sq=“0003226B” THREAD [0764:000D-0D94] WAITING FOR READ LOCK ON FRWSEM 0x4245 open database queue semaphore (@080B0348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms
      ti=“0028E82E-85257EA8” sq=“0003226C” THREAD [1F44:0002-2FE4] WAITING FOR READ LOCK ON FRWSEM 0x4245 open database queue semaphore (@07EF0348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms
      ti=“0028EA19-85257EA8” sq=“0003226E” THREAD [09D4:0008-168C] WAITING FOR READ LOCK ON FRWSEM 0x4245 open database queue semaphore (@080B0348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms
      ti=“0028EA29-85257EA8” sq=“0003226F” THREAD [1218:0002-1294] WAITING FOR READ LOCK ON FRWSEM 0x4245 open database queue semaphore (@0CA70348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms
      ti=“0028EC05-85257EA8” sq=“00032270” THREAD [0FB8:0061-0D50] WAITING FOR WRITE LOCK ON FRWSEM 0x4245 open database queue semaphore (@0BFE0348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms
      ti=“0028EC54-85257EA8” sq=“00032271” THREAD [0FB8:0019-13C0] WAITING FOR WRITE LOCK ON FRWSEM 0x4245 open database queue semaphore (@0BFE0348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms
      LkMgr BEGIN Long Held Lock Dump ——————
      [0FB8:0051-169C] Lock(Mode=IS * LockID(DB DB=D:\Domino\Data\auditmanager.nsf)) Waiters countNonIntentLocks = 0 countIntentLocks = 3, queuLength = 1
      [0FB8:0051-169C] Req(Status=Granted Mode=IS Class=Commit Nest=0 Cnt=1(Hold - count should be 1)
      [0FB8:0051-169C] Tran=0 Func=N/A dbopen.c:4836 [0FB8:0057-16CC])

      [0FB8:0051-169C] rm_lkmgr_cpp:2070
      [0FB8:0051-169C] rm_lkmgr_cpp:1306
      [0FB8:0051-169C] nsfsem1_c:533
      [0FB8:0051-169C] dbfileio_c:1045
      [0FB8:0051-169C] ntlock_c:62
      [0FB8:0051-169C] nsfsem3_c:2353
      [0FB8:0051-169C] Req(Status=Converting Mode=IS CnvMode=X Class=Manual Nest=0 Cnt=2
      [0FB8:0051-169C] Tran=0 Func=N/A m\lkmgr.cpp:159 [2C7C:0002-2CF0] Delay=0min)

      [0FB8:0051-169C] rm_lkmgr_cpp:2070
      [0FB8:0051-169C] rm_lkmgr_cpp:1306
      [0FB8:0051-169C] nsfsem1_c:169
      [0FB8:0051-169C] nsfsem1_c:139
      [0FB8:0051-169C] LkMgr END Long Held Lock Dump ——————
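
      To make a dump like the one above easier to read, the wait lines can be grouped by semaphore and holder. This is a rough sketch, not a Domino tool: the function names and `SAMPLE` are hypothetical, the regular expressions are fitted only to the lines in this post, and it assumes that a waiter shown as `[0FB8:0057-16CC]` is the same thread as a `WRITER=0FB8:16CC` (process ID plus native thread ID).

```python
import re
from collections import defaultdict

# Fitted to the "WAITING FOR ... LOCK ON FRWSEM" lines in this post.
WAIT_RE = re.compile(
    r"THREAD \[(?P<waiter>[0-9A-F]{4}:[0-9A-F]{4}-[0-9A-F]{4})\] "
    r"WAITING FOR (?P<mode>READ|WRITE) LOCK ON FRWSEM "
    r"(?P<sem_id>0x[0-9A-F]+) (?P<sem_name>.+?) \(@[0-9A-F]+\) "
    r"\(R=\d+,W=\d+,WRITER=(?P<writer>[0-9A-F]{4}:[0-9A-F]{4}),"
)
# Fitted to the "LockID(DB DB=...)" line in the lock manager dump.
LOCKID_RE = re.compile(r"LockID\(DB DB=(?P<path>[^)]+)\)")


def summarize_waits(lines):
    """Group waiting threads by (semaphore id, semaphore name, current writer)."""
    groups = defaultdict(list)
    for line in lines:
        m = WAIT_RE.search(line)
        if m:
            key = (m["sem_id"], m["sem_name"], m["writer"])
            groups[key].append((m["waiter"], m["mode"]))
    return dict(groups)


def holder_key(waiter):
    """Turn '0FB8:0057-16CC' into '0FB8:16CC' (assumed process:native-thread form)."""
    process, rest = waiter.split(":")
    return f"{process}:{rest.split('-')[1]}"


def blocked_holders(groups):
    """Return writers that are themselves waiting on some other semaphore."""
    waiting_on = {}
    for (_sem_id, sem_name, writer), waiters in groups.items():
        for waiter, _mode in waiters:
            waiting_on[holder_key(waiter)] = (sem_name, writer)
    return {w: waiting_on[w] for (_, _, w) in groups if w in waiting_on}


def locked_databases(lines):
    """Collect database paths named in LockID(...) entries."""
    paths = set()
    for line in lines:
        m = LOCKID_RE.search(line)
        if m:
            paths.add(m["path"])
    return sorted(paths)


# Lines excerpted from the log above, trimmed to the part the patterns read.
SAMPLE = [
    r"THREAD [0FB8:0057-16CC] WAITING FOR WRITE LOCK ON FRWSEM 0x02B0 Zero-user database cache (@0BFE209C) (R=0,W=1,WRITER=2C7C:2CF0,1STREADER=0000:0000) FOR 30000 ms",
    r"THREAD [1368:0002-1364] WAITING FOR READ LOCK ON FRWSEM 0x4245 open database queue semaphore (@0C380348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms",
    r"THREAD [0FB8:0061-0D50] WAITING FOR WRITE LOCK ON FRWSEM 0x4245 open database queue semaphore (@0BFE0348) (R=0,W=1,WRITER=0FB8:16CC,1STREADER=0000:0000) FOR 30000 ms",
    r"[0FB8:0051-169C] Lock(Mode=IS * LockID(DB DB=D:\Domino\Data\auditmanager.nsf)) Waiters countNonIntentLocks = 0 countIntentLocks = 3, queuLength = 1",
]

if __name__ == "__main__":
    groups = summarize_waits(SAMPLE)
    for (sem_id, name, writer), waiters in groups.items():
        print(f"{sem_id} {name}: held by {writer}, {len(waiters)} waiting")
    for writer, (name, holder) in blocked_holders(groups).items():
        print(f"{writer} is itself waiting on '{name}' held by {holder}")
    print("Databases in lock dump:", locked_databases(SAMPLE))
```

      If the thread-ID assumption holds, the excerpt shows that the thread holding the open database queue semaphore (0FB8:16CC) is itself waiting on the zero-user database cache held by 2C7C:2CF0, the same thread the lock dump lists as converting its lock on auditmanager.nsf. Identifying which task owns process 2C7C may be the most useful next step.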