Agents frequently fail over to another MOM 2005 Management Server

Microsoft Knowledge Base Article

The contents of this article are Microsoft copyrighted material.
©2005-2007 Microsoft Corporation. All rights reserved.

Article ID: 892920 - Last Review: July 24, 2007 - Revision: 2.5

Hotfix download available
View and request hotfix downloads


This article replaces Microsoft Knowledge Base article 892140.

Important This article contains information about how to modify the registry. Make sure to back up the registry before you modify it. Make sure that you know how to restore the registry if a problem occurs. For more information about how to back up, restore, and modify the registry, click the following article number to view the article in the Microsoft Knowledge Base:
256986 (http://kbalertz.com/Feedback.aspx?kbNumber=256986/) Description of the Microsoft Windows registry

SYMPTOMS

You experience one or more of the following symptoms:
  • Symptom 1

    On a source Microsoft Operations Manager (MOM) 2005 Management Server, the Problem State of an alert that changes from Active to Inactive appears as Active on the destination Management Server.

    This symptom occurs when one of the following conditions is true:
    • You are using a MOM 2005 product connector. The alert status is not forwarded to the destination Management Server.
    • You are using an alert rule with a response, such as a script response. The alert rule may have any response defined.
  • Symptom 2

    MOM agents frequently fail over to another Management Server and fail back to the primary Management Server. Agents fail over before they retry communication to the primary Management Server. When this symptom occurs, events 21249 and 21250 are logged in the Application log. Event 21249 indicates that the agent failed over to another Management Server. Event 21250 indicates that the agent has re-established a connection to the primary Management Server.
  • Symptom 3

    After you install hotfix 892140, hotfix 885416 is removed from the Management Server.
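
To gauge how often Symptom 2 occurs, the 21249 and 21250 events can be paired into failover/failback cycles. The following is a minimal sketch, assuming the Application log entries have already been exported as (timestamp, event ID) pairs. The function name, the sample data, and the one-minute threshold are illustrative assumptions and are not part of MOM.

```python
from datetime import datetime, timedelta

FAILOVER_EVENT_ID = 21249   # agent failed over to another Management Server
FAILBACK_EVENT_ID = 21250   # agent reconnected to the primary Management Server


def pair_failover_cycles(records):
    """Pair each 21249 event with the next 21250 event.

    `records` is an iterable of (timestamp, event_id) tuples in any order.
    Returns a list of (failover_time, failback_time) tuples.
    """
    cycles = []
    pending_failover = None
    for timestamp, event_id in sorted(records):
        if event_id == FAILOVER_EVENT_ID:
            # Remember the earliest failover that is still unmatched.
            if pending_failover is None:
                pending_failover = timestamp
        elif event_id == FAILBACK_EVENT_ID and pending_failover is not None:
            cycles.append((pending_failover, timestamp))
            pending_failover = None
    return cycles


# Hypothetical sample data, not taken from a real log.
sample = [
    (datetime(2005, 5, 2, 9, 0, 5), 21249),
    (datetime(2005, 5, 2, 9, 0, 40), 21250),
    (datetime(2005, 5, 2, 9, 7, 12), 21249),
    (datetime(2005, 5, 2, 9, 7, 30), 21250),
    (datetime(2005, 5, 2, 9, 8, 0), 1000),   # unrelated event, ignored
]

cycles = pair_failover_cycles(sample)
short_cycles = [c for c in cycles if c[1] - c[0] < timedelta(minutes=1)]
print(f"{len(cycles)} failover/failback cycles, {len(short_cycles)} shorter than one minute")
```

Many short cycles in a brief period suggest the flapping behavior described in Symptom 2 rather than a sustained outage of the primary Management Server.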

CAUSE

Symptom 1 occurs because of a problem in the MOM alert processing workflow. Currently, alerts are filtered before they are processed. Therefore, a new alert may be discarded before the alert is processed.

For example, the alert filtering node discards the updated alert after the following sequence of events:
  1. An event processing rule creates an alert.
  2. The alert is processed by the alert filtering node of the workflow.
  3. The alert's last known state is saved.
  4. The alert processing node of the workflow sees that the client-side responses are updated and puts the alert at the start of the workflow.
  5. The alert is processed again by the alert workflow. However, the alert filtering node processes the alert as a new alert that has the same state as the first alert that was processed.
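
The sequence above can be illustrated with a toy model. This sketch is not MOM code: the class, the function, and the dictionary keys are hypothetical, and the model assumes only what the steps state, namely that the filtering node runs before processing, saves the last known state, and discards an alert that arrives with the same state.

```python
class AlertFilterNode:
    """Toy model: drops any alert whose state matches the last state it saved."""

    def __init__(self):
        self.last_known_state = {}   # alert id -> state saved on the previous pass

    def accept(self, alert):
        alert_id, state = alert["id"], alert["state"]
        if self.last_known_state.get(alert_id) == state:
            return False             # treated as a duplicate and discarded
        self.last_known_state[alert_id] = state
        return True


def run_workflow(alert, filter_node, forwarded):
    """Filter first, then process; re-queue once if a response updated the alert."""
    queue = [alert]
    while queue:
        current = queue.pop(0)
        if not filter_node.accept(current):
            continue                                  # step 5: updated alert is lost
        if current.get("response_pending"):
            # Step 4: the response updated the alert, so it restarts the workflow.
            updated = dict(current, response_pending=False, response_done=True)
            queue.append(updated)
        else:
            forwarded.append(current)


forwarded = []
node = AlertFilterNode()
run_workflow({"id": "A1", "state": "Active", "response_pending": True}, node, forwarded)
print(forwarded)   # the updated alert never reaches the forwarding step
```

In this model, an alert from a rule without a response passes the filter once and is forwarded, which matches the article's statement that the symptom is tied to alert rules that have a response defined.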

Symptom 2 may occur on networks that have intermittent connectivity problems.

RESOLUTION

Service pack information

To resolve this problem, obtain the latest service pack for Microsoft Operations Manager 2005. For more information, click the following article number to view the article in the Microsoft Knowledge Base:
905416 (http://kbalertz.com/Feedback.aspx?kbNumber=905416/) How to obtain the latest Microsoft Operations Manager 2005 service pack

Hotfix information

A supported hotfix is available from Microsoft. However, this hotfix is intended to correct only the problem that is described in this article. Apply this hotfix only to systems that are experiencing this specific problem. This hotfix might receive additional testing. Therefore, if you are not severely affected by this problem, we recommend that you wait for the next software update that contains this hotfix.

If the hotfix is available for download, there is a "Hotfix download available" section at the top of this Knowledge Base article. If this section does not appear, contact Microsoft Customer Service and Support to obtain the hotfix.

Note If additional issues occur or if any troubleshooting is required, you might have to create a separate service request. The usual support costs will apply to additional support questions and issues that do not qualify for this specific hotfix. For a complete list of Microsoft Customer Service and Support telephone numbers or to create a separate service request, visit the following Microsoft Web site:
http://support.microsoft.com/contactus/?ws=support
Note The "Hotfix download available" form displays the languages for which the hotfix is available. If you do not see your language, it is because a hotfix is not available for that language.

Prerequisites

No prerequisites are required.

Restart requirement

This hotfix stops and then restarts the MOM service.

Hotfix replacement information

This hotfix replaces hotfix 892140.

File information

The English version of this hotfix has the file attributes (or later file attributes) that are listed in the following table. The dates and times for these files are listed in Coordinated Universal Time (UTC). When you view the file information, it is converted to local time. To find the difference between UTC and local time, use the Time Zone tab in the Date and Time tool in Control Panel.
   Date         Time   Version            Size    File name           Platform
   ------------------------------------------------------------------------------
   29-Apr-2005  14:39                     51,944  Eemengine.tlb
   29-Apr-2005  14:39  5.0.2749.20       899,824  Momactions.dll
   29-Apr-2005  14:39  5.0.2749.20     1,530,608  Momdbconnector.dll
   29-Apr-2005  14:39  5.0.2749.20     1,629,936  Momengine.dll

   25-Apr-2005  18:52                     51,944  Eemengine.tlb       IA-64
   25-Apr-2005  18:52  5.0.2749.20     3,186,928  Momactions.dll      IA-64
   25-Apr-2005  18:52  5.0.2749.20     5,997,296  Momengine.dll       IA-64
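
Because the table lists UTC times, a file's timestamp on a given computer will differ by that computer's UTC offset. The following sketch shows the conversion for one table entry. The function name and the UTC-7 offset are illustrative assumptions, and a fixed offset ignores daylight saving time rules.

```python
from datetime import datetime, timedelta, timezone


def table_time_to_local(date_text, time_text, utc_offset_hours):
    """Convert a UTC date/time from the file table to a fixed-offset local time."""
    # %b parses the English month abbreviation under the default C locale.
    utc_time = datetime.strptime(f"{date_text} {time_text}", "%d-%b-%Y %H:%M")
    utc_time = utc_time.replace(tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return utc_time.astimezone(local_zone)


# Momengine.dll (first table group), viewed on a machine at UTC-7.
local = table_time_to_local("29-Apr-2005", "14:39", -7)
print(local.strftime("%d-%b-%Y %H:%M"))   # 29-Apr-2005 07:39
```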

Hotfix installation information

Warning Serious problems might occur if you modify the registry incorrectly by using Registry Editor or by using another method. These problems might require that you reinstall your operating system. Microsoft cannot guarantee that these problems can be solved. Modify the registry at your own risk.

Install this hotfix on all MOM 2005 Management Servers and manually-installed agents. To do this, follow these steps:
  1. Copy the MOM2005-RTM-KB892920-X86-IA64-ENU.exe file to a temporary folder on the Management Server or on the manually-installed agent.
  2. Run the MOM2005-RTM-KB892920-X86-IA64-ENU.exe file, and then click Apply Microsoft Operations Manager 2005 Software Update. Make sure that you review the Release Notes before you install this hotfix. To review the Release Notes, click Review this Software Update Release Notes in the hotfix installer wizard.
After you install this hotfix, automatically-installed agents are added to the Pending Actions node of the MOM Administrator console. The Pending Action value is Requires Patching. Approve these agents, and then run a computer discovery cycle to install the hotfix on these agents. To do this, follow these steps:
  1. In the MOM 2005 Administrator console, expand Microsoft Operations Manager, expand Administration, and then click Pending Actions.
  2. Right-click a computer, and then click Approve for Processing by Computer Discovery.

After you install this hotfix, transient network issues no longer cause failover and failback loops. You can configure how many times the agent tries to connect to the primary Management Server and how long it waits between attempts. The agent fails over only if it still cannot connect after the last attempt. By default, the agent tries to connect three times at 1000-millisecond intervals.
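
The retry behavior described above can be sketched as follows. This is an illustration of the general retry-before-failover pattern, not the agent's actual implementation. The function and parameter names are hypothetical, and the sketch assumes that "three times" means three connection attempts in total.

```python
import time

DEFAULT_RETRY_ATTEMPTS = 3        # default stated in the article
DEFAULT_RETRY_INTERVAL_MS = 1000  # default stated in the article


def choose_server(try_connect, primary, secondary,
                  attempts=DEFAULT_RETRY_ATTEMPTS,
                  interval_ms=DEFAULT_RETRY_INTERVAL_MS,
                  sleep=time.sleep):
    """Return the server to use: the primary if any attempt succeeds, else the secondary."""
    for attempt in range(attempts):
        if try_connect(primary):
            return primary
        if attempt < attempts - 1:
            sleep(interval_ms / 1000.0)   # wait before the next attempt
    return secondary


# Hypothetical transient outage: the first two attempts fail, the third succeeds.
outcomes = iter([False, False, True])
waits = []
chosen = choose_server(lambda server: next(outcomes), "primary-ms", "secondary-ms",
                       sleep=waits.append)
print(chosen, waits)   # primary-ms [1.0, 1.0]
```

With retries in place, a two-second network interruption in this example no longer triggers a failover, which is the effect the hotfix is described as having.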

To configure these settings, create and configure the ConnectRetryAttempts value and the ConnectRetryIntervalMS value under the following registry subkey:
HKEY_LOCAL_MACHINE\SOFTWARE\Mission Critical Software\OnePoint\Configurations\<ManagementGroup>\Operations\Agent\Consolidators
Note <ManagementGroup> is the name of your MOM management group.

To create these values, follow these steps:
  1. Click Start, click Run, type regedit, and then click OK.
  2. Locate and then right-click the following registry subkey:
    HKEY_LOCAL_MACHINE\SOFTWARE\Mission Critical Software\OnePoint\Configurations\<ManagementGroup>\Operations\Agent\Consolidators
  3. Point to New, and then click DWORD Value.
  4. Type ConnectRetryAttempts, and then press ENTER.
  5. Double-click ConnectRetryAttempts, and then type the number of times that the agent must retry the connection to the primary Management Server before it fails over. By default, the agent retries the connection three times.
  6. Click OK.
  7. Right-click the Consolidators registry key, point to New, and then click DWORD Value.
  8. Type ConnectRetryIntervalMS, and then press ENTER.
  9. Double-click ConnectRetryIntervalMS, and then type the number of milliseconds that the agent must wait before it retries the connection. By default, the agent waits 1000 milliseconds.
  10. Click OK.
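
The same two values could also be written by a script instead of Registry Editor. The following is a minimal sketch that uses Python's standard winreg module. It assumes that the Consolidators subkey already exists and that the script runs with administrative rights on the agent computer. The function names and the "ExampleGroup" management group name are placeholders. Back up the registry before you run anything like this.

```python
SUBKEY_TEMPLATE = (
    r"SOFTWARE\Mission Critical Software\OnePoint\Configurations"
    r"\{group}\Operations\Agent\Consolidators"
)


def consolidators_subkey(management_group):
    """Build the Consolidators subkey path for the given management group."""
    return SUBKEY_TEMPLATE.format(group=management_group)


def set_retry_settings(management_group, attempts=3, interval_ms=1000):
    """Write both DWORD values; needs Windows and administrative rights."""
    import winreg  # Windows-only standard library module

    subkey = consolidators_subkey(management_group)
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey, 0,
                        winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "ConnectRetryAttempts", 0, winreg.REG_DWORD, attempts)
        winreg.SetValueEx(key, "ConnectRetryIntervalMS", 0, winreg.REG_DWORD, interval_ms)


# "ExampleGroup" is a placeholder management group name.
print(consolidators_subkey("ExampleGroup"))
```

Calling set_retry_settings("ExampleGroup", attempts=5, interval_ms=2000) would set five attempts at two-second intervals. Passing integers avoids the hexadecimal or decimal base choice that Registry Editor presents when you edit a DWORD value.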

STATUS

Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section. This problem was first corrected in Microsoft Operations Manager 2005 Service Pack 1.

MORE INFORMATION

Symptom 1 is discussed in the "Forwarded Alerts States are not Being Changed to Inactive" section of the hotfix Release Notes. However, the workaround that is discussed in the Release Notes is incorrect for this problem.

For more information about other problems that are resolved by Microsoft Operations Manager 2005 Service Pack 1, click the following article number to view the article in the Microsoft Knowledge Base:
905420 (http://kbalertz.com/Feedback.aspx?kbNumber=905420/) List of bugs that are fixed in Microsoft Operations Manager 2005 service packs
For more information about the terminology that is used to describe software updates, click the following article number to view the article in the Microsoft Knowledge Base:
824684 (http://kbalertz.com/Feedback.aspx?kbNumber=824684/) Description of the standard terminology that is used to describe Microsoft software updates

APPLIES TO
  • Microsoft Operations Manager (MOM) 2005
Keywords: 
kbhotfixserver kbautohotfix atdownload kbgetsp kbbug kbnetwork kbopmaneventmgmt kbopmanalerts kbopman2005bug kbqfe kbevent kbperformance kbfix KB892920
       
