DoIT Operational Framework – Section 6.0 - Event Management
Purpose – Event Management is the procedural framework by which event monitoring is organized and executed. This section has two goals:
- Establish and document the monitoring framework we use for DoIT and the campus subscribers that use DoIT services.
- Describe our event management process: specifically, its process methodology, the collection and use of data gathered by the monitors, and the integration of event monitoring with other ITIL processes.
6.2 Roles and Responsibilities
Technical & Application management - Technicians (such as Systems Engineering (SE)) and developers (such as Academic Navigator or Data Resource Management Technology (DRMT)) may identify monitoring requirements. They may also generate events from their analysis applications (such as Nagios or Oracle Enterprise Manager) that, in turn, are used to create Event
IT Operations Management - The Duty Manager role is filled by managers who are on call on a rotating basis. When a situation exceeds the scope of documented Event Management procedures, or has enough impact to warrant higher management attention, the Duty Manager is called to provide guidance.
Systems Network Control Center (SNCC) - The 24x7 staff who take action on problems requiring elevation. They create problem entries in WiscIT for the automated events they receive on their Consolidated Console and their FIDO console, and for problems transferred to them from the Help Desk.
Systems Management Event Monitoring Team - Developers and Administrators of the enterprise Event Management applications.
Event Management Subgroup – A forum meeting bi-weekly to address event management requirements and issues.
6.3 Event Management Framework
As defined in Section 2 of the Operational Framework, an event is a change of state that has significance for the management of an IT service or other configuration item (CI). At the lowest level, events provide information to help manage the day-to-day operation of IT services. This section does not discuss the role events play at that level, as it is an operations management issue. The event processes identified here focus on events that have a higher likelihood of indicating an incident and/or problem. Handling of such events may range from 1) simply sending an automated e-mail to increase awareness of an event, to 2) adding direct contact notifications for an event, to 3) invoking the Incident/Problem process procedures of Section 4.0 of the Operational Framework.
6.3.1 Event Sources
Events can originate from a variety of sources:
- Dedicated event monitoring applications like Micro Focus Virtual User Generator (VuGen) or agent software of the event management server
- Operations management systems like Nagios XI or Oracle Enterprise Manager
- On-demand cloud computing platforms via REST APIs
6.3.2 Event Format for WiscIT
All monitoring events should have a reference Configuration Item (CI) in the Configuration Management Database (CMDB). The CI entry documents the support and notification information for the monitoring event and is the official reference source for the SNCC when handling events that are elevated to their Consolidated Console view. Events elevated to WiscIT will include such details as addressees for e-mail notifications, the level of notification required (that is, e-mail only, direct contact during working hours only, or direct contact 24x7), and event handling instructions for SNCC operators. The CMDB CI also contains a tab listing all the events received for that CI for archive reference.
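As an illustration of how the notification level stored on a CI could drive handling, here is a minimal Python sketch. The class names, field names, and the working-hours window (Monday to Friday, 08:00 to 17:00) are assumptions made for this example; they are not taken from the WiscIT CMDB schema.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class NotificationLevel(Enum):
    """The three notification levels named in this section."""
    EMAIL_ONLY = "e-mail only"
    DIRECT_WORKING_HOURS = "direct contact during working hours only"
    DIRECT_24X7 = "direct contact 24x7"


@dataclass
class CiSupportInfo:
    """Hypothetical subset of the support data a CI record carries."""
    ci_id: str
    email_addressees: list[str]
    level: NotificationLevel
    handling_instructions: str


def is_working_hours(when: datetime) -> bool:
    # Assumption: working hours are Monday-Friday, 08:00-17:00 local time.
    return when.weekday() < 5 and 8 <= when.hour < 17


def requires_direct_contact(ci: CiSupportInfo, when: datetime) -> bool:
    """Decide whether an operator should contact support staff directly."""
    if ci.level is NotificationLevel.DIRECT_24X7:
        return True
    if ci.level is NotificationLevel.DIRECT_WORKING_HOURS:
        return is_working_hours(when)
    return False  # EMAIL_ONLY: the automated e-mail is sufficient
```

For example, a CI marked for direct contact during working hours only would call for direct contact on a weekday morning but not on a Saturday.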
6.3.3 Event Preprocessing before WiscIT
WiscIT is the main ITSM application that processes elevated events, but events must first be preprocessed by the event management server. This server does the following:
- Collects events from event sources via its own agent, REST API, or e-mail.
- Ensures the event has a valid CI record ID from the WiscIT CMDB.
- Buffers events if the WiscIT application is down.
- Filters out event updates that may be procedurally relevant for preprocessing but are not operationally significant enough for elevation.
- Suppresses event storms so that they do not impact WiscIT.
- Simplifies event correlation for WiscIT (i.e., automates the process so that a dashboard displays only the most recent change of state of an event where possible).
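The preprocessing steps listed above can be sketched in a few lines of Python. This is an illustrative model only, not the event management server's actual implementation: the `Event` fields, the per-CI storm limit and time window, and the choice to silently drop events that lack a valid CI are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ci_id: str        # CI record ID the event refers to
    name: str         # which monitor raised the event
    state: str        # e.g. "OK", "WARNING", "CRITICAL"
    timestamp: float  # seconds


class Preprocessor:
    """Hypothetical stand-in for the event management server."""

    def __init__(self, valid_ci_ids, storm_limit=5, storm_window=60.0):
        self.valid_ci_ids = set(valid_ci_ids)
        self.storm_limit = storm_limit    # assumed: max events per CI per window
        self.storm_window = storm_window  # assumed: window length in seconds
        self.latest_state = {}  # (ci_id, name) -> last accepted state
        self.recent = {}        # ci_id -> timestamps of accepted events
        self.buffer = deque()   # events held while the ITSM application is down

    def accept(self, event, downstream_up=True):
        """Return the events that can be forwarded right now."""
        if event.ci_id not in self.valid_ci_ids:
            return []  # no valid CI record: not forwarded (assumed policy)
        key = (event.ci_id, event.name)
        if self.latest_state.get(key) == event.state:
            return []  # repeat update, not a change of state
        window = self.recent.setdefault(event.ci_id, deque())
        while window and event.timestamp - window[0] > self.storm_window:
            window.popleft()  # forget events older than the window
        if len(window) >= self.storm_limit:
            return []  # event storm: suppress
        window.append(event.timestamp)
        self.latest_state[key] = event.state
        self.buffer.append(event)
        return self.flush() if downstream_up else []

    def flush(self):
        """Hand over everything buffered, oldest first."""
        ready = list(self.buffer)
        self.buffer.clear()
        return ready
```

Tracking only the latest state per CI and monitor is what lets a downstream dashboard show the most recent change of state rather than every intermediate update.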
6.3.4 Event Handling In WiscIT
The WiscIT application receives events from the event management server via a REST API interface. Each event is placed in an event table in WiscIT and handled according to the information in its reference CI. Regardless of any other handling, the event will be viewable thereafter from the event tab of its assigned CI. For nearly all events, an e-mail is also generated upon arrival and sent to the Primary and Secondary administrators identified in the CI record, plus any stakeholders in the CI stakeholders field identified to receive “Changes and Monitoring.” If the event is so deemed by the criteria in section 6.4, it may be elevated to the SNCC consolidated console for action by the SNCC staff per Incident/Problem management guidelines and any additional instructions specified in the Support tab of the event’s CMDB CI.
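The e-mail recipient rule described above can be expressed as a short Python sketch. The record layout (`CiRecord`, `Stakeholder`, and their fields) is hypothetical and is not the actual WiscIT data model; only the rule itself (Primary and Secondary administrators plus stakeholders flagged for "Changes and Monitoring") comes from this section.

```python
from dataclasses import dataclass, field

# Subscription label quoted in this section.
MONITORING_SUBSCRIPTION = "Changes and Monitoring"


@dataclass
class Stakeholder:
    email: str
    subscriptions: set[str] = field(default_factory=set)


@dataclass
class CiRecord:
    """Hypothetical view of the CI fields relevant to event e-mail."""
    ci_id: str
    primary_admin: str
    secondary_admin: str
    stakeholders: list[Stakeholder] = field(default_factory=list)


def event_email_recipients(ci: CiRecord) -> list[str]:
    """Administrators first, then subscribed stakeholders, without duplicates."""
    recipients = [ci.primary_admin, ci.secondary_admin]
    recipients += [
        s.email
        for s in ci.stakeholders
        if MONITORING_SUBSCRIPTION in s.subscriptions
    ]
    # dict.fromkeys removes duplicates while preserving order;
    # empty addresses are skipped.
    return list(dict.fromkeys(r for r in recipients if r))
```

Stakeholders who are listed on the CI but not subscribed to "Changes and Monitoring" are left out of the recipient list.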
6.4 Event Management Requirements Process
Requirements for monitoring are officially submitted via Service Support Initiation or by submitting a WiscIT Monitoring Change Request (see https://kb.wisc.edu/helpdesk/13819). Other Event Management inquiries and requests may be sent by direct e-mail to the Systems Management Monitoring Team (email@example.com).
- DoIT Operational Framework - Section 1.0 - Overview
- DoIT Operational Framework - Section 2.0 - Glossary of Significant Terms
- DoIT Operational Framework - Section 3.0 - Change Management
- DoIT Operational Framework - Section 4.0 - Incident Management
- DoIT Operational Framework - Section 5.0 - Configuration Management
- Working with the Operational Framework (Policy)
- The DoIT Operational Framework, ITIL & Service Management Contacts at DoIT