Page 460 - Handbook of Modern Telecommunications
Network Management and Administration 3-251
FIGURE 3.10.5 Assurance functions. [Figure not reproduced. Recoverable labels: OSS/BSS process and technology integration; fulfillment, assurance, usage; product lifecycle management; service offer monitoring; service monitoring; incident and problem management; service quality management; consolidated operations; domain management; test and diagnostics; fault management; performance management; assurance functionality.]
that a particular domain is performing as expected. Domains are typically structured by the technology
being managed, but may also be separated organizationally or for regulatory reasons.
The most fundamental aspect of resource management is knowing whether the various infrastructure
components are working. This is the role of fault management, which collects and correlates alarms
and other relevant events to provide an accurate view of the health of the infrastructure.
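As an illustration of alarm correlation, the following is a minimal sketch, not a description of any particular fault management product. It assumes a very simple rule: alarms on the same resource that arrive within a fixed time window of each other belong to one underlying fault. The `Alarm` fields, the `correlate` and `health_view` names, and the 60-second window are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    resource: str     # hypothetical identifier of the affected component
    timestamp: float  # seconds since an arbitrary reference point
    severity: str     # e.g. "minor", "major", "critical"


def correlate(alarms, window=60.0):
    """Group alarms per resource; an alarm joins the current group if it
    arrives within `window` seconds of the previous alarm in that group."""
    by_resource = defaultdict(list)
    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        groups = by_resource[alarm.resource]
        if groups and alarm.timestamp - groups[-1][-1].timestamp <= window:
            groups[-1].append(alarm)
        else:
            groups.append([alarm])
    return dict(by_resource)


def health_view(alarms, window=60.0):
    """Summarize each resource by its number of correlated alarm groups."""
    return {res: len(groups) for res, groups in correlate(alarms, window).items()}
```

Real correlation engines typically also use topology and alarm type; the time-window rule here only shows the idea of reducing many raw alarms to a smaller set of probable faults.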
The absence of faults does not necessarily mean that the infrastructure is running properly. The
infrastructure may be functioning yet not performing adequately, because the load on it exceeds
what it can handle. This is where performance management comes in.
Performance management involves the collection and analysis of performance data. The data can be
collected from performance counters in the equipment or in element managers. It may also be collected
from instrumentation added in the form of probes. These probes could be passive (monitoring activities
and taking measurements) or active (simulating a demand for service and measuring the pertinent
response times).
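An active probe can be sketched in a few lines. The example below assumes the simulated demand for service is any Python callable passed in as `request`; the function name `active_probe` and the `samples` parameter are hypothetical and chosen only for illustration.

```python
import time


def active_probe(request, samples=5):
    """Invoke `request` repeatedly and return each response time in seconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        request()  # the simulated demand for service (an assumption)
        timings.append(time.perf_counter() - start)
    return timings
```

A passive probe, by contrast, would only observe traffic that already exists rather than generate requests of its own.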
Real-time performance management monitors the collected data to make sure it stays within prespecified
parameters. Whenever the data crosses a predefined threshold, an event is sent to the fault
management system. Beyond this real-time role, the data is also used to identify trends and create
forecasts.
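Both uses of performance data can be sketched briefly. The code below is a simplified illustration under stated assumptions: a single upper threshold, samples taken at equal intervals, and a straight-line least-squares fit as the forecasting method (the text does not prescribe one). The function names are hypothetical.

```python
def threshold_events(samples, threshold):
    """Return (index, value) for each sample where the value first rises
    above `threshold`; repeated samples above it raise no new event."""
    events = []
    above = False
    for i, value in enumerate(samples):
        if value > threshold and not above:
            events.append((i, value))
        above = value > threshold
    return events


def linear_forecast(samples, steps_ahead):
    """Fit a least-squares line to equally spaced samples and extrapolate
    `steps_ahead` intervals past the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)
```

Raising an event only on the crossing, rather than on every sample above the threshold, keeps the fault management system from being flooded while a condition persists.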
In addition to monitoring the state and performance of the infrastructure, it is also important to be
able to run tests and diagnostics, both to obtain further information that helps explain faults or
poor performance and to verify that all the components respond as they should. This verification
can be a desirable last step of a provisioning process.
Finally, incident and problem management supports the resolution of any incidents and problems.
We use the ITIL terminology here rather than a more traditional telecom terminology since it brings
additional clarity. Incidents are the events that cause or may cause a disruption in a service or its quality.
A problem is the root cause behind one or more actual or potential incidents. For example, an incident
could be “excessive retransmissions on a link”; the underlying problem could be “failing board” or
“heavy rain causing microwave signal degradation.” The goal of incident management is to resolve the incident as
quickly as possible. The goal of problem management is to resolve the problem so that it no longer exists
or, if this is not possible, capture the necessary knowledge so that any incidents that do occur can be
identified and resolved faster.
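The relationship between incidents and problems can be illustrated with a small data model. This is a sketch only: the class and field names (`Problem`, `Incident`, `known_workaround`) and the helper functions are assumptions for the example, not ITIL-defined structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    root_cause: str             # e.g. "failing board"
    known_workaround: str = ""  # captured knowledge for faster resolution
    resolved: bool = False


@dataclass
class Incident:
    description: str                   # e.g. "excessive retransmissions on a link"
    problem: Optional[Problem] = None  # root cause, once diagnosed
    resolved: bool = False


def incidents_for(problem, incidents):
    """Return the incidents attributed to the given problem."""
    return [i for i in incidents if i.problem is problem]


def workaround_for(incident):
    """Return the captured workaround for an incident's problem, if any."""
    if incident.problem is None:
        return ""
    return incident.problem.known_workaround
```

One problem may be linked to many incidents, which is why resolving the problem, or at least recording a workaround against it, shortens the handling of every later incident with the same cause.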