Page 562 - Handbook of Modern Telecommunications

Network Organization and Governance                                        4-93

              •   Data correction
              •   Correlation with manually collected data for some metrics
              These corrections do not change the contract, but they help to resolve single, sporadic deviations.
            They represent a response to customer complaints over short periods of time.
              Reimbursement for noncompliance with SLAs: If noncompliance occurs, measures are agreed
            upon periodically. Review periods should correspond to billing periods. This process step is closely
            linked to accounting.
              Changes in SLAs: When the number of corrections and noncompliance cases exceeds a predefined
            threshold, SLAs should be reviewed and, if necessary, changed.
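              The threshold rule above can be sketched in a few lines of Python. This is only an illustration:
            the source does not say how corrections and noncompliance cases are combined, so the sketch assumes
            they are simply added together, and the names ReviewPeriod and sla_review_needed are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReviewPeriod:
    """Counts gathered over one review (billing) period."""
    corrections: int          # data corrections applied in the period
    noncompliance_cases: int  # SLA violations recorded in the period


def sla_review_needed(period: ReviewPeriod, threshold: int) -> bool:
    """Return True when the combined count exceeds the agreed threshold.

    Assumption: both counts are summed; a real agreement might weight
    them differently or keep one threshold per count.
    """
    return period.corrections + period.noncompliance_cases > threshold


# Hypothetical figures for one billing period
period = ReviewPeriod(corrections=3, noncompliance_cases=4)
review_flag = sla_review_needed(period, threshold=5)
```

              With these made-up figures the combined count is 7, which is above the threshold of 5, so the
            SLA would be flagged for review.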
              SLM requires that multiple QoS metrics be continuously supervised and measured. Depending on
            the agreements between clients and service providers, reports may be generated and distributed, and/or
            the information may be published on Web servers.
              Data sources for SLM include:
              •   Trouble tickets that are generated automatically or are prepared manually
              •   Events that are generated by network elements (managed objects), filtered, modified, and classified
              •   Alarms (SNMP traps or alarms from other sources), which represent a specific class of events
              •   Logs of systems and network components
              •   Performance metrics, provided by various tools
              •   Manual logs based on observations
              The middle part of Figure 4.5.11 can be detailed further. The collected data are unified and converted
            into a common format (Figure 4.5.12). No complex processing is expected here. The output is a
            special table (see Table 4.5.1) with a number of different events that are utilized to supervise SLAs. In the
            first step, separate tables are generated for each service provider.
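              The unification step described above might look like the following Python sketch. It is not taken
            from the handbook: the Event record, the input field names ("provider", "opened", "title", "time",
            "oid"), and the helper functions are all assumptions chosen for illustration. Each source-specific
            record is mapped onto one common shape, and one table is then built per service provider.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Common record shape for all data sources (hypothetical layout)."""
    provider: str   # service provider the event belongs to
    source: str     # e.g. "trouble_ticket" or "snmp_trap"
    timestamp: str  # ISO 8601 text; assumes one consistent format and time zone
    summary: str


def from_ticket(ticket: dict) -> Event:
    """Map a trouble-ticket record (assumed keys) onto the common shape."""
    return Event(ticket["provider"], "trouble_ticket",
                 ticket["opened"], ticket["title"])


def from_trap(trap: dict) -> Event:
    """Map an SNMP trap record (assumed keys) onto the common shape."""
    return Event(trap["provider"], "snmp_trap", trap["time"], trap["oid"])


def build_event_tables(events):
    """Group unified events into one time-ordered table per provider."""
    tables = defaultdict(list)
    for event in events:
        tables[event.provider].append(event)
    for rows in tables.values():
        # Identically formatted ISO 8601 strings sort chronologically.
        rows.sort(key=lambda e: e.timestamp)
    return dict(tables)
```

              No aggregation or correlation happens here, in line with the remark that no complex processing is
            expected at this stage.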
              Table 4.5.1 is the basis for triggering escalation steps. Alarms are derived from events. The
            following hierarchy applies:

              •   Managed objects
              •   Status notification
              •   Filtering processes
              •   Generation of alarms
              •   Generation of notifications
              •   Distribution of alarms and notifications
              Problems must be classified first, however. There are usually three classes:

              •   Critical problems
              •   Principal problems
              •   Noncritical problems
              The peering agreement includes clarification and unification of these problem classes and their
            actual content.
              Escalation steps must be clearly defined for each problem class in advance. The preparation may look
            like the following:

              •   Definition of emergencies: An emergency exists when critical problems occur or when multiple
                 problems occur in a certain combination. In the second case, not all of the problems need to be
                 critical.
              •   Determination of the escalation layers: The number of layers depends on the organization of the
                 participating service providers and operators. Usually, two layers are defined for the customers
                 and one for each service provider and operator.
              •   Identification of persons: For each layer, subject matter experts are named; their sites and
                 contact numbers are also identified.
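              The definition of an emergency given in the list above can be expressed as a small check. The
            following Python sketch is an assumption-laden illustration, not the handbook's method: the Problem
            record, the "kind" labels, and the idea of describing a "certain combination" as a set of problem
            kinds are all hypothetical. The three problem classes are the ones named in the text.

```python
from dataclasses import dataclass
from enum import Enum


class ProblemClass(Enum):
    CRITICAL = "critical"
    PRINCIPAL = "principal"
    NONCRITICAL = "noncritical"


@dataclass(frozen=True)
class Problem:
    kind: str                    # hypothetical label, e.g. "backup_path_down"
    problem_class: ProblemClass


def is_emergency(problems, combinations) -> bool:
    """Return True if the open problems amount to an emergency.

    Rule 1: any critical problem is an emergency.
    Rule 2: a predefined combination of problem kinds is an emergency
            even if none of the problems involved is critical.
    `combinations` is an iterable of sets of problem kinds (assumed form).
    """
    if any(p.problem_class is ProblemClass.CRITICAL for p in problems):
        return True
    active_kinds = {p.kind for p in problems}
    # Skip empty combinations, which would otherwise always match.
    return any(combo and combo <= active_kinds for combo in combinations)


# Hypothetical combination agreed on in advance
combinations = [frozenset({"link_degraded", "backup_path_down"})]
open_problems = [
    Problem("link_degraded", ProblemClass.PRINCIPAL),
    Problem("backup_path_down", ProblemClass.NONCRITICAL),
]
emergency = is_emergency(open_problems, combinations)
```

              In this made-up case neither problem is critical, yet together they match the agreed combination,
            so the check reports an emergency.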