
AI-ML for Security Analytics

Leveraging Big Data & Security Analytics Platform

The security analytics platform is envisioned to handle data at the scale of petabytes, and it should be scalable. In this context, Elasticsearch and Hadoop can be used as the backend data lake. Elasticsearch can facilitate the correlation/alert rules, dashboards and analytics, whereas Hadoop can facilitate the machine learning analysis through additional tools like Python and Spark. The primary source of data to be ingested into the platform would be the logs generated by various devices, servers, endpoints, applications, websites and services. The logs may be collected from various sources across the Government ICT Infrastructure connected to NICNET, and the logs shall be processed and enriched with additional details (like geo-location, IP/domain reputation, etc.). The processed logs will then be analyzed on the analytics platform using various correlation and security rules. In addition to this, a machine learning model will also process the logs and will try to identify various anomalies and suspicious patterns in the logs. Multiple machine learning models may be integrated into the security analytics platform. Each ML model will have the capability to train and learn, whereby it attains a certain level of maturity over a period of time. Once the ML model attains that maturity level, it can spot much more advanced and complex attacks, which may not be spotted by the traditional rule-based SIEM platforms.

[Figure: AI-ML Models for Security Analytics]

AI-ML in Advanced Security Analytics and Threat Detection

AI-ML has become the buzzword in recent times. Most new technology products claim to leverage AI-ML in one way or the other. In spite of all the buzz and being touted as the next big thing in the technology evolution, the journey towards achieving successful results through AI-ML is an arduous one. Especially when it comes to cyber security and threat/attack detection, it would require billions of data events to train the model appropriately, so that it can achieve a certain degree of accuracy.

The classification model under supervised learning can be built around knowledge of known classifier objects such as IP addresses, domain names, network object interactions, and other data points, which are extracted from the logs. This can further be used to build various classification models, which can be tested and adopted based on classification accuracy and relevance. Unsupervised learning can be leveraged for better grouping of clusters, where various clustering algorithms need to be used to identify and quantify data relationships from the data and metadata extracted from the logs. Deep learning neural networks can be used for predicting anomalies in the data set gathered from various log sources. One of the key focus areas of the security analytics platform is to transform security detection from reactive to proactive, i.e., to aid in predictive analytics.

Features of the Security Analytics Platform

Some of the key features of the platform are as follows:
•  Central Aggregated Log Management Platform
•  Web and Security Analytics
•  Visibility on Attacker Activity
•  Detect/Predict Anomalies or Attacks at an early stage
•  Incident Response, Threat Hunting & Threat Intelligence
•  Facilitate troubleshooting of website/application issues
•  Dashboard & Reporting

Tactical Insights & Security Posture

From an ICT perspective, the logs of a system are literally a piece of recorded history of what happened on the system and when it happened. This information can be analyzed further to infer how the specific event happened and why it happened. Considering an organization like NIC, which hosts thousands of websites and applications, not to mention the lakhs of ICT devices spread across the country, the collection and aggregation of logs from these devices in itself poses a huge challenge. But if we overcome the challenge and are able to aggregate the data, then the insights that could be derived from the aggregated logs would be invaluable. Since it is practically impossible for a human to manually check and investigate each log event, this is where automated security analytics and machine learning come into the picture. Together, ML and security analytics can quickly sift through billions of log events and filter out those events which could be of interest from a security perspective. They could further extrapolate the relationship between a particular event and a whole plethora of other related data sets, thereby providing tactical insights essential for strengthening our cyber security posture.

The security analytics platform can also provide key web analytics on site traffic, visitor stats, suspicious hits, etc. The insights generated from one log source can also be correlated with another log source to check for any similarities. For example, an attacker who has attempted to hack into one state government website may again be found attempting to hack a central government ministry's website. This is where the analytics platform will try to inter-relate both the attacks based on various features and attributes, and further the model would try to learn the techniques adopted by the attacker for launching the attack. The learning would then be ingrained into the model, and it could train itself to detect similar attacks in the future.

Key Benefits of the Security Analytics Platform

The security analytics platform is powered by a massive data lake at the backend, which is essentially a repository of log data collected from various sources. The platform can be leveraged to ask various questions by querying the underlying data to get the necessary information. In addition to this, the platform can also offer the following key benefits:
•  Huge cost savings in the range of hundreds of crores, which would have been incurred in a corresponding commercial platform
•  Security incidents can be identified quickly and action can be taken before any major damage could be done
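
To make the enrichment and correlation-rule steps described under "Leveraging Big Data & Security Analytics Platform" more concrete, here is a minimal Python sketch. It is an illustration only and not the platform's actual implementation: the event fields (src_ip, action), the lookup tables, the function names and the threshold of three failed logins are all assumptions made for this example, and the IP addresses come from reserved documentation ranges.

```python
from collections import defaultdict

# Hypothetical lookup tables. A real deployment would query geo-location and
# reputation services instead of in-memory dictionaries.
GEO_BY_IP = {"203.0.113.7": "XX", "198.51.100.20": "YY"}
BAD_REPUTATION_IPS = {"203.0.113.7"}


def enrich(event):
    """Return a copy of the log event with geo and reputation fields added."""
    ip = event["src_ip"]
    enriched = dict(event)
    enriched["geo"] = GEO_BY_IP.get(ip, "unknown")
    enriched["bad_reputation"] = ip in BAD_REPUTATION_IPS
    return enriched


def failed_login_alerts(events, threshold=3):
    """Toy correlation rule: list source IPs with at least `threshold` failed logins."""
    counts = defaultdict(int)
    for event in events:
        if event["action"] == "login_failed":
            counts[event["src_ip"]] += 1
    return sorted(ip for ip, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    raw_events = [
        {"src_ip": "203.0.113.7", "action": "login_failed"},
        {"src_ip": "203.0.113.7", "action": "login_failed"},
        {"src_ip": "203.0.113.7", "action": "login_failed"},
        {"src_ip": "198.51.100.20", "action": "login_ok"},
    ]
    enriched_events = [enrich(e) for e in raw_events]
    print(failed_login_alerts(enriched_events))
```

In practice a rule like this would more likely be expressed as a query or alert definition in the search backend; the sketch only shows the shape of the logic.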

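
The cross-source correlation described in the example of one attacker probing a state government website and then a central ministry's website can be sketched in the same spirit. This is a simplified stand-in that matches on source IP alone, whereas the article describes correlating on "various features and attributes" and learning from them. The field names (source, src_ip, suspicious), the function name and the site names are hypothetical.

```python
from collections import defaultdict


def attackers_across_sources(events, min_sources=2):
    """Map each source IP to the log sources where it triggered suspicious events.

    Only IPs seen in at least `min_sources` distinct log sources are returned.
    """
    sources_by_ip = defaultdict(set)
    for event in events:
        if event["suspicious"]:
            sources_by_ip[event["src_ip"]].add(event["source"])
    return {
        ip: sorted(sources)
        for ip, sources in sources_by_ip.items()
        if len(sources) >= min_sources
    }


if __name__ == "__main__":
    sample = [
        {"source": "state-site", "src_ip": "203.0.113.7", "suspicious": True},
        {"source": "ministry-site", "src_ip": "203.0.113.7", "suspicious": True},
        {"source": "ministry-site", "src_ip": "198.51.100.20", "suspicious": True},
        {"source": "state-site", "src_ip": "192.0.2.1", "suspicious": False},
    ]
    print(attackers_across_sources(sample))
```

An IP that surfaces in several otherwise unrelated log sources is a natural candidate for closer investigation, and could also serve as a labelled example when training a model.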


                                                                                        October 2021  informatics.nic.in 23