PROJECT OVERVIEW

AI-Infused Vulnerability Management Leveraging Databricks


When working through large volumes of network data, it is common to rely on tools such as Nmap, which can write its scan results as XML reports. In security audits, penetration testers typically face hundreds, or even thousands, of assets early in an engagement. With AI/ML, we can replace the manual work of sifting through mountains of entries in a network security assessment and automatically surface the most valuable targets.
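As an illustration of what such a report contains, the following is a minimal sketch of flattening Nmap XML into one record per open port, using only the Python standard library. The sample document is hand-written for this example, and the function name `parse_nmap_xml` and the output field names are assumptions, not part of the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample in the shape of `nmap -oX` output.
SAMPLE_XML = """<nmaprun scanner="nmap">
  <host>
    <status state="up"/>
    <address addr="10.0.0.5" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh"/>
      </port>
      <port protocol="tcp" portid="80">
        <state state="closed"/>
        <service name="http"/>
      </port>
    </ports>
  </host>
</nmaprun>
"""


def parse_nmap_xml(xml_text):
    """Return one flat dict per open port found in an Nmap XML report."""
    rows = []
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        addr_el = host.find("address")
        addr = addr_el.get("addr") if addr_el is not None else None
        for port in host.iter("port"):
            state_el = port.find("state")
            # Skip ports that are missing a state or are not open.
            if state_el is None or state_el.get("state") != "open":
                continue
            service_el = port.find("service")
            rows.append({
                "host": addr,
                "protocol": port.get("protocol"),
                "port": int(port.get("portid")),
                "service": service_el.get("name") if service_el is not None else None,
            })
    return rows


rows = parse_nmap_xml(SAMPLE_XML)
print(rows)
```

Flat records like these are easier to load into a tabular engine than the nested XML, which is why a parsing step of this kind usually comes first.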

TECHNOLOGY
Databricks delivers a unified analytics platform powered by Apache Spark. It removes the difficulty and complexity of setting up a Spark cluster and, through its integration with the major cloud providers, offers a seamless, zero-management Spark experience. Apache Spark is an open-source cluster-computing framework: a scalable, massively parallel, in-memory execution environment for running analytics applications. Python and Scala are supported, and a single notebook can mix both.

METHODOLOGY
Unsupervised AI/ML anomaly detection techniques find anomalies in an unlabeled test data set under the assumption that the majority of its instances are normal. They do so by looking for the instances that fit least well with the remainder of the data set.
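To make the idea concrete, here is a minimal sketch of one simple unsupervised technique, a k-nearest-neighbour distance score, written in plain Python. It is not necessarily the model used in this project. The feature tuples and the function name `knn_anomaly_scores` are hypothetical and chosen only for illustration.

```python
import math


def knn_anomaly_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbours.

    A higher score means the point fits less well with the rest of the data.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical per-host features: (open port count, share of uncommon ports).
hosts = [(3, 0.0), (4, 0.1), (3, 0.1), (4, 0.0), (25, 0.9)]

scores = knn_anomaly_scores(hosts, k=2)
most_anomalous = max(range(len(hosts)), key=scores.__getitem__)
print(most_anomalous, scores)
```

No labels are needed: the last host stands out only because it sits far from the cluster formed by the others. In practice the features would likely need scaling first, so that a wide-ranging feature such as port count does not dominate the distance.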

PROCESS
An ETL pipeline for asset detection: a raw-data parser that converts semi-structured data into a structured format, followed by data preprocessing and refinement and by feature engineering. This is followed by ML model engineering and training, and by MLOps (deploying and maintaining machine learning models in production).
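The feature engineering step can be sketched as an aggregation from per-port records to one feature row per host. This is only an assumption about what such features might look like. The `COMMON_PORTS` set, the feature names, and `build_host_features` are hypothetical, and a production pipeline on Databricks would more likely express the same grouping with Spark DataFrames.

```python
from collections import defaultdict

# Assumption for illustration: ports treated as "expected" on a typical host.
COMMON_PORTS = {22, 80, 443}


def build_host_features(rows):
    """Aggregate structured per-port rows into one feature dict per host."""
    ports_by_host = defaultdict(set)
    for row in rows:
        ports_by_host[row["host"]].add(row["port"])

    features = {}
    for host, ports in ports_by_host.items():
        uncommon = len(ports - COMMON_PORTS)
        features[host] = {
            "open_ports": len(ports),
            "uncommon_ratio": uncommon / len(ports),
        }
    return features


# Hypothetical structured rows, as a parser stage might emit them.
rows = [
    {"host": "10.0.0.5", "port": 22},
    {"host": "10.0.0.5", "port": 80},
    {"host": "10.0.0.9", "port": 22},
    {"host": "10.0.0.9", "port": 8081},
]

features = build_host_features(rows)
print(features)
```

Numeric rows of this kind are what an unsupervised model would consume in the training stage.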
