Using the analytics tool for advanced analytics

In this section, we introduce some example use cases of the Airlock Anomaly Shield analytics tool.

For a complete list of the capabilities of the airlock-ml-analytics tool, use --help:

cd /opt/airlock/ml-service/bin 
./airlock-ml-analytics --help

See also Log messages and actions of Airlock Anomaly Shield for details on the log messages created by Airlock Anomaly Shield.

Check trained ML models

  1. To verify that the models for a given application are trained, use the following command:
    cd /opt/airlock/ml-service/bin 
    ./airlock-ml-analytics mod_info
    
  2. A list with mod_info of the collected data will be displayed:
    {"aid": "BookShop", "indicator_count": 6, "indicators": ["ConnectionMetrics", "GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", "TimingCluster"]}
    {"aid": "OtherApp", "indicator_count": 6, "indicators": ["ConnectionMetrics", "GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", "TimingCluster"]}
    

    Each line shows one trained application and the models trained for that application.

    The indicator_count should always be 6. If it is not, some machine learning models failed to train, most likely because the collected data was not diverse enough.
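The indicator_count check described above can also be automated. The following is a minimal sketch, assuming the mod_info output consists of one JSON object per line as in the sample; the function name incomplete_apps and the "DemoApp" record are hypothetical and only serve as illustration.

```python
import json

# Value that the text above says indicator_count should always have.
EXPECTED_INDICATORS = 6


def incomplete_apps(mod_info_output, expected=EXPECTED_INDICATORS):
    """Return {aid: indicator_count} for every application whose
    indicator_count differs from the expected value."""
    problems = {}
    for line in mod_info_output.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        if record["indicator_count"] != expected:
            problems[record["aid"]] = record["indicator_count"]
    return problems


# First line: taken from the sample output above.
# Second line: hypothetical record, added to show a failing case.
sample = """
{"aid": "BookShop", "indicator_count": 6, "indicators": ["ConnectionMetrics", "GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", "TimingCluster"]}
{"aid": "DemoApp", "indicator_count": 4, "indicators": ["ConnectionMetrics", "IsolationForest", "StatusCodeMeta", "TimingCluster"]}
"""

print(incomplete_apps(sample))
```

An empty result means that every listed application reports the expected number of trained models.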

Find sessions that produce high anomaly ratings

It might be interesting to check which session metrics would result in a high anomaly rating (e.g. "bitcount": ">= 3") before enabling enforcement rules.

  1. To search for sessions in our BookShop example that have >= 3 active anomaly indicators, we run:
    cd /opt/airlock/ml-service/bin 
    ./airlock-ml-analytics --application "BookShop" --start "2022-06-01" --end "2022-06-07" apply_models --query '{"bitcount": ">= 3"}' --output pattern bitcount

    A note on the above request: It is always advisable to restrict the time frame and select an application to reduce the data processing time.

  2. The output will be something like this:
    ... 
    {"aid": "BookShop", "sid": "1b7f90241ac001f10603cd2df0ea6d80", "pattern": {"ConnectionMetrics": 1, "GraphMetricsCluster": 0, "IsolationForest": 0, "MultipleCountries": 1, "StatusCodeMeta": 1, "TimingCluster": 0}} 
    ...
  3. From here, you may want to analyze the suspicious session using Logviewer.
    • Use the session ID (the sid value in the output above, searchable as sess_id in the logs) for a Kibana search for a detailed view.
    • Alternatively: Use the session ID (sess_id) with the airlock-elasticsearch-query tool for a quick view.
      The query may look as follows:
      airlock-elasticsearch-query --query 'log_id: WR-SG-SUMMARY AND sess_id: 1b7f90241ac001f10603cd2df0ea6d80' --fields @timestamp http_method entry_path http_status
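
The steps above can be chained in a small script. The following is a minimal sketch, assuming the apply_models output contains one JSON object per line as in the sample and that the bitcount equals the number of indicators set to 1 in the pattern; the helper names bitcount, suspicious_sessions and es_query are hypothetical.

```python
import json


def bitcount(pattern):
    """Count the indicators that are active (non-zero) in a pattern.
    Assumption: this matches the tool's notion of bitcount."""
    return sum(1 for value in pattern.values() if value)


def suspicious_sessions(apply_models_output, min_bitcount=3):
    """Yield (sid, bitcount) for each session with enough active indicators."""
    for line in apply_models_output.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank, elided ("...") or other non-JSON lines
        record = json.loads(line)
        count = bitcount(record["pattern"])
        if count >= min_bitcount:
            yield record["sid"], count


def es_query(sid):
    """Build the query string in the form used in the example above."""
    return f"log_id: WR-SG-SUMMARY AND sess_id: {sid}"


# Sample taken from the output shown above.
sample = """
...
{"aid": "BookShop", "sid": "1b7f90241ac001f10603cd2df0ea6d80", "pattern": {"ConnectionMetrics": 1, "GraphMetricsCluster": 0, "IsolationForest": 0, "MultipleCountries": 1, "StatusCodeMeta": 1, "TimingCluster": 0}}
...
"""

for sid, count in suspicious_sessions(sample):
    print(count, es_query(sid))
```

The printed string can be passed to the --query option of airlock-elasticsearch-query as in the command above.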

The apply_models subcommand is quite versatile. Consult the tool's help to see its full capabilities:

./airlock-ml-analytics apply_models --help

Read out the distribution ratio of anomaly patterns in percent

Use the sess_stat subcommand to generate a list of anomaly patterns per application, sorted by their share of all counted sessions.

  1. In our example, we again restrict the time range to speed up the computation:
    cd /opt/airlock/ml-service/bin 
    ./airlock-ml-analytics --application "BookShop" --start "2022-06-01" --end "2022-06-07" sess_stat
    
  2. The output will be something like this:
    pattern  bitcount  count     ratio 
    ... 
    4  {'ConnectionMetrics': 1, 'GraphMetricsCluster': 1, 'IsolationForest': 0, 'MultipleCountries': 1, 'StatusCodeMeta': 1, 'TimingCluster': 0}         5      4  0.003980 
    5  {'ConnectionMetrics': 0, 'GraphMetricsCluster': 0, 'IsolationForest': 0, 'MultipleCountries': 0, 'StatusCodeMeta': 1, 'TimingCluster': 0}         1      5  0.004975 
    6  {'ConnectionMetrics': 0, 'GraphMetricsCluster': 0, 'IsolationForest': 0, 'MultipleCountries': 0, 'StatusCodeMeta': 0, 'TimingCluster': 0}         0    986  0.981095 
    
  How to read the above output:
  • Line number 4:
    4 sessions are reported with a pattern bit count of 5, which corresponds to about 0.4% of all counted sessions. This may be a pattern that could be used in an enforcement rule.
    Sessions with a bitcount higher than 2 should be analyzed thoroughly. See Find sessions that produce high anomaly ratings.
  • Line number 5:
    5 sessions are reported with a pattern bit count of 1, which corresponds to about 0.5% of all counted sessions. This pattern most likely does not require further investigation or an enforcement rule.
  • Line number 6:
    986 sessions are reported with a pattern bit count of 0, which corresponds to about 98.1% of all counted sessions. These sessions do not show any anomaly at all.
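
The reading of the ratio column can be reproduced with a few lines of code. The following is a minimal sketch that uses the bitcount, count and ratio values of the three rows shown above; the function name describe and the review threshold parameter are hypothetical, with the threshold following the advice that sessions with a bitcount higher than 2 should be analyzed.

```python
# (bitcount, count, ratio) copied from the three rows of the sample output above.
ROWS = [
    (5, 4, 0.003980),
    (1, 5, 0.004975),
    (0, 986, 0.981095),
]


def describe(bitcount, count, ratio, review_threshold=2):
    """Turn one sess_stat row into a readable sentence.
    The ratio column is a fraction, so it is multiplied by 100 for percent."""
    percent = ratio * 100
    flag = "review" if bitcount > review_threshold else "ok"
    return (f"{count} sessions, bitcount {bitcount}, "
            f"{percent:.2f}% of all sessions [{flag}]")


for row in ROWS:
    print(describe(*row))
```

Only the first row is flagged for review, because it is the only one with a bitcount above the threshold of 2.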