Advanced analytics

This section introduces some example use cases of the Airlock Anomaly Shield analytics tool.

 
Example

For a complete list of the capabilities of the airlock-ml-analytics tool, use --help:

 
Terminal box
cd /opt/airlock/ml-service/bin 
./airlock-ml-analytics --help
 
Info

Also, check out Log messages and actions to examine log messages created by Airlock Anomaly Shield.

Check trained ML models

  1. To verify that the models for a given application are trained, use the following command:

    Terminal box
    cd /opt/airlock/ml-service/bin
    ./airlock-ml-analytics mod_info

  2. The mod_info for the collected data is displayed:

    Terminal box
    {"aid": "BookShop", "indicator_count": 6, "indicators": ["ConnectionMetrics", "GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", "TimingCluster"]}
    {"aid": "OtherApp", "indicator_count": 6, "indicators": ["ConnectionMetrics", "GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", "TimingCluster"]}

  3. Each line shows one trained application and the models trained for it.

    Notice

    The indicator_count should always be 6. If it is not, some machine learning models could not be trained, most likely due to a lack of sufficiently diverse data.
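
The check described in the notice can be scripted. The following is a minimal Python sketch and not part of the Airlock tooling. It assumes that mod_info prints one JSON object per line, as in the sample output above. The function name find_incomplete_apps, the constant EXPECTED_INDICATOR_COUNT, and the "DemoApp" line are hypothetical and only serve as illustration.

```python
import json

# Assumption: six indicators are expected per application, as stated in the notice.
EXPECTED_INDICATOR_COUNT = 6


def find_incomplete_apps(lines):
    """Return {aid: indicator_count} for applications whose count differs from the expected value."""
    incomplete = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record["indicator_count"] != EXPECTED_INDICATOR_COUNT:
            incomplete[record["aid"]] = record["indicator_count"]
    return incomplete


# First line: taken from the sample output. Second line: hypothetical, for illustration only.
sample_lines = [
    '{"aid": "BookShop", "indicator_count": 6, "indicators": ["ConnectionMetrics", '
    '"GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", "TimingCluster"]}',
    '{"aid": "DemoApp", "indicator_count": 4, "indicators": ["ConnectionMetrics", '
    '"IsolationForest", "StatusCodeMeta", "TimingCluster"]}',
]

for aid, count in find_incomplete_apps(sample_lines).items():
    print(f"{aid}: {count} of {EXPECTED_INDICATOR_COUNT} models trained")
```

In practice, the lines would come from the saved output of the mod_info command rather than from a hard-coded list.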

Find sessions that produce high anomaly ratings

Before enabling enforcement rules, it is useful to check which session metrics would result in a high anomaly rating (e.g. "bitcount": ">= 3").

  1. To search for sessions in our BookShop example that have >= 3 active anomaly indicators, run:

    Terminal box
    cd /opt/airlock/ml-service/bin
    ./airlock-ml-analytics --application "BookShop" --start "2022-06-01" --end "2022-06-07" apply_models --query '{"bitcount": ">= 3"}' --output pattern bitcount

    A note on the above request: It is always advisable to restrict the time frame and select an application to reduce the data processing time.

  2. The output will look similar to this:

    Terminal box
    ...
    {"aid": "BookShop", "sid": "1b7f90241ac001f10603cd2df0ea6d80", "pattern": {"ConnectionMetrics": 1, "GraphMetricsCluster": 0, "IsolationForest": 0, "MultipleCountries": 1, "StatusCodeMeta": 1, "TimingCluster": 0}}
    ...

  3. From here, you may want to analyze the suspicious session using Logviewer.
    • Use the session ID (the sid value in the output, sess_id in the log data) in a Kibana search for a detailed view.
    • Alternatively, use the session ID (sess_id) with the airlock-elasticsearch-query tool for a quick view.
      The query may look as follows:

      Terminal box
      airlock-elasticsearch-query --query 'log_id: WR-SG-SUMMARY AND sess_id: 1b7f90241ac001f10603cd2df0ea6d80' --fields @timestamp http_method entry_path http_status
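
The hand-over from the apply_models output to the log query can be sketched in a few lines of Python. This is an illustrative sketch and not part of the Airlock tooling. It assumes one JSON object per output line, as in the sample above, and the helper names active_indicators and build_query are hypothetical.

```python
import json


def active_indicators(record):
    """Return the sorted names of all indicators that are set to 1 in the session's pattern."""
    return sorted(name for name, flag in record["pattern"].items() if flag == 1)


def build_query(record):
    """Build the query string for airlock-elasticsearch-query from the session ID (sid)."""
    return f"log_id: WR-SG-SUMMARY AND sess_id: {record['sid']}"


# Output line taken from the sample above.
line = (
    '{"aid": "BookShop", "sid": "1b7f90241ac001f10603cd2df0ea6d80", "pattern": '
    '{"ConnectionMetrics": 1, "GraphMetricsCluster": 0, "IsolationForest": 0, '
    '"MultipleCountries": 1, "StatusCodeMeta": 1, "TimingCluster": 0}}'
)

record = json.loads(line)
print(active_indicators(record))
print(build_query(record))
```

For the sample session, this lists ConnectionMetrics, MultipleCountries, and StatusCodeMeta as active indicators and produces the same query string as shown in the step above.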
 
Info

The apply_models subcommand is quite versatile. Consult the tool's help to see its full capabilities:

 
Terminal box
./airlock-ml-analytics apply_models --help

Read out the distribution of anomaly patterns in percent

Use the sess_stat subcommand to list the anomaly patterns per application together with their share of all counted sessions, sorted by ratio.

  1. In our example, we also limit the query to a time range to speed up the computation:

    Terminal box
    cd /opt/airlock/ml-service/bin
    ./airlock-ml-analytics --application "BookShop" --start "2022-06-01" --end "2022-06-07" sess_stat

  2. The output will look similar to this:

    Terminal box
    pattern  bitcount  count     ratio 
    ... 
    4  {'ConnectionMetrics': 1, 'GraphMetricsCluster': 1, 'IsolationForest': 0, 'MultipleCountries': 1, 'StatusCodeMeta': 1, 'TimingCluster': 0}         5      4  0.003980 
    5  {'ConnectionMetrics': 0, 'GraphMetricsCluster': 0, 'IsolationForest': 0, 'MultipleCountries': 0, 'StatusCodeMeta': 1, 'TimingCluster': 0}         1      5  0.004975 
    6  {'ConnectionMetrics': 0, 'GraphMetricsCluster': 0, 'IsolationForest': 0, 'MultipleCountries': 0, 'StatusCodeMeta': 0, 'TimingCluster': 0}         0    986  0.981095 
    

How to read the above output:

  • Line number 4:
    4 sessions show a pattern with a bit count of 5, which corresponds to about 0.40% of all counted sessions. This may be a pattern that could be used in an enforcement rule.

    Notice

    Sessions with a bitcount higher than 2 should be analyzed thoroughly. See Find sessions that produce high anomaly ratings.

  • Line number 5:
    5 sessions show a pattern with a bit count of 1, which corresponds to about 0.50% of all counted sessions. This pattern most likely does not require further investigation or an enforcement rule.
  • Line number 6:
    986 sessions show a pattern with a bit count of 0, which corresponds to about 98.1% of all counted sessions. This pattern does not show any anomalies at all.
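
The percentages in this reading follow directly from the ratio column. The Python sketch below works through the arithmetic for the three rows shown. It is illustrative only: the helper names ratio_to_percent and estimated_total are hypothetical, and the total number of counted sessions is an estimate derived from count / ratio, because the sample output elides some rows.

```python
# (bitcount, count, ratio) copied from the three rows of the sample output.
rows = [
    (5, 4, 0.003980),
    (1, 5, 0.004975),
    (0, 986, 0.981095),
]


def ratio_to_percent(ratio):
    """Format a ratio between 0 and 1 as a percentage rounded to two decimals."""
    return f"{ratio * 100:.2f}%"


def estimated_total(count, ratio):
    """Estimate the number of counted sessions from one row (count / ratio)."""
    return round(count / ratio)


for bitcount, count, ratio in rows:
    print(
        f"bitcount={bitcount}: {count} sessions = {ratio_to_percent(ratio)} "
        f"of about {estimated_total(count, ratio)} counted sessions"
    )
```

All three rows lead to the same estimated total, which is a quick consistency check that count and ratio refer to the same set of sessions.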