Advanced analytics
This section introduces some example use cases of the Airlock Anomaly Shield analytics tool.
For a complete list of the airlock-ml-analytics tool's capabilities, use --help.
Also, check out Log messages and actions to examine log messages created by Airlock Anomaly Shield.
Check trained ML models
- To verify that the models for a given application are trained, use the following command:
- Terminal box
cd /opt/airlock/ml-service/bin
./airlock-ml-analytics mod_info
- A list with the model information (mod_info) of the collected data is displayed:
- Terminal box
{"aid": "BookShop", "indicator_count": 6, "indicators": ["ConnectionMetrics", "GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", "TimingCluster"]}
{"aid": "OtherApp", "indicator_count": 6, "indicators": ["ConnectionMetrics", "GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", "TimingCluster"]}
Each line shows one trained application together with the models trained for it.
- Notice
The indicator_count should always be 6. If this is not the case, some machine learning models could not be trained, most likely because the collected data was not sufficiently diverse.
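The check described in the notice can be scripted. The following is a minimal sketch, assuming that mod_info prints one JSON object per line as in the example above. The set of expected indicator names is copied from that example, and the helper name find_incomplete_apps is hypothetical, not part of the Airlock tooling.

```python
import json

# Indicator names copied from the example mod_info output
# (assumption: this is the complete set of six models).
EXPECTED_INDICATORS = {
    "ConnectionMetrics",
    "GraphMetricsCluster",
    "IsolationForest",
    "MultipleCountries",
    "StatusCodeMeta",
    "TimingCluster",
}

# Sample input copied from the example output.
SAMPLE_OUTPUT = (
    '{"aid": "BookShop", "indicator_count": 6, "indicators": ["ConnectionMetrics", '
    '"GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", '
    '"TimingCluster"]}\n'
    '{"aid": "OtherApp", "indicator_count": 6, "indicators": ["ConnectionMetrics", '
    '"GraphMetricsCluster", "IsolationForest", "MultipleCountries", "StatusCodeMeta", '
    '"TimingCluster"]}\n'
)


def find_incomplete_apps(mod_info_output):
    """Return {aid: sorted list of missing indicators} for incompletely trained apps."""
    incomplete = {}
    for line in mod_info_output.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        missing = EXPECTED_INDICATORS - set(record.get("indicators", []))
        if missing or record.get("indicator_count") != len(EXPECTED_INDICATORS):
            incomplete[record["aid"]] = sorted(missing)
    return incomplete


# Hypothetical usage: report every application with missing models.
result = find_incomplete_apps(SAMPLE_OUTPUT)
if result:
    for aid, missing in result.items():
        print(f"{aid}: missing models: {', '.join(missing) or 'unknown'}")
else:
    print("All applications have the expected models.")
```

An application that appears in the result needs more training data before its anomaly ratings can be relied on.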
Find sessions that produce high anomaly ratings
Before enabling enforcement rules, it can be useful to check which session metrics would result in a high anomaly rating (e.g. "bitcount": ">= 3").
- To search for sessions in our BookShop example that have >= 3 active anomaly indicators, we run:
- Terminal box
cd /opt/airlock/ml-service/bin
./airlock-ml-analytics --application "BookShop" --start "2022-06-01" --end "2022-06-07" apply_models --query '{"bitcount": ">= 3"}' --output pattern bitcount
A note on the above command: it is always advisable to restrict the time frame and select an application to reduce the data processing time.
- The output will be something like this:
- Terminal box
...
{"aid": "BookShop", "sid": "1b7f90241ac001f10603cd2df0ea6d80", "pattern": {"ConnectionMetrics": 1, "GraphMetricsCluster": 0, "IsolationForest": 0, "MultipleCountries": 1, "StatusCodeMeta": 1, "TimingCluster": 0}}
...
- From here, you may want to analyze the suspicious session using Logviewer.
- Use the session ID (sid in the output above, sess_id in the log messages) for a Kibana search for a detailed view.
- Alternatively: Use the session ID (sess_id) with the airlock-elasticsearch-query tool for a quick view.
The query may look as follows:
- Terminal box
airlock-elasticsearch-query --query 'log_id: WR-SG-SUMMARY AND sess_id: 1b7f90241ac001f10603cd2df0ea6d80' --fields @timestamp http_method entry_path http_status
The apply_models subcommand is quite versatile. Consult the tool's help to see its full capabilities:
./airlock-ml-analytics apply_models --help
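When many sessions are reported, the follow-up queries can be generated instead of typed by hand. The following is a minimal sketch, assuming the apply_models output contains one JSON record per line as in the example above. The function names parse_sessions and build_query_command are hypothetical. The generated command uses only the airlock-elasticsearch-query options shown in the example query.

```python
import json
import shlex

# Sample record copied from the example apply_models output.
SAMPLE_OUTPUT = """
...
{"aid": "BookShop", "sid": "1b7f90241ac001f10603cd2df0ea6d80", "pattern": {"ConnectionMetrics": 1, "GraphMetricsCluster": 0, "IsolationForest": 0, "MultipleCountries": 1, "StatusCodeMeta": 1, "TimingCluster": 0}}
...
"""


def parse_sessions(apply_models_output):
    """Return a list of (sid, active indicator names), one entry per JSON record."""
    sessions = []
    for line in apply_models_output.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and "..." placeholders
        record = json.loads(line)
        active = [name for name, flag in record["pattern"].items() if flag == 1]
        sessions.append((record["sid"], active))
    return sessions


def build_query_command(sid):
    """Build the example airlock-elasticsearch-query call for one session ID."""
    query = f"log_id: WR-SG-SUMMARY AND sess_id: {sid}"
    args = [
        "airlock-elasticsearch-query",
        "--query", query,
        "--fields", "@timestamp", "http_method", "entry_path", "http_status",
    ]
    # shlex.quote protects the query string, which contains spaces.
    return " ".join(shlex.quote(arg) for arg in args)


for sid, active in parse_sessions(SAMPLE_OUTPUT):
    print(f"{sid}: {', '.join(active)}")
    print(build_query_command(sid))
```

Printing the command rather than executing it keeps the sketch free of side effects. The printed line can be reviewed and then pasted into a terminal.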
Read out the distribution ratio of anomaly patterns in percent
Use the sess_stat subcommand to generate a list of anomaly patterns per application, sorted by their share of all counted sessions.
- In our example, we also limit the time range to speed up the computation:
- Terminal box
cd /opt/airlock/ml-service/bin
./airlock-ml-analytics --application "BookShop" --start "2022-06-01" --end "2022-06-07" sess_stat
- The output will be something like this:
- Terminal box
   pattern                                                                                                                                    bitcount  count     ratio
...
4  {'ConnectionMetrics': 1, 'GraphMetricsCluster': 1, 'IsolationForest': 0, 'MultipleCountries': 1, 'StatusCodeMeta': 1, 'TimingCluster': 0}         5      4  0.003980
5  {'ConnectionMetrics': 0, 'GraphMetricsCluster': 0, 'IsolationForest': 0, 'MultipleCountries': 0, 'StatusCodeMeta': 1, 'TimingCluster': 0}         1      5  0.004975
6  {'ConnectionMetrics': 0, 'GraphMetricsCluster': 0, 'IsolationForest': 0, 'MultipleCountries': 0, 'StatusCodeMeta': 0, 'TimingCluster': 0}         0    986  0.981095
How to read the above output:
- Line number 4: 4 sessions are reported with a pattern bit count of 5, which corresponds to about 0.40% of all counted sessions. This may be a pattern that could be used in an enforcement rule.
- Notice
Sessions with a bitcount higher than 2 should be analyzed thoroughly. See Find sessions that produce high anomaly ratings.
- Line number 5: 5 sessions are reported with a pattern bit count of 1, which corresponds to about 0.50% of all counted sessions. This pattern most likely does not require further investigation or an enforcement rule.
- Line number 6: 986 sessions are reported with a pattern bit count of 0, which corresponds to about 98.1% of all counted sessions. This pattern does not show any anomaly at all.
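The arithmetic behind this reading can be reproduced in a few lines. The following is a minimal sketch that converts the ratio column to percent and estimates the total number of counted sessions from a single row. The (bitcount, count, ratio) values are copied from lines 4 to 6 of the example output, and the helper names are hypothetical. It assumes that ratio equals count divided by the total number of counted sessions, which the three rows agree on.

```python
# (bitcount, count, ratio) copied from lines 4 to 6 of the example sess_stat output.
ROWS = [
    (5, 4, 0.003980),
    (1, 5, 0.004975),
    (0, 986, 0.981095),
]


def ratio_to_percent(ratio, digits=2):
    """Convert a ratio between 0 and 1 to a rounded percentage."""
    return round(ratio * 100, digits)


def estimate_total_sessions(count, ratio):
    """Estimate the total session count, assuming ratio = count / total."""
    return round(count / ratio)


for bitcount, count, ratio in ROWS:
    print(
        f"bitcount={bitcount} count={count} "
        f"percent={ratio_to_percent(ratio):.2f}% "
        f"estimated_total={estimate_total_sessions(count, ratio)}"
    )
```

All three rows yield the same estimated total. Because that total is larger than the sum of the three counts shown, the remaining sessions must belong to the rows elided by "..." in the example output.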