Part 1 – Preconfiguring an Anomaly Shield application
To protect back-end groups and applications using Airlock Anomaly Shield, one or more Anomaly Shield applications must be configured. After initial setup, the machine learning models need to be trained with production traffic data to effectively detect anomalies and suspicious behavior.
Best practice is to start by configuring one Anomaly Shield application per mapping and observing traffic patterns. This approach helps to isolate behaviors and ensures better model accuracy:
- If a mapping handles several thousand sessions per week, continue using separate Anomaly Shield applications per mapping.
- If a mapping handles a few hundred sessions per day or week, you may consider combining similar mappings into a shared Anomaly Shield application to accumulate sufficient data.
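The rule of thumb above can be sketched as a small planning helper. This is only an illustration and not part of Airlock: the function name `plan_applications`, the mapping names, and the threshold of 3000 sessions per week (standing in for "several thousand") are assumptions. The sketch also does not judge whether low-traffic mappings are similar enough to share an application; that decision remains manual.

```python
from typing import Dict, List

# Assumed threshold standing in for "several thousand sessions per week".
SEPARATE_THRESHOLD_PER_WEEK = 3000


def plan_applications(weekly_sessions: Dict[str, int],
                      threshold: int = SEPARATE_THRESHOLD_PER_WEEK) -> Dict[str, List[str]]:
    """Split mappings into those that keep their own Anomaly Shield
    application and those that are candidates for a shared one."""
    plan: Dict[str, List[str]] = {"separate": [], "combine_candidates": []}
    for mapping, sessions in sorted(weekly_sessions.items()):
        if sessions >= threshold:
            plan["separate"].append(mapping)
        else:
            plan["combine_candidates"].append(mapping)
    return plan


# Hypothetical mapping names and weekly session counts.
example_plan = plan_applications(
    {"webshop": 12000, "hr-portal": 400, "intranet-wiki": 650}
)
print(example_plan)
```

Mappings listed under `combine_candidates` are only candidates: combine them into a shared Anomaly Shield application only if their traffic is similar.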
See also the supplementary article Assigning mappings to Anomaly Shield applications.
Preconfigure an Anomaly Shield application and assign it to a mapping
- Go to:
Application Firewall >> Anomaly Shield >> tab Applications
- Select the ON radio button to activate Airlock Anomaly Shield.
- Click the + button to add a new Anomaly Shield Application.
- The Anomaly Shield Application page opens up.
- Set an Application Name.
- The new Anomaly Shield application must be assigned to a mapping so that the Anomaly Shield application processes traffic on the mapping.
Go to:
Application Firewall >> Reverse Proxy
- Assign the Anomaly Shield application to each mapping that should be included in the same Anomaly Shield application. Select the corresponding Anomaly Shield application on the Basic tab of the mapping detail page.
- Proceed with enabling training data collection.
Enable training data collection
Collecting realistic training data is required as input for the Anomaly Shield machine learning model. As a rule, a minimum of 3000 sessions, including atypical and suspicious ones, provides a solid foundation for training the model. The most effective way to achieve this is to enable data collection and use the automatic retrain and enforce feature.
Note the following when collecting training data:
- Collect realistic production data. If required, filter out internal vulnerability scans using Traffic Matchers as Training Data Collection Exclusion.
- It is essential to collect the full range of sessions and traffic behavior that may occur in a typical calendar month. We recommend the automatic retrain and enforce option, which collects 5–6 weeks of continuous session data between training and model enforcement runs, covering working hours on weekdays, weekends, and both day and night traffic.
- Anomaly Shield works with session data but does not require authenticated sessions. Continue collecting session data until at least several thousand sessions have been saved.
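As a rough planning aid, the notes above can be turned into two small checks: how many weeks of collection are needed to reach the 3000-session rule of thumb, and whether a sample of session start times covers weekdays, weekends, day, and night. This is a sketch under stated assumptions, not an Airlock feature: the function names, the steady traffic rate, and the 06:00 to 18:00 definition of daytime are all hypothetical.

```python
import math
from datetime import datetime
from typing import Iterable

MIN_SESSIONS = 3000  # rule-of-thumb minimum from the text


def weeks_needed(sessions_per_week: float, min_sessions: int = MIN_SESSIONS) -> int:
    """Whole weeks of collection needed to reach min_sessions at a steady rate."""
    if sessions_per_week <= 0:
        raise ValueError("sessions_per_week must be positive")
    return math.ceil(min_sessions / sessions_per_week)


def covers_typical_traffic(session_starts: Iterable[datetime]) -> bool:
    """True if the sample contains weekday, weekend, daytime and night-time sessions.

    Daytime is assumed to be 06:00-18:00; adjust this to your own working hours.
    """
    seen = set()
    for start in session_starts:
        seen.add("weekend" if start.weekday() >= 5 else "weekday")
        seen.add("day" if 6 <= start.hour < 18 else "night")
    return seen == {"weekday", "weekend", "day", "night"}


# A mapping with about 700 sessions per week needs ceil(3000 / 700) weeks.
print(weeks_needed(700))
```

If the estimate is shorter than the recommended 5–6 weeks, the longer window still applies, because the goal is coverage of typical traffic patterns and not only the session count.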
Automatic retraining options:
- For continuous model improvement, we recommend the option Retrain and enforce. It automatically selects a period of typical training data within the last 3 months (see the scheduled Next training date). The steps below configure this option.
- The Retrain only option can be used in critical environments when model enforcement should not be performed automatically. In this case, the Next training date can be used to schedule the next manual enforcement date.
- We do not recommend turning off Automatic retraining. You may turn off the option if you need complete control over which training data is used for the Anomaly Shield model or if continuous improvement is undesirable.
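The choice between the three options can be summarized as a short decision helper. This is a sketch for illustration only: the enum, the function name `suggest_mode`, its two flags, and the label "Off" are assumptions and do not correspond to an Airlock API.

```python
from enum import Enum


class RetrainMode(Enum):
    RETRAIN_AND_ENFORCE = "Retrain and enforce"
    RETRAIN_ONLY = "Retrain only"
    OFF = "Off"  # assumed label for turning off automatic retraining


def suggest_mode(manual_enforcement_required: bool = False,
                 full_control_over_training_data: bool = False) -> RetrainMode:
    """Map the considerations from the list above to a retraining option."""
    if full_control_over_training_data:
        # Not recommended in general; only when training data must be hand-picked
        # or continuous improvement is undesirable.
        return RetrainMode.OFF
    if manual_enforcement_required:
        # Critical environments: retrain automatically, enforce manually.
        return RetrainMode.RETRAIN_ONLY
    # Recommended default for continuous model improvement.
    return RetrainMode.RETRAIN_AND_ENFORCE


print(suggest_mode().value)
```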
The training data is linked to the application name. Changing the name of an Anomaly Shield application therefore requires collecting new training data.
- Go to:
Application Firewall >> Anomaly Shield
- In the application list, click the button to manage the machine learning model of the application. The Anomaly Shield Model Management page opens up.
- In section Training Task, enable Retrain and enforce.
- Go back to the Applications page. The icon appears in the column Enforced Model, indicating that automatic retraining and enforcement is activated.
- Enable Data Collection in the second column by clicking the dot (enabled = green).
- Proceed with Part 2 – Training and model enforcement to see when the next automatic retraining is scheduled.