DBotTrainClustering

This script is part of the Base Pack.

Supported versions

Supported Cortex XSOAR versions: 6.2.0 and later.

This script organizes and groups incidents by similarity using clustering algorithms. Clustering is a technique that groups data points (in this case, incidents) that are similar to each other into clusters. Use the script to automatically categorize a large number of incidents into meaningful groups.
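To make the idea of grouping by similarity concrete, here is a minimal sketch in plain Python. It is an illustration only and is not the algorithm this script uses: the helper names (`ngrams`, `jaccard`, `greedy_cluster`), the Jaccard similarity measure, the greedy grouping strategy, and the 0.4 threshold are all assumptions chosen for brevity. The `analyzer` parameter mirrors the "char" and "word" values of the script's `analyzer` argument.

```python
def ngrams(text: str, analyzer: str = "char", n: int = 3) -> set:
    """Return the set of character or word n-grams of a text."""
    text = text.lower()
    if analyzer == "char":
        return {text[i:i + n] for i in range(len(text) - n + 1)}
    if analyzer == "word":
        tokens = text.split()
        return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
    raise ValueError('analyzer must be "char" or "word"')


def jaccard(a: set, b: set) -> float:
    """Share of n-grams the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def greedy_cluster(texts, threshold=0.4, analyzer="char", n=3):
    """Assign each text to the first cluster whose first member is similar enough.

    Hypothetical strategy for illustration; real clustering models are more robust.
    """
    clusters = []  # each cluster is a list of indices into `texts`
    for idx, text in enumerate(texts):
        features = ngrams(text, analyzer, n)
        for cluster in clusters:
            representative = ngrams(texts[cluster[0]], analyzer, n)
            if jaccard(features, representative) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])  # no similar cluster found: start a new one
    return clusters


# Made-up incident names, used only to exercise the sketch.
incident_names = [
    "Phishing email from bank",
    "Phishing email from bank account",
    "Malware detected on host",
    "Malware detected on server",
]
print(greedy_cluster(incident_names))
```

With these sample names, the two phishing incidents land in one group and the two malware incidents in another, because each pair shares most of its character trigrams while the pairs share none with each other.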

Script Data

| Name | Description |
| --- | --- |
| Script Type | python3 |
| Tags | ml |
| Cortex XSOAR Version | 6.2.0 |

Inputs

| Argument Name | Description |
| --- | --- |
| fieldsForClustering | Comma-separated list of incident fields to take into account when training the clustering model. |
| fieldForClusterName | Incident field that represents the family name for each cluster created. The model counts how many incidents in the cluster share each value of this field. The value shared by the largest number of incidents becomes the cluster name. |
| fromDate | The start date by which to filter incidents. The date format is the same as on the incidents query page, for example, "3 days ago" or "2019-01-01T00:00:00 +0200". |
| toDate | The end date by which to filter incidents. The date format is the same as on the incidents query page, for example, "3 days ago" or "2019-01-01T00:00:00 +0200". |
| limit | The maximum number of incidents to query. |
| query | Query used to filter the incidents. |
| minNumberofIncidentPerCluster | Minimum number of incidents a cluster must contain to be retained. |
| modelName | Name of the model. |
| storeModel | Whether to store the model in the system. |
| minHomogeneityCluster | Keep samples in the cluster when the family ratio is above this number. Takes effect only if fieldForClusterName is given. |
| overrideExistingModel | Whether to override the existing model if a model with the same name exists. Default is "False". |
| type | Type of incident to train the model on. If empty, all types are considered. |
| maxRatioOfMissingValue | If the ratio of missing values in a field is higher than this number, the field is removed. |
| debug | Whether to return more information about the clustering. Default is "False". |
| forceRetrain | Whether to retrain the model in all cases. Default is "False". |
| modelExpiration | Period of time (in hours) before retraining the model. Default is "24". |
| modelHidden | Whether to hide the model in the ML page. |
| searchQuery | Search query input from the dashboard. |
| fieldsToDisplay | Comma-separated list of additional incident fields to display. These fields are not taken into account when computing similarity. |
| numberOfFeaturesPerField | Number of features per field. |
| analyzer | Whether the features should be made of word or character n-grams. Possible values: "char" and "word". |
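The interaction between fieldForClusterName, minNumberofIncidentPerCluster, and minHomogeneityCluster can be sketched as follows. This is a hedged illustration, not the script's implementation: the function `name_cluster` is hypothetical, and it assumes that the "family ratio" is the share of incidents in the cluster that carry the most common fieldForClusterName value, that missing values count toward the cluster size, and that "above" means strictly greater than the threshold.

```python
from collections import Counter


def name_cluster(values, min_incidents=1, min_homogeneity=0.0):
    """Pick a cluster name by majority vote, or return None if the cluster is dropped.

    `values` holds the fieldForClusterName value of every incident in one
    cluster (None where the field is missing). All rules here are assumptions.
    """
    # Assumed minNumberofIncidentPerCluster rule: drop clusters that are too small.
    if len(values) < min_incidents:
        return None

    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None  # no incident carries a family value

    # The value shared by the largest number of incidents names the cluster.
    name, top_count = counts.most_common(1)[0]

    # Assumed minHomogeneityCluster rule: keep only when the ratio is above the threshold.
    ratio = top_count / len(values)
    if ratio <= min_homogeneity:
        return None
    return name, ratio


# Made-up family values for one cluster of four incidents.
cluster_values = ["Phishing", "Phishing", "Malware", None]
print(name_cluster(cluster_values, min_incidents=2, min_homogeneity=0.4))
```

Here two of the four incidents share the value "Phishing", so the sketch names the cluster "Phishing" with a ratio of 0.5. Raising the threshold to 0.6 would drop the cluster instead.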

Outputs

| Path | Description | Type |
| --- | --- | --- |
| DBotTrainClustering | The clustering data in JSON format. | String |