A data mining operator that trains a Support Vector Machine (SVM) for regression. The values in TheTargetAttribute serve as target function values; the training examples are formed from ThePredictingAttributes, all of which must belong to TheInputConcept. TheOutputAttribute contains the predicted values.

There are some SVM-specific parameters; the table below gives reasonable values to choose if nothing is known about the data or about SVMs. For the KernelType, only the following values (Strings) are possible: dot, polynomial, neural, radial, anova. The value dot selects the linear kernel and is a reasonable default.

This operator can use two different versions of the Support Vector Machine algorithm. One runs in main memory; it needs the parameter SampleSize to determine a maximum number of training examples. The other runs in the database; it is used if the optional parameter UseDB_SVM is set to the String true. When this version is used, an additional parameter TheKey is needed which gives the BaseAttribute whose column is the primary key of TheInputConcept. (TheKey can be left out only if the ColumnSet that belongs to TheInputConcept represents a table rather than a view.) The database algorithm restricts the possible kernel types to dot and radial. It can also use the parameter SampleSize.
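The constraints described above can be summarised as a small check. The following is only a sketch in Python, not part of the operator: the function name check_svm_parameters, the dictionary representation of the parameters, and the flag input_is_table are assumptions made for illustration.

```python
# Hypothetical helper: checks a parameter dictionary against the rules
# stated in the text. All values are Strings, as in the operator.
ALLOWED_KERNELS = {"dot", "polynomial", "neural", "radial", "anova"}
DB_KERNELS = {"dot", "radial"}  # the database version allows only these


def check_svm_parameters(params, input_is_table=False):
    """Return a list of problems found; an empty list means no rule is violated.

    input_is_table (an assumption of this sketch) states whether the
    ColumnSet of TheInputConcept represents a table rather than a view.
    """
    problems = []

    kernel = params.get("KernelType")
    if kernel not in ALLOWED_KERNELS:
        problems.append("KernelType must be one of " + ", ".join(sorted(ALLOWED_KERNELS)))

    use_db = params.get("UseDB_SVM", "false")  # optional parameter
    if use_db not in ("true", "false"):
        problems.append("UseDB_SVM must be the String true or false")

    if use_db == "true":
        # Database version: restricted kernels, TheKey needed unless a table.
        if kernel in ALLOWED_KERNELS and kernel not in DB_KERNELS:
            problems.append("the database version supports only dot and radial")
        if "TheKey" not in params and not input_is_table:
            problems.append("TheKey is needed when TheInputConcept is a view")
    elif use_db == "false":
        # Main-memory version: SampleSize limits the number of training examples.
        if "SampleSize" not in params:
            problems.append("SampleSize is needed by the main-memory version")

    return problems
```

For example, check_svm_parameters({"KernelType": "anova", "UseDB_SVM": "true", "TheKey": "id"}) reports one problem, because the database version does not support the anova kernel.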

With the parameters LossFunctionPos and LossFunctionNeg, the loss function that is used for the regression can be biased such that predicting too high is more expensive (LossFunctionPos > LossFunctionNeg) or less expensive (LossFunctionNeg > LossFunctionPos) than predicting too low. If both values are equal, no bias is used. The parameter C balances training error against generalisation quality; positive values between 0.01 and 1000 have been used successfully in the literature. Epsilon limits the allowed error an example may produce; small values under 0.5 should be used.
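To illustrate how the two loss parameters and Epsilon interact, here is a sketch in Python of a weighted epsilon-insensitive loss for a single example. The text does not give the exact formula the operator uses, so the form below (deviations within Epsilon cost nothing, larger deviations are penalised linearly with a side-specific weight) is an assumption based on the common epsilon-insensitive loss; the function name is hypothetical.

```python
def asymmetric_epsilon_loss(predicted, target, epsilon=0.1,
                            loss_pos=1.0, loss_neg=1.0):
    """Loss of one prediction under an assumed weighted epsilon-insensitive loss.

    loss_pos weights predictions that are too high (LossFunctionPos),
    loss_neg weights predictions that are too low (LossFunctionNeg).
    """
    deviation = predicted - target
    if deviation > epsilon:
        # Predicted too high: penalise the part exceeding epsilon.
        return loss_pos * (deviation - epsilon)
    if deviation < -epsilon:
        # Predicted too low: penalise the part exceeding epsilon.
        return loss_neg * (-deviation - epsilon)
    # Within the epsilon tube: no loss.
    return 0.0
```

With loss_pos=2.0 and loss_neg=1.0, overestimating by a given amount costs twice as much as underestimating by the same amount; with equal weights the loss is symmetric, which corresponds to the unbiased case described above.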

ParameterName            ObjType  Type  Remarks
TheInputConcept          CON      IN    inherited
TheTargetAttribute       BA       IN    inherited
ThePredictingAttributes  BA List  IN
KernelType               V        IN    see explanation above
SampleSize               V        IN    see explanation above
LossFunctionPos          V        IN    positive real; try 1.0
LossFunctionNeg          V        IN    positive real; try 1.0
C                        V        IN    positive real; try 1.0
Epsilon                  V        IN    positive real; try 0.1
UseDB_SVM                V        IN    optional; one of true, false
TheKey                   BA       IN    optional
TheOutputAttribute       BA       OUT   inherited