MiningMart Approach Part III

Operators

As mentioned before, each step is related to exactly one operator and holds all of its input arguments. An operator performs a data transformation such as discretization, handling of null values, aggregation of attributes into a new one, or collection of sequences from time-stamped data. The operators access the database directly and are capable of handling large amounts of data.

Machine learning is not restricted to the data mining step; it is also applicable in preprocessing. This view opens up a variety of learning tasks that are not as well investigated as the learning of classifiers. For instance, an important task is to acquire events and their durations (i.e., time intervals) from time series (i.e., measurements at time points).
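
As a rough illustration of this task, the following Python sketch turns measurements at time points into labelled time intervals. It is not a MiningMart operator: the function name series_to_intervals, the fixed threshold and the labels "high" and "low" are assumptions made for this example, and a learning method would derive such boundaries from the data instead of taking them as input.

```python
def series_to_intervals(points, threshold):
    """Group (time, value) measurements into (label, start, end) intervals.

    Assumes `points` is sorted by time. Consecutive measurements that fall
    on the same side of `threshold` are merged into one interval.
    """
    intervals = []
    for time, value in points:
        label = "high" if value >= threshold else "low"
        if intervals and intervals[-1][0] == label:
            # Same event continues: extend the end of the current interval.
            intervals[-1] = (label, intervals[-1][1], time)
        else:
            # A new event starts at this time point.
            intervals.append((label, time, time))
    return intervals


measurements = [(0, 1.0), (1, 1.2), (2, 5.0), (3, 5.5), (4, 0.9)]
events = series_to_intervals(measurements, threshold=3.0)
```

Each resulting tuple names an event and the first and last time point at which it was observed, so its duration is the difference between the two.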

There are two kinds of operators, distinguished by their output on the conceptual level: those whose output is a Concept (concept operators and feature selection operators) and those whose output is a BaseAttribute (feature construction operators). All operators have parameters, such as the input Concept or the output BaseAttribute.

Concept operators

All Concept operators take an input Concept and create at least one new ColumnSet which they attach to the output Concept. The output Concept must have all its Features attached to it before the operator is compiled. All Concept operators have the two parameters TheInputConcept and TheOutputConcept, which are marked as inherited in the following parameter descriptions.

Joining concepts
MultiRelationalFeatureConstruction
JoinByKey
UnionByKey

Row Selection
RowSelectionByQuery
RowSelectionByRandomSampling
DeleteRecordsWithMissingValues
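
Since the operators work directly on the database, row selection can be pictured as a view defined over the input table. The sketch below uses Python's sqlite3 module for this. The helper select_rows_as_view, the table customers and its columns are hypothetical, and the SQL that the MiningMart compiler actually generates is not described here.

```python
import sqlite3


def select_rows_as_view(conn, input_table, output_view, condition):
    """Create a view holding the rows of `input_table` that satisfy `condition`.

    Hypothetical helper: names and condition are interpolated into the SQL
    text, so it must only be called with trusted strings.
    """
    conn.execute(
        f"CREATE VIEW {output_view} AS "
        f"SELECT * FROM {input_table} WHERE {condition}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, age INTEGER, income REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 25, 3000.0), (2, 40, None), (3, 61, 4200.0)],
)

# Selection by a user-supplied condition.
select_rows_as_view(conn, "customers", "customers_over_30", "age > 30")
# Dropping records with a missing value in one column.
select_rows_as_view(conn, "customers", "customers_complete", "income IS NOT NULL")

rows_over_30 = conn.execute("SELECT id FROM customers_over_30 ORDER BY id").fetchall()
rows_complete = conn.execute("SELECT id FROM customers_complete ORDER BY id").fetchall()
```

Defining the output as a view leaves the data in the database and avoids copying rows, which matters for large tables.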

Segmentation
SegmentationStratified
SegmentationByPartitioning
SegmentationWithKMean
UnSegment

Time series
Windowing
SimpleMovingFunction
WeightedMovingFunction
ExponentialMovingFunction
SignalToSymbolProcessing
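
To give an idea of what the three moving functions compute, here is a minimal Python sketch. It assumes that the function applied to each window is an average. The function names, signatures and the choice of averaging are assumptions made for this example and are not taken from the MiningMart operator specifications.

```python
def simple_moving_average(values, window):
    """Unweighted mean over each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def weighted_moving_average(values, weights):
    """Weighted mean over each window; the last weight applies to the newest value."""
    size = len(weights)
    total = sum(weights)
    return [
        sum(v * w for v, w in zip(values[i - size + 1 : i + 1], weights)) / total
        for i in range(size - 1, len(values))
    ]


def exponential_moving_average(values, alpha):
    """Exponential smoothing: each output mixes the new value with the previous output."""
    smoothed = []
    for value in values:
        if not smoothed:
            smoothed.append(value)  # the first output is the first value itself
        else:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed
```

The simple and weighted variants return one value per complete window, so their output is shorter than the input by the window size minus one. The exponential variant returns one value per input value.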

Misc
SpecifiedStatistics
Apriori

Feature selection operators

Feature selection operators are also concept operators in that their output is a Concept, but they are listed in their own section because they share some special properties. All of them (except FeatureSelectionByAttributes) use external algorithms to determine which features are carried over to the output Concept. This means that when an operator chain is designed, it is not yet known which features will be selected.

FeatureSelectionByAttributes
StochasticFeatureSelection
GeneticFeatureSelection
SGFeatureSelection

Feature construction operators

All operators in this section are loopable. Across loops, TheInputConcept remains the same, while TheTargetAttribute, TheOutputAttribute and further operator-specific parameters change from loop to loop (loop numbers start at 1).
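
The loop mechanism can be sketched in Python as follows. Only the parameter names TheTargetAttribute and TheOutputAttribute come from the description above. The representation of a Concept as a dictionary of columns, the helper run_looped and the example transformation assign_default are assumptions made for illustration.

```python
def assign_default(values, default):
    """Example transformation: replace missing entries (None) with `default`."""
    return [default if v is None else v for v in values]


def run_looped(concept, loops, transform):
    """Apply `transform` once per loop to the same input concept.

    `concept` maps attribute names to column values. `loops` maps loop
    numbers (starting at 1) to that loop's parameters.
    """
    result = dict(concept)  # the input concept itself is left unchanged
    for loop_number in sorted(loops):
        params = dict(loops[loop_number])
        target = params.pop("TheTargetAttribute")
        output = params.pop("TheOutputAttribute")
        # The remaining entries are the operator-specific parameters.
        result[output] = transform(concept[target], **params)
    return result


concept = {"age": [25, None, 61], "city": ["Bonn", None, "Ulm"]}
loops = {
    1: {"TheTargetAttribute": "age", "TheOutputAttribute": "age_filled", "default": 0},
    2: {"TheTargetAttribute": "city", "TheOutputAttribute": "city_filled", "default": "unknown"},
}
result = run_looped(concept, loops, assign_default)
```

Each loop reads one attribute of the same input and adds one new output attribute, so a single step can construct several features at once.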

Missing Value
AssignAverageValue
AssignModalValue
AssignMedianValue
AssignDefaultValue
AssignStochasticValue
MissingValuesWithDecisionTree
MissingValuesWithRegressionSVM
MissingValueWithDecisionRules
MissingValueWithDecisionTree
AssignPredictedValueCategorial
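
The simplest of these operators replace a missing value with a statistic of the values that are present. A minimal Python sketch of that idea is shown below, with the mean, mode and median supplied by the standard statistics module. The helper fill_missing and the use of None for a missing value are assumptions. The MiningMart operators work on database columns rather than Python lists.

```python
import statistics


def fill_missing(values, strategy):
    """Replace None entries with `strategy` applied to the non-missing values."""
    present = [v for v in values if v is not None]
    fill_value = strategy(present)
    return [fill_value if v is None else v for v in values]


with_average = fill_missing([1.0, None, 3.0], statistics.mean)
with_modal = fill_missing([1, 1, None, 5], statistics.mode)
with_median = fill_missing([1, None, 3, 10], statistics.median)
```

Operators that predict the missing value, for example with a decision tree, follow the same pattern but compute the replacement per record from the other attributes.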

Scaling
LinearScaling
LogScaling
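
As an illustration of the two scaling operators, the Python sketch below maps values linearly onto a target range and applies a logarithm with a chosen base. The function names and parameters are assumptions made for this example and do not reproduce the operators' actual parameter lists.

```python
import math


def linear_scaling(values, new_min, new_max):
    """Map values linearly so that min(values) -> new_min and max(values) -> new_max."""
    old_min, old_max = min(values), max(values)
    span = old_max - old_min
    if span == 0:
        # A constant column carries no spread; map every value to new_min.
        return [new_min for _ in values]
    factor = (new_max - new_min) / span
    return [new_min + (v - old_min) * factor for v in values]


def log_scaling(values, base):
    """Apply the logarithm to `base`; all values must be strictly positive."""
    return [math.log(v, base) for v in values]
```

Linear scaling preserves the relative distances between values, while log scaling compresses large values and is only defined for positive input.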

Data Mining
SupportVectorMachineForRegression
SupportVectorMachineForClassification
ComputeSVMError
PredictionWithDecisionRules
PredictionWithDecisionTree

Misc
GenericFeatureConstruction

Discretization
TimeIntervalManualDiscretization
NumericIntervalManualDiscretization
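
Manual discretization of a numeric attribute can be pictured as a lookup of each value in a list of user-defined interval bounds. The Python sketch below uses the standard bisect module for the lookup. The function discretize, its parameters and the convention that a bound belongs to the interval above it are assumptions made for this example.

```python
import bisect


def discretize(values, bounds, labels):
    """Assign each value the label of the interval it falls into.

    `bounds` must be sorted in ascending order, and `labels` needs one more
    entry than `bounds`. A value equal to a bound goes to the upper interval.
    """
    if len(labels) != len(bounds) + 1:
        raise ValueError("need exactly one more label than bounds")
    return [labels[bisect.bisect_right(bounds, v)] for v in values]


age_groups = discretize([5, 18, 30, 70], [18, 65], ["young", "adult", "senior"])
```

The same lookup applies to time values if the bounds are given as timestamps of a comparable type.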
