Page 200 - Data Science Algorithms in a Week
184 Luis Rabelo, Edgar Gutierrez, Sayli Bhide et al.
Correlation
High correlation with class attribute
Low correlation with other attributes
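The two correlation criteria above can be combined into a single score for a candidate subset. The sketch below uses the merit heuristic from correlation-based feature selection (CFS) as one possible way to do this; the text does not state which formula was used, so the choice of CFS, the function names pearson and cfs_merit, and the toy data are all assumptions made for illustration.

```python
import math
from itertools import combinations
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def cfs_merit(columns, target):
    """CFS-style merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean |correlation| between each attribute and the class;
    r_ff is the mean |correlation| between pairs of attributes.
    """
    k = len(columns)
    r_cf = mean(abs(pearson(c, target)) for c in columns)
    pairs = list(combinations(columns, 2))
    r_ff = mean(abs(pearson(a, b)) for a, b in pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Hypothetical toy data: the class value is the sum of two uncorrelated attributes.
a = [1, -1, 1, -1]
c = [1, 1, -1, -1]
dup = [2, -2, 2, -2]          # rescaled copy of `a`, fully redundant
y = [ai + ci for ai, ci in zip(a, c)]

merit_single = cfs_merit([a], y)
merit_complementary = cfs_merit([a, c], y)   # adds new information
merit_redundant = cfs_merit([a, dup], y)     # adds nothing new
```

In this toy case, adding the complementary attribute raises the merit, while adding the redundant copy leaves it unchanged, which is the behavior the two criteria are meant to encourage.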
Another important factor is the selection of individual attributes and of subsets of them. The
direction of the search (e.g., best first, forward selection) is also an important decision. In
addition, the approach selected for the RCC problem was a model of models. It is also
very important to look for kernels, levels of interaction, and synthetic attributes.
Visualization is always important (many visualization tools are available).
We learned from visualizations that the relative location of the panel and the position of a
specific point in the area of a panel are important factors to differentiate the level of wear
and deterioration (Figure 11).
Attribute subset evaluators and cross-validation were used with best-first and
backward search (starting from the complete attribute set), with neural networks
(backpropagation) as the underlying model. This was done to better understand the data.
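A backward search scored by cross-validation can be sketched as follows. This is a minimal illustration and not the procedure used in the study: a nearest-centroid classifier stands in for the backpropagation neural network to keep the example short, the search is plain greedy elimination instead of best-first, and all function names and the toy data are hypothetical.

```python
from statistics import mean


def nearest_centroid_accuracy(train, test, features):
    """Fit class centroids on `train` and return accuracy on `test`.

    Each row is (attribute_list, label); only the `features` indices are used.
    """
    by_label = {}
    for x, label in train:
        by_label.setdefault(label, []).append([x[j] for j in features])
    centroids = {
        label: [mean(col) for col in zip(*pts)] for label, pts in by_label.items()
    }
    hits = 0
    for x, label in test:
        point = [x[j] for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        hits += predicted == label
    return hits / len(test)


def cv_score(rows, features, folds=3):
    """Mean accuracy over `folds` interleaved cross-validation folds."""
    scores = []
    for f in range(folds):
        test = rows[f::folds]
        train = [r for i, r in enumerate(rows) if i % folds != f]
        scores.append(nearest_centroid_accuracy(train, test, features))
    return mean(scores)


def backward_elimination(rows, n_features, folds=3):
    """Start from the complete set; drop attributes while CV accuracy does not fall."""
    selected = list(range(n_features))
    best = cv_score(rows, selected, folds)
    while len(selected) > 1:
        candidates = [[j for j in selected if j != drop] for drop in selected]
        score, subset = max((cv_score(rows, c, folds), c) for c in candidates)
        if score < best:
            break
        best, selected = score, subset
    return selected, best


# Hypothetical toy data: attribute 0 separates the classes, attribute 1 is noise.
rows = [
    ([0.0, 40.0], 0), ([1.0, -30.0], 1), ([0.1, -50.0], 0), ([0.9, 60.0], 1),
    ([0.2, 10.0], 0), ([1.1, -20.0], 1), ([0.0, -45.0], 0), ([1.0, 35.0], 1),
    ([0.1, 55.0], 0), ([0.9, -60.0], 1), ([0.2, -15.0], 0), ([1.1, 25.0], 1),
]
selected, score = backward_elimination(rows, n_features=2)
```

On this toy data the search discards the noisy attribute and keeps attribute 0. Swapping the scoring function for a neural network would leave the search loop itself unchanged.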
Figure 11: Visualization of the average deterioration of specific panels for the three NASA shuttles.
Synthetic Attributes
Synthetic attributes are combinations of single attributes that can contribute to
the performance of a predictor model. Synthetic attributes create higher-dimensional
feature spaces, and these higher-dimensional feature spaces support better classification
performance. For example, Cosine(X * Y) is a synthetic variable formed by the single
2