Page 202 - Data Science Algorithms in a Week

Glossary of Algorithms and Methods in Data Science


                      Time series analysis: The analysis of data dependent on time; it mainly includes
                      the analysis of trend and seasonality.
                      Support vector machines: A classification algorithm that finds the hyperplane
                      that separates the training data into the given classes with the widest possible
                      margin. New data is then classified by the side of the hyperplane it falls on.
                      Principal component analysis: A transformation of the given data onto a new set
                      of uncorrelated axes, the principal components, ordered by how much of the
                      variance in the data each one explains. Keeping only the first few components
                      reduces the dimensionality of the data while preserving most of its variance.
                      Text mining: The search and extraction of text and its possible conversion to
                      numerical data used for data analysis.
                      Neural networks: A machine learning algorithm consisting of a network of
                      simple classifiers making decisions based on the input or the results of the other
                      classifiers in the network.
                      Deep learning: Machine learning with neural networks that have many layers,
                      where each layer learns a more abstract representation of the input than the
                      layer before it.
                      A priori association rules: Rules of the form "if A, then B" that hold frequently
                      in the training data (mined, for example, by the Apriori algorithm) and on the
                      basis of which future data can be classified.
                      PageRank: A link analysis algorithm for ranking search results. A page is
                      considered more relevant the more incoming web links it has from pages that are
                      themselves relevant. In mathematical terms, PageRank calculates the principal
                      eigenvector of the link matrix; its entries are these measures of relevance.
                      Ensemble learning: A method of learning where the predictions of several
                      learning algorithms or models are combined to reach a final conclusion.
                      Bagging: A method of classifying a data item by the majority vote of classifiers,
                      each trained on a random sample of the training data drawn with replacement
                      (a bootstrap sample).
                      Genetic algorithms: Search and optimization algorithms inspired by genetic
                      processes, for example, an evolution in which the classifiers with the best
                      accuracy are selected, recombined, mutated, and trained further.
                      Inductive inference: A machine learning method that learns the rules that
                      produced the observed data.
                      Bayesian networks: A directed acyclic graph model representing random
                      variables and their conditional dependencies.
                      Singular value decomposition: A factorization of a matrix that generalizes
                      eigendecomposition to arbitrary rectangular matrices; it is used, for example,
                      in least squares methods.
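                      Boosting: A machine learning meta-algorithm that primarily decreases the bias
                      (and also the variance) of the estimation by training classifiers one after
                      another, each concentrating on the errors of its predecessors, and making a
                      prediction based on the resulting ensemble.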
                      Expectation maximization: An iterative method for finding the parameters of a
                      model with unobserved (latent) variables that locally maximize the likelihood
                      of the observed data.
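
                      As an illustration of the PageRank entry above, the following is a minimal
                      sketch of how the relevance eigenvector might be approximated by power
                      iteration. The function name pagerank, the damping factor of 0.85, the
                      tolerance, and the four-page toy_web are assumptions made for this example;
                      they do not come from the glossary.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate PageRank by power iteration (illustrative sketch).

    links maps each page to the list of pages it links to.
    The damping factor 0.85 is a commonly used value, assumed here.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iter):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes an equal share of its rank along every outgoing link.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share

        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:  # stop once the vector has stopped moving
            break
    return rank


# Hypothetical four-page web in which every other page links to "a".
toy_web = {"a": ["b"], "b": ["a"], "c": ["a"], "d": ["a", "b"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

                      The ranks sum to 1, and the vector the loop converges to is the eigenvector
                      with eigenvalue 1 of the damped link matrix. In this toy web, page "a" ranks
                      highest because it collects links from all the other pages.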


