Chapter: Data Warehousing and Data Mining : Association Rule Mining and Classification

Prediction

Regression analysis models the relationship between one or more independent (predictor) variables and a dependent (response) variable.


(Numerical) prediction is similar to classification

·         construct a model

·        use the model to predict a continuous or ordered value for a given input

Prediction is different from classification

·        Classification predicts categorical class labels

·        Prediction models continuous-valued functions

Major method for prediction: regression

 

·        model the relationship between one or more independent or predictor variables and a dependent or response variable 

Regression analysis includes:

 

·        Linear and multiple regression

·        Non-linear regression

·        Other regression methods: generalized linear models, Poisson regression, log-linear models, regression trees

 

Linear Regression

Linear regression: involves a response variable y and a single predictor variable x

 

y = w0 + w1 x

 

where w0 (y-intercept) and w1 (slope) are regression coefficients

 

Method of least squares: estimates the best-fitting straight line
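
As a minimal sketch of the method of least squares for y = w0 + w1 x: the slope estimate is w1 = sum((xi - mean_x)(yi - mean_y)) / sum((xi - mean_x)^2) and the intercept is w0 = mean_y - w1 mean_x. The function name fit_simple_linear and the sample data are illustrative assumptions, not part of the original notes.

```python
def fit_simple_linear(xs, ys):
    """Return least-squares estimates (w0, w1) for the line y = w0 + w1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    w1 = sxy / sxx               # slope
    w0 = y_mean - w1 * x_mean    # y-intercept
    return w0, w1


# Hypothetical training data generated from y = 1 + 2x (no noise).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w0, w1 = fit_simple_linear(xs, ys)
predicted = w0 + w1 * 5.0        # predict the response for a new input x = 5
print(w0, w1, predicted)
```

Because the sample points lie exactly on a line, the fitted coefficients recover the intercept and slope used to generate them; with noisy data the same formulas give the line that minimizes the sum of squared residuals.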

·        Multiple linear regression: involves more than one predictor variable

 

·        Training data is of the form (X1, y1), (X2, y2), …, (X|D|, y|D|)

·        Ex. For 2-D data, we may have: y = w0 + w1 x1 + w2 x2

 

·        Solvable by an extension of the least-squares method, or with statistical software such as SAS or S-Plus

·        Many nonlinear functions can be transformed into the above
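
One common way to extend least squares to several predictors is to solve the normal equations (X^T X) w = X^T y, where each row of X is a training tuple with a leading 1 for the intercept. The sketch below does this in plain Python with Gaussian elimination. The names solve_linear_system and fit_multiple_linear and the sample data are assumptions made for illustration; in practice a statistics package would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_linear(rows, ys):
    """Return [w0, w1, ..., wk] for y = w0 + w1*x1 + ... + wk*xk."""
    design = [[1.0] + list(r) for r in rows]   # prepend 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical 2-D training data generated from y = 1 + 2*x1 + 3*x2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ys = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in rows]

weights = fit_multiple_linear(rows, ys)
print(weights)   # close to [1.0, 2.0, 3.0], up to floating-point rounding
```

Solving the normal equations directly is simple to follow but can be numerically fragile when predictors are nearly collinear; library routines typically use more stable decompositions.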

 

Nonlinear Regression

 

o     Some nonlinear models can be modeled by a polynomial function

 

o     A polynomial regression model can be transformed into a linear regression model. For example,

o     y = w0 + w1 x + w2 x^2 + w3 x^3

o     convertible to linear with new variables: x2 = x^2, x3 = x^3

o     y = w0 + w1 x + w2 x2 + w3 x3

 

o     Other functions, such as power functions, can also be transformed to a linear model (e.g., y = a x^b becomes ln y = ln a + b ln x)

 

o     Some models are intractably nonlinear (e.g., a sum of exponential terms)

o     For such models, it is still possible to obtain least-squares estimates through extensive calculation on more complex formulae
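
The polynomial transformation above can be sketched as follows: each input x is mapped to the new variables (x, x^2, x^3), and the cubic is then fitted as an ordinary multiple linear regression. This is a minimal illustration that solves the normal equations in plain Python; the names solve, fit_cubic, and predict_cubic and the sample data are assumptions, not part of the original notes.

```python
def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a small system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def fit_cubic(xs, ys):
    """Fit y = w0 + w1*x + w2*x^2 + w3*x^3 as a linear model in x, x^2, x^3."""
    # New variables: the columns are 1, x, x2 = x^2, x3 = x^3.
    design = [[1.0, x, x ** 2, x ** 3] for x in xs]
    k = 4
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict_cubic(weights, x):
    """Evaluate the fitted cubic at a new input x."""
    w0, w1, w2, w3 = weights
    return w0 + w1 * x + w2 * x ** 2 + w3 * x ** 3


# Hypothetical data generated from y = 1 + 2x - x^2 + 0.5x^3 (no noise).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x - x ** 2 + 0.5 * x ** 3 for x in xs]

weights = fit_cubic(xs, ys)
prediction = predict_cubic(weights, 4.0)
print(weights, prediction)
```

The model stays nonlinear in x but is linear in the coefficients w0 to w3, which is why the linear least-squares machinery still applies.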

