**WHY ARE THERE TWO REGRESSION LINES?**

Two regression lines can exist in certain circumstances. When the
variables *X* and *Y* are interchangeable with respect to cause and
effect, one can treat *X* as the independent variable and *Y* as the
dependent variable, or *Y* as the independent variable and *X* as the
dependent variable. As a result, we have (1) **the regression line of *Y* on *X*** and (2) **the regression line of *X* on *Y***.

Both are valid regression lines, but we must judiciously select the
one regression equation that suits the given context.
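
As a minimal sketch of why the two lines differ, the example below fits both the line of *Y* on *X* and the line of *X* on *Y* to the same data. The data values are hypothetical, and the use of least squares (through `numpy.polyfit`) as the fitting method is an assumption made for illustration.

```python
import numpy as np

# Hypothetical paired observations (illustrative values, not from the text).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Regression line of Y on X: y = a_yx + b_yx * x
# np.polyfit with degree 1 returns (slope, intercept) of the least-squares line.
b_yx, a_yx = np.polyfit(x, y, 1)

# Regression line of X on Y: x = a_xy + b_xy * y
# The roles of the two variables are simply swapped.
b_xy, a_xy = np.polyfit(y, x, 1)

print(f"Y on X: y = {a_yx:.2f} + {b_yx:.2f} x")
print(f"X on Y: x = {a_xy:.2f} + {b_xy:.2f} y")
```

With these values the slope of *Y* on *X* is about 0.6 while the slope of *X* on *Y* is about 1.0, so the two equations describe different straight lines through the same data.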

**Note:** If *X* only causes *Y*, then there is only one regression line, that of *Y* on *X*.

In the general form of the simple linear regression equation of *Y*
on *X*,

*Y* = *a* + *bX* + *e*

the constants ‘*a*’ and ‘*b*’ are generally called
the regression coefficients.

The coefficient ‘*b*’ represents the rate of change in the
mean of *Y* for every unit change in the value of *X*.
When the range of *X* includes 0, the intercept ‘*a*’ is *E*(*Y*|*X*
= 0). If the range of *X* does not include 0, then ‘*a*’ has no
practical interpretation.

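If (*x_{i}*, *y_{i}*), *i* = 1, 2, ..., *n*, are *n* pairs of observations on (*X*, *Y*), then fitting the regression equation means finding estimates of ‘*a*’ and ‘*b*’ from these data.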

These estimates are determined based on the following general
assumptions:

(i) the relationship between *Y* and *X*
is approximately linear.

(ii) the error term ‘*e*’ is a random
variable with mean zero.

(iii) the error term ‘*e*’ has constant variance.

There are other assumptions on ‘*e*’, which are not required
at this level of study.

Before studying further, keep the following points in mind.

Both the independent and dependent variables
must be measured on an interval scale.

There must be a **linear relationship**
between the independent and dependent variables.

Linear regression is very sensitive to **outliers** (extreme observations). They can severely affect
the regression line and, in turn, the estimated values of *Y*.
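
The sensitivity to outliers can be illustrated with a small sketch. The data below are hypothetical, and least squares (via `numpy.polyfit`) is assumed as the fitting method; a single extreme observation is enough to change the fitted line noticeably.

```python
import numpy as np

# Hypothetical data lying exactly on the line y = 2x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Least-squares fit on the clean data; polyfit returns (slope, intercept).
b_clean, a_clean = np.polyfit(x, y, 1)

# Replace the last observation with an extreme value (an outlier).
y_out = y.copy()
y_out[-1] = 30.0
b_out, a_out = np.polyfit(x, y_out, 1)

print(f"clean data:   slope = {b_clean:.2f}, intercept = {a_clean:.2f}")
print(f"with outlier: slope = {b_out:.2f}, intercept = {a_out:.2f}")
```

In this sketch the slope moves from about 2 to about 6 after one observation is changed, so every estimated value of *Y* from the second line shifts as well.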

By assumption (ii), the response variable *Y* is
also a random variable, with mean

*E*(*Y*|*X* = *x*) = *a* + *bx*

In regression analysis, the main objective is to find the line of
best fit, which provides the fitted equation of *Y* on *X*.

The line of ‘best fit’ is the straight line that
minimizes the error in estimating the dependent variable *Y* for
any specified value of the independent variable *X* within its range.

The regression equation *E*(*Y*|*X* = *x*) = *a* + *bx*
represents a family of straight lines, one for each pair of values of the coefficients
‘*a*’ and ‘*b*’. The problem is to choose the estimates of ‘*a*’
and ‘*b*’ that minimize the error in estimating *Y*, so that
the resulting line is the best fit.
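
One common way to make "minimizing the error" precise is the least-squares criterion, which minimizes the sum of squared errors. The text above does not name a criterion, so treating it as least squares is an assumption of this sketch, as are the data values and the helper name `sse`. The example computes the standard least-squares estimates and checks that nearby members of the family of lines have a larger sum of squared errors.

```python
import numpy as np

# Hypothetical paired observations (illustrative values, not from the text).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

def sse(a, b):
    """Sum of squared errors of the line y = a + b*x on the data."""
    return float(np.sum((y - (a + b * x)) ** 2))

# Standard least-squares estimates of the slope and intercept.
b_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a_hat = y.mean() - b_hat * x.mean()

best = sse(a_hat, b_hat)

# Perturb the estimates to get other lines from the same family
# and compare their sums of squared errors with the fitted line.
others = [sse(a_hat + da, b_hat + db)
          for da in (-0.5, 0.5) for db in (-0.2, 0.2)]

print(f"a_hat = {a_hat:.2f}, b_hat = {b_hat:.2f}, SSE = {best:.2f}")
print("all perturbed lines fit worse:", all(o > best for o in others))
```

Comparing against a handful of perturbed lines is only a spot check, not a proof; the estimates minimize the sum of squared errors over all choices of ‘*a*’ and ‘*b*’ because that sum is a convex quadratic function of the two coefficients.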
