Regression finds a best-fit line (or curve) that describes the form of the relationship between two variables. For a straight-line fit, this line is called the least squares regression line. The fit is given in the form of an equation:
y = mx + b for a straight line regression
y = ax^2 + bx + c for a quadratic regression
y = ab^x + c for an exponential regression

Usually, the regression line will be created using a set of data. Subsequently, the equation of the line found through the regression process will be used to make a prediction of an output (y) value for a proposed value of the input (x).

R (Correlation Coefficient)
R measures the strength of the linear relationship between x and y. It does not detect a curved relationship, even a strong one. The closer R is to 0, the poorer a straight line is as a description of the data [bad fit]; the closer it is to -1 or 1, the stronger the fit. Below are some examples of correlation coefficients. The graph in the lower right corner, with r = -.99, has the best fit of all 6 graphs: because r is so close to -1, the points lie almost exactly on the line. The graph in the upper left corner, on the other hand, has the worst fit: r = 0, so it shows no linear fit at all.

R^2 is known as the coefficient of determination. It is the proportion of the variation in the y values that is explained by the least squares regression line. A high R^2 indicates a good linear fit.
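
As a rough illustration of the point that R only measures linear association, the sketch below computes r from its definition for two small made-up data sets (the numbers and the helper name `pearson_r` are invented for illustration, not taken from these notes): one lying exactly on a line and one lying exactly on a parabola.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r computed directly from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: same x values, two different shapes.
xs = [-2, -1, 0, 1, 2]
line = [2 * x + 1 for x in xs]   # perfectly linear
curve = [x ** 2 for x in xs]     # perfectly curved (a parabola)

r_line = pearson_r(xs, line)
r_curve = pearson_r(xs, curve)

print("line:  r =", round(r_line, 3), " r^2 =", round(r_line ** 2, 3))
print("curve: r =", round(r_curve, 3), " r^2 =", round(r_curve ** 2, 3))
```

The parabola is a perfect relationship, yet its r is essentially 0, which is why a residual plot is worth checking alongside R.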

*Linear

The slope of the least squares regression line is slope = R * (standard deviation of y) / (standard deviation of x), and the point (mean of x, mean of y) always lies on the line.

From the regression line, you can calculate the residuals. A residual is the observed y value minus the value predicted by the regression line.
A residual plot is a SCATTER PLOT of the residuals against the x values.
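
The slope formula and the residual calculation can be sketched in a few lines. The data below are hypothetical (chosen only to keep the arithmetic small), and the variable names are assumptions for illustration. The sketch builds the line from r, the two standard deviations, and the two means, then computes each residual as observed minus predicted.

```python
import statistics

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = statistics.mean(xs)
mean_y = statistics.mean(ys)
sd_x = statistics.stdev(xs)   # sample standard deviation
sd_y = statistics.stdev(ys)

# Correlation coefficient r from its definition.
r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ((n - 1) * sd_x * sd_y)

# slope = r * sd_y / sd_x; the line passes through (mean_x, mean_y).
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

# Residual = observed y - predicted y.
predicted = [slope * x + intercept for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("residuals:", [round(e, 3) for e in residuals])
```

Plotting `residuals` against `xs` would give the residual plot. For a least squares line the residuals always sum to (approximately) zero, so the plot is judged by its pattern, not by its average.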

How to find the regression line

1. On the calculator, enter your data as 2 lists of equal length.
2. Go to the CALC section under STATS and choose 4: LinReg (ax+b), then press enter.
3. Type your 2 lists, separated by a comma, and then VARS, Y-VARS, Function, and then Y1.
4. Your R and R^2 will appear, and the least squares regression line is the equation shown as y=.

This regression line is a good fit because it represents the linear relationship between x and y .

This scatterplot would not have a good fit to a linear regression line because the points demonstrate a nonlinear relationship.

To determine whether the line is a good fit, check:
- R
- Residual plot
If the residual plot shows no systematic pattern, the linear fit is good.

This residual plot shows the residuals (distance of observed points from the predicted point on the regression line)

*Quadratic

If the graph is quadratic or exponential, the data have to be re-expressed (transformed) so that they have a good linear fit.

Power functions have x raised to a power, for example y = 4x^3.
For a power function, the re-expressed points (log(x), log(y)) have a strong linear fit.
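
Why the log-log re-expression works: taking logs of y = 4x^3 gives log(y) = log(4) + 3 log(x), which is a straight line in (log(x), log(y)) with slope 3. The sketch below checks this numerically; the x values are arbitrary choices for illustration.

```python
import math

# Points on the power function y = 4x^3 (x values chosen arbitrarily).
xs = [1, 2, 4, 8]
ys = [4 * x ** 3 for x in xs]

# Re-express both coordinates with base-10 logs.
log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]

# Slope between each pair of neighbouring re-expressed points.
slopes = [
    (log_y[i + 1] - log_y[i]) / (log_x[i + 1] - log_x[i])
    for i in range(len(xs) - 1)
]

# Intercept of the line through the re-expressed points.
intercept = log_y[0] - slopes[0] * log_x[0]

print("slopes:", [round(s, 6) for s in slopes])          # all equal the power, 3
print("10**intercept:", round(10 ** intercept, 6))       # recovers the coefficient, 4
```

Every slope is the same, so the re-expressed points are collinear; the slope recovers the power and the intercept recovers the log of the coefficient.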

*Exponential

An exponential function has x in the exponent, for example y = 3^x.
For an exponential function, the re-expressed points (x, log(y)) have a strong linear fit.
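
For y = 3^x, taking logs gives log(y) = x log(3), so only y needs to be re-expressed. A minimal check, with x values chosen arbitrarily for illustration: equal steps in x produce equal steps in log(y), but not in y itself.

```python
import math

# Points on the exponential function y = 3^x (x values chosen arbitrarily).
xs = [0, 1, 2, 3, 4]
ys = [3 ** x for x in xs]

# Re-express only y with base-10 logs.
log_y = [math.log10(y) for y in ys]

# Step-to-step changes before and after re-expression.
raw_steps = [ys[i + 1] - ys[i] for i in range(len(xs) - 1)]
log_steps = [log_y[i + 1] - log_y[i] for i in range(len(xs) - 1)]

print("steps in y:     ", raw_steps)                       # keep growing: curved
print("steps in log(y):", [round(s, 6) for s in log_steps])  # constant: linear
```

The constant step in log(y) is log(3), the slope of the re-expressed line.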

How to Find the Equation Using the Calculator
1. On the calculator, enter your data as 2 lists of equal length.
2. Go to the CALC section under STATS and choose 0: ExpReg, then press enter.
- Use ExpReg here because a line from LinReg would not be a good representation of exponential data.
3. Type your 2 lists, separated by a comma, and then VARS, Y-VARS, Function, and then Y1.
4. Re-express the equation using logarithms to make it linear.

Example 1:

| Wife's age (years) | Husband's age (years) |
|---|---|
| 22 | 25 |
| 32 | 25 |
| 50 | 51 |
| 25 | 25 |
| 33 | 38 |
| 27 | 30 |
| 45 | 60 |
| 47 | 54 |
| 30 | 31 |

1. Find the equation of the Least Squares Regression Line, correlation coefficient, and coefficient of determination.
2. Using the Least Squares Regression Line, what is the predicted age of the husband whose wife is 50? What is the value of the residual?

Example 2:

| Time (minutes) | Difference in temp |
|---|---|
| 10 | 68 |
| 20 | 36 |
| 30 | 20 |
| 40 | 10 |
| 50 | 6 |
| 60 | 4 |

1. What is the equation of the exponential graph?
2. Reexpress the equation as linear fit using logarithms.
3. Use the equation to predict the difference in temperature after 45 minutes.

Answers:
Example 1
1. LSRL: y = 1.244x - 5.317 (the equation that best matches the data), R = .921 (correlation coefficient), R^2 = .849
2. y = (1.244)(50) - 5.317 = 56.883 years old
Residual = observed - predicted
Residual = 51 - 56.883
Residual = -5.883

Example 2
1. Exponential regression: y = (114.055)(.944)^x
2. ln(y) = -0.0576x + 4.737
3. ln(y) = (-0.0576)(45) + 4.737
ln(y) = 2.145
y = e^2.145
y = 8.542

Harder Questions

A researcher uses a regression equation to predict home heating bills (dollar cost), based on home size (square feet). The correlation between predicted bills and home size is 0.70. What is the correct interpretation of this finding?
(A) 70% of the variability in home heating bills can be explained by home size.
(B) 49% of the variability in home heating bills can be explained by home size.
(C) For each added square foot of home size, heating bills increased by 70 cents.
(D) For each added square foot of home size, heating bills increased by 49 cents.
Answer: (B). r = .7, therefore r^2 = .49. r^2 is the proportion of the variation in the y values that is explained by the least squares regression line.

