# Linear Regression in a different way

An attempt to implement linear regression on a simple dataset without using existing regression library functions or the gradient descent technique.

## Theory

### Regression

A method of establishing a relationship between one or more independent variables and one dependent variable.

### Linear Regression

Regression in which that relationship is assumed to be linear, so the prediction is a straight line through the data.
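For a single independent variable this can be written as one equation. The symbols $m$ (slope) and $c$ (intercept) are notation assumed here for illustration; $\hat{y}$ denotes the predicted value.

$$
\hat{y} = m x + c
$$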

Let’s try implementing the same:

``````from google.colab import drive

drive.mount('/content/drive')
``````
``````Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
``````
``````import numpy as np
import pandas as pd
``````

``````df = pd.read_csv('/content/drive/My Drive/slr_data.csv')
``````
``````df.head()
``````

|   | x | y |
|---|---|---|
| 0 | 77 | 79.775152 |
| 1 | 21 | 23.177279 |
| 2 | 22 | 25.609262 |
| 3 | 20 | 17.857388 |
| 4 | 36 | 41.849864 |

``````df.describe()
``````

|   | x | y |
|---|---|---|
| count | 300.000000 | 300.000000 |
| mean | 50.936667 | 51.205051 |
| std | 28.504286 | 29.071481 |
| min | 0.000000 | -3.467884 |
| 25% | 27.000000 | 25.676502 |
| 50% | 53.000000 | 52.170557 |
| 75% | 73.000000 | 74.303007 |
| max | 100.000000 | 105.591837 |

``````import matplotlib.pyplot as plt
import seaborn as sns
``````
``````plt.figure(figsize = (20,10))

sns.relplot(x = 'x', y = 'y', data = df)

plt.show()
``````
``````<Figure size 1440x720 with 0 Axes>
``````

Linear regression is all about finding the straight line with the least root mean square error for the given data points. Such a line has to lie within the range spanned by the extreme y values, which keeps it running between the points of the dataset rather than above or below all of them.
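The root mean square criterion above can be made concrete with a minimal sketch. The helper `rmse` and the toy arrays are assumptions introduced for illustration; they are not part of the notebook and do not use `slr_data.csv`.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Hypothetical toy data, not taken from slr_data.csv
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 2.0, 4.0])

# Two candidate lines of the form y = m*x + c
rmse_a = rmse(y, 1.0 * x + 0.0)  # errors 0, 0, 0, -1
rmse_b = rmse(y, 2.0 * x + 0.0)  # errors 0, 1, 2, 2

print(rmse_a, rmse_b)  # 0.5 1.5
```

The line with the smaller value is the better fit under this criterion, so the first candidate would be preferred here.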

``````y_max = df.y.max()
``````
``````y_min = df.y.min()
``````
``````y_max
``````
``````105.5918375
``````
``````y_min
``````
``````-3.4678837889999996
``````
``````y_mid = (y_max + y_min) / 2
``````
``````y_mid
``````
``````51.0619768555
``````
``````x2 = df.x.max()
``````
``````x1 = df.x.min()
``````
``````df.x.idxmax()
``````
``````87
``````
``````df.x.idxmin()
``````
``````55
``````
``````y2 = df.y[df.x.idxmax()]
``````
``````y1 = df.y[df.x.idxmin()]
``````
``````y2
``````
``````105.5918375
``````
``````y1
``````
``````-1.040114209
``````
``````slope_m = (y2 - y1) / (x2 - x1)
``````
``````slope_m
``````
``````1.06631951709
``````
``````def lin_equ(x):
    # Line with slope slope_m through the midpoint of the rows holding the largest and smallest y
    x_c = (df.x[df.y.idxmax()] + df.x[df.y.idxmin()]) / 2
    return slope_m * (x - x_c) + (df.y.max() + df.y.min()) / 2
``````
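The construction behind `lin_equ` is a point-slope line: the slope comes from the rows with the smallest and largest `x`, and the line is anchored at the midpoint of the rows with the smallest and largest `y`. The sketch below repeats those steps on a tiny hypothetical DataFrame so that each number can be checked by hand. The names `toy`, `x_c`, `y_c`, and `line` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical toy data, not taken from slr_data.csv
toy = pd.DataFrame({"x": [0, 1, 2, 3, 4], "y": [1.0, 2.5, 5.5, 6.5, 9.0]})

# Slope from the rows holding the smallest and largest x
i_lo, i_hi = toy.x.idxmin(), toy.x.idxmax()
slope = (toy.y[i_hi] - toy.y[i_lo]) / (toy.x[i_hi] - toy.x[i_lo])  # (9 - 1) / (4 - 0)

# Anchor point: midpoint of the rows holding the smallest and largest y
j_lo, j_hi = toy.y.idxmin(), toy.y.idxmax()
x_c = (toy.x[j_lo] + toy.x[j_hi]) / 2  # (0 + 4) / 2
y_c = (toy.y[j_lo] + toy.y[j_hi]) / 2  # (1 + 9) / 2

def line(x):
    """Point-slope form: y = slope * (x - x_c) + y_c."""
    return float(slope * (x - x_c) + y_c)

print(line(0), line(2), line(4))  # 1.0 5.0 9.0
```

Note that this line is fixed by at most four rows of the data, so a single outlier among them shifts the whole fit.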
``````df.head()
``````

|   | x | y |
|---|---|---|
| 0 | 77 | 79.775152 |
| 1 | 21 | 23.177279 |
| 2 | 22 | 25.609262 |
| 3 | 20 | 17.857388 |
| 4 | 36 | 41.849864 |

``````lin_equ(77)
``````
``````79.85260381693
``````
``````lin_equ(21)
``````
``````20.13871085989
``````
``````lin_equ(22)
``````
``````21.20503037698
``````
``````lin_equ(20)
``````
``````19.0723913428
``````
``````lin_equ(36)
``````
``````36.13350361624
``````
``````df['y_pred'] = df.x.map(lin_equ)
``````
``````df.head()
``````

|   | x | y | y_pred |
|---|---|---|---|
| 0 | 77 | 79.775152 | 79.852604 |
| 1 | 21 | 23.177279 | 20.138711 |
| 2 | 22 | 25.609262 | 21.205030 |
| 3 | 20 | 17.857388 | 19.072391 |
| 4 | 36 | 41.849864 | 36.133504 |

``````plt.figure(figsize = (20,15))

sns.relplot(x = 'x', y = 'y_pred', data = df)

plt.show()
``````
``````<Figure size 1440x1080 with 0 Axes>
``````

``````plt.figure(figsize = (20,15))

sns.relplot(x = 'x', y = 'y', data = df)

sns.lineplot(x = 'x', y = 'y_pred', data = df)

plt.show()
``````
``````<Figure size 1440x1080 with 0 Axes>
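``````

To see how the line sits relative to the data, sum the squared errors separately for the points where it under-predicts (negative error) and where it over-predicts (positive or zero error):

``````neg_err_sum = sum([i*i for i in df.y_pred-df.y if i < 0])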
``````
``````pos_err_sum = sum([i*i for i in df.y_pred-df.y if i >= 0])
``````
``````neg_err_sum
``````
``````1228.4737143905872
``````
``````pos_err_sum
``````
``````2396.9837025596657
``````
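The two sums can be combined into a single root mean square error for the line. The sketch below only does arithmetic on the values printed above and assumes that all 300 rows reported by `df.describe()` contribute to one of the two sums. The names `sse`, `rmse`, and `over_share` are introduced here for illustration.

```python
import math

# Values copied from the outputs above
neg_err_sum = 1228.4737143905872
pos_err_sum = 2396.9837025596657
n_rows = 300  # row count reported by df.describe(); assumed to cover every error term

sse = neg_err_sum + pos_err_sum   # total sum of squared errors
rmse = math.sqrt(sse / n_rows)    # root mean square error of the line
over_share = pos_err_sum / sse    # fraction of squared error from over-predictions

print(f"SSE = {sse:.2f}, RMSE = {rmse:.3f}, over-prediction share = {over_share:.1%}")
```

Under that assumption the RMSE comes out at roughly 3.48, and about two thirds of the squared error comes from points where the line over-predicts, which suggests the line sits somewhat high relative to the data.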