# Univariate linear regression with TensorFlow

**See the example notebook: Univariate Linear Regression.ipynb**

# Multivariate linear regression with TensorFlow

**In the previous section, we used TensorFlow to build our first complete model: univariate linear regression. In this section, we build a multivariate linear model to perform regression on multidimensional data. We also introduce TensorBoard, the visualization tool that ships with TensorFlow, and use it to analyze the training process.**

**Import the required libraries**

```python
%matplotlib notebook

import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow.contrib.learn as skflow
from sklearn.utils import shuffle
import numpy as np
import pandas as pd
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
print(tf.__version__)
print(tf.test.is_gpu_available())
```
```
1.12.0
False
```

**Dataset introduction**

This dataset contains several factors related to Boston housing prices:

**CRIM**: Per-capita crime rate by town

**ZN**: Proportion of residential land zoned for lots over 25,000 sq. ft.

**INDUS**: Proportion of non-retail business land per town

**CHAS**: Charles River dummy variable (1 if the tract bounds the river; otherwise 0)

**NOX**: Nitric oxides concentration

**RM**: Average number of rooms per dwelling

**AGE**: Proportion of owner-occupied homes built before 1940

**DIS**: Weighted distance to five Boston employment centers

**RAD**: Index of accessibility to radial highways

**TAX**: Full-value property tax rate per \$10,000

**PTRATIO**: Pupil-teacher ratio by town

**LSTAT**: Percentage of lower-status population

**MEDV**: Median value of owner-occupied homes, in thousands of dollars

**The dataset is stored in CSV format and can be read and formatted with the Pandas library**

The **Pandas library** lets us quickly read data files of moderate size.

It can read CSV files, text files, MS Excel files, SQL databases, and HDF5 files used for scientific data.

The loaded data can be converted directly to NumPy multidimensional arrays.

**Import the data with Pandas**

```python
df = pd.read_csv("data/boston.csv", header=0)
print(df.describe())
```
```
             CRIM         ZN       INDUS         CHAS         NOX          RM  \
count  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000
mean     3.613524   11.363636   11.136779    0.069170    0.554695    6.284634
std      8.601545   23.322453    6.860353    0.253994    0.115878    0.702617
min      0.006320    0.000000    0.460000    0.000000    0.385000    3.561000
25%      0.082045    0.000000    5.190000    0.000000    0.449000    5.885500
50%      0.256510    0.000000    9.690000    0.000000    0.538000    6.208500
75%      3.677082   12.500000   18.100000    0.000000    0.624000    6.623500
max     88.976200  100.000000   27.740000    1.000000    0.871000    8.780000

              AGE         DIS         RAD         TAX     PTRATIO       LSTAT  \
count  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000
mean    68.574901    3.795043    9.549407  408.237154   18.455534   12.653063
std     28.148861    2.105710    8.707259  168.537116    2.164946    7.141062
min      2.900000    1.129600    1.000000  187.000000   12.600000    1.730000
25%     45.025000    2.100175    4.000000  279.000000   17.400000    6.950000
50%     77.500000    3.207450    5.000000  330.000000   19.050000   11.360000
75%     94.075000    5.188425   24.000000  666.000000   20.200000   16.955000
max    100.000000   12.126500   24.000000  711.000000   22.000000   37.970000

             MEDV
count  506.000000
mean    22.532806
std      9.197104
min      5.000000
25%     17.025000
50%     21.200000
75%     25.000000
max     50.000000
```

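**Load the data required for this example**

```python
df = np.array(df)

# Min-max scale each of the 12 feature columns to the range [0, 1]
for i in range(12):
    df[:, i] = (df[:, i] - df[:, i].min()) / (df[:, i].max() - df[:, i].min())

# Alternative (on the DataFrame, before the array conversion): use only three of the more important factors
# x_data = df[['CRIM', 'DIS', 'LSTAT']].values.astype(float)
x_data = df[:, :12]  # all 12 feature columns
# y_data = df['MEDV'].values.astype(float)
y_data = df[:, 12]   # target column (MEDV)
```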
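The loop above applies min-max scaling to each feature column. As a minimal plain-Python sketch of the same formula on a single column (the helper name `min_max_scale`, the sample values, and the guard for constant columns are assumptions added for illustration; the original loop has no such guard):

```python
def min_max_scale(column):
    """Rescale numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; return zeros instead of dividing by zero.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Hypothetical values, roughly in the range of the TAX column
tax = [187.0, 279.0, 330.0, 711.0]
scaled_tax = min_max_scale(tax)
print(scaled_tax)
```

Scaling matters here because the raw columns differ by orders of magnitude (compare NOX and TAX in the summary table), and a single learning rate tends to work poorly across such different scales.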

## Build the model

**Define placeholders for \(x\) and \(y\)**

```python
x = tf.placeholder(tf.float32, [None, 12], name="x")  # 12 features per sample
y = tf.placeholder(tf.float32, [None, 1], name="y")   # target value (MEDV)
```

**Create the variables and the model**

```python
with tf.name_scope("Model"):
    # One weight per feature, initialized with small random values
    w = tf.Variable(tf.random_normal([12, 1], stddev=0.01), name="w0")
    b = tf.Variable(1., name="b0")

    def model(x, w, b):
        return tf.matmul(x, w) + b

    pred = model(x, w, b)
```

In the TensorBoard graph, you can see that both b0 and w0 are under the namespace Model.
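The model defined above computes `tf.matmul(x, w) + b`, which for one sample is the weighted sum of its features plus the bias. A small plain-Python sketch of that arithmetic, assuming a hypothetical three-feature sample and made-up weights (the function name `predict` is not from the original):

```python
def predict(features, weights, bias):
    """Return the weighted sum of the features plus the bias for one sample."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights)) + bias


# Hypothetical numbers: 0.2*1.0 + 0.5*(-2.0) + 0.1*4.0 + 1.0
sample = [0.2, 0.5, 0.1]
weights = [1.0, -2.0, 4.0]
bias = 1.0
prediction = predict(sample, weights, bias)
print(prediction)
```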

**Supplementary introduction: the namespace name_scope**

A TensorFlow graph often contains thousands of nodes, which are difficult to display all at once in a visualization. name_scope groups variables and operations into named scopes, and each scope appears as one level of the computation graph in the visualization.

- name_scope **does not affect** the names of variables created with get_variable()
- name_scope **does affect** the names of variables created with Variable() and the names of operations (op_name)

For example, inside `with tf.name_scope("Model"):`, a variable created with `tf.Variable(..., name="w0")` is named `Model/w0`, whereas one created with `tf.get_variable("w0", ...)` is named just `w0`.

## Train the model

**Set the training parameters**

```python
train_epochs = 500    # number of epochs (full passes over the data)
learning_rate = 0.01  # learning rate
```

**Define the mean squared error loss function**

```python
with tf.name_scope("LossFunction"):
    loss_function = tf.reduce_mean(tf.pow(y - pred, 2))  # mean squared error (MSE)
```

Similarly, TensorBoard shows the operations (ops) under the namespace LossFunction: mean, pow, and sub (subtraction). These match our definition loss_function = tf.reduce_mean(tf.pow(y-pred, 2)).
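The three ops correspond to the three steps of the mean squared error: subtract, square, average. A minimal plain-Python sketch of the same calculation, using made-up targets and predictions (the function name `mean_squared_error` is an assumption for this example):

```python
def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]  # sub, then pow
    return sum(squared_errors) / len(squared_errors)                      # mean


# Hypothetical values: the errors are -1, 1 and 2, so the squares are 1, 1 and 4
targets = [24.0, 21.6, 34.7]
predictions = [25.0, 20.6, 32.7]
mse = mean_squared_error(targets, predictions)
print(mse)
```

When a single sample is fed at a time, the mean is taken over one element, so the loss is simply that sample's squared error.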

**Select the optimizer**

```python
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)
```
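The cell above delegates the parameter updates to tf.train.AdamOptimizer. As a rough illustration of what an Adam step does, here is a plain-Python sketch of the textbook Adam update for a single scalar parameter. It is not TensorFlow's implementation: the function name `adam_minimize`, the quadratic test function, and the step count are assumptions made for the example, and the `beta1`, `beta2`, and `epsilon` values are the commonly used defaults.

```python
import math


def adam_minimize(grad_fn, x0, learning_rate=0.01, beta1=0.9, beta2=0.999,
                  epsilon=1e-8, steps=3000):
    """Minimize a one-dimensional function with the textbook Adam update rule."""
    x = x0
    m = 0.0  # running average of the gradient (first moment)
    v = 0.0  # running average of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return x


# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

Because each step is divided by the running gradient magnitude, the early steps have a size close to the learning rate regardless of how large the raw gradient is.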

**Declare the session**

```python
sess = tf.Session()
init = tf.global_variables_initializer()
```

**Generate the graph protocol file**

```python
tf.train.write_graph(sess.graph, 'log2/boston', 'graph.pbtxt')
```
```
'log2/boston\\graph.pbtxt'
```
```python
# Record the loss as a scalar summary and merge all summaries into one op
loss_op = tf.summary.scalar("loss", loss_function)
merged = tf.summary.merge_all()
```

**Supplementary introduction: TensorBoard**

TensorBoard is the visualization tool that ships with TensorFlow.

It currently supports seven kinds of visualization: **SCALARS, IMAGES, AUDIO, GRAPHS, DISTRIBUTIONS, HISTOGRAMS, and EMBEDDINGS**.

During training, structured data is recorded to log files; a local server, listening on port 6006 by default, then serves the visualizations.

First specify the data to record, then open the TensorBoard panel with the following command:

**tensorboard --logdir=/your/log/path**

After you run the command, TensorBoard prints the address it is serving on. Open that address in a browser, in this case **http://192.168.2.102:6006**, to view the panels.

Note: The IP address varies from machine to machine; use the address shown after "You can navigate to" in the command window.

In this example, the value of loss_function is recorded with **tf.summary.scalar**, so the loss curve can be viewed in the SCALARS panel of TensorBoard.

**Start the session**

```python
sess.run(init)
```

**Create a file writer (FileWriter) for the summaries**

```python
writer = tf.summary.FileWriter('log/boston', sess.graph)
```

The path passed to tf.summary.FileWriter('/path/to/logs', sess.graph) is the value to give the logdir parameter when running the tensorboard command.

**Iterative training**

```python
loss_list = []
step = 0  # global step counter for the TensorBoard summaries
for epoch in range(train_epochs):
    loss_sum = 0.0
    for xs, ys in zip(x_data, y_data):
        # Feed one sample at a time, reshaped to match the placeholders
        z1 = xs.reshape(1, 12)
        z2 = ys.reshape(1, 1)
        _, loss = sess.run([optimizer, loss_function], feed_dict={x: z1, y: z2})
        summary_str = sess.run(loss_op, feed_dict={x: z1, y: z2})
        writer.add_summary(summary_str, step)  # write the summary so it shows up in SCALARS
        step += 1
        loss_sum = loss_sum + loss
    # Shuffle the samples before the next epoch
    x_data, y_data = shuffle(x_data, y_data)
    print(loss_sum)
    b0temp = b.eval(session=sess)
    w0temp = w.eval(session=sess)
    loss_average = loss_sum / len(y_data)
    loss_list.append(loss_average)
    print("epoch=", epoch + 1, "loss=", loss_average, "b=", b0temp, "w=", w0temp)
```
```
149248.76506266068
epoch= 1 loss= 294.95803372067326 b= 3.94445 w= [[0.9625533]
[1.7546748]
[2.3211656]
[1.1788806]
[2.3749657]
[2.9518034]
[2.4899516]
[2.6423771]
[2.4740942]
[2.3962796]
[2.405994 ]
[1.7411354]]
79301.29809246541
epoch= 2 loss= 156.72193298906208 b= 5.731459 w= [[-0.10316267]
[ 3.2055295 ]
[ 2.7016158 ]
[ 1.8831778 ]
[ 2.8047607 ]
[ 4.819791  ]
[ 3.5140483 ]
[ 4.73509   ]
[ 2.2039707 ]
[ 2.4170697 ]
[ 3.5754337 ]
[ 1.6520783 ]]
59155.64258796818
epoch= 3 loss= 116.90838456120193 b= 6.8220057 w= [[-1.4175   ]
[ 4.436672 ]
[ 2.2779799]
[ 2.4264417]
[ 2.4682086]
[ 6.168085 ]
[ 3.792487 ]
[ 6.3616514]
[ 1.288084 ]
[ 1.6558697]
[ 3.8651628]
[ 0.6330488]]
47703.83335633681
```

....................
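The training loop above updates the parameters one sample at a time, reshuffles the data every epoch, and records the average loss per epoch. The sketch below reproduces that structure in plain Python so it can be run without TensorFlow. It is only an illustration under stated assumptions: it uses plain stochastic gradient descent instead of Adam, shuffles an index list rather than the arrays, and trains on a tiny hypothetical noise-free dataset; the name `train_sgd` and all numbers are made up for the example.

```python
import random


def train_sgd(samples, targets, epochs=200, learning_rate=0.05, seed=0):
    """Fit y = w . x + b by per-sample gradient descent on the squared error."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    loss_history = []
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new order each epoch
        loss_sum = 0.0
        for i in order:
            xs, target = samples[i], targets[i]
            pred = sum(w * x for w, x in zip(weights, xs)) + bias
            error = pred - target
            loss_sum += error ** 2
            # Gradient of error^2 is 2*error*x_j for weight j and 2*error for the bias
            weights = [w - learning_rate * 2.0 * error * x for w, x in zip(weights, xs)]
            bias -= learning_rate * 2.0 * error
        loss_history.append(loss_sum / len(samples))  # average loss for this epoch
    return weights, bias, loss_history


# Hypothetical noise-free data generated from y = 2*x1 - 3*x2 + 1
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.25]]
targets = [2.0 * a - 3.0 * b + 1.0 for a, b in samples]
weights, bias, loss_history = train_sgd(samples, targets)
print(weights, bias, loss_history[-1])
```

On this toy data the average loss should fall toward zero and the learned parameters should approach the generating values, mirroring the decreasing loss printed by the TensorFlow loop.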

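```python
# Note: only the first three of the 12 weights are printed, and the labels in the second
# line come from the three-feature variant; here w0temp[0..2] belong to CRIM, ZN and INDUS.
print("y=", w0temp[0], "x1+", w0temp[1], "x2+", w0temp[2], "x3+", [b0temp])
print("y=", w0temp[0], "CRIM+", w0temp[1], 'DIS+', w0temp[2], "LSTAT+", [b0temp])
```
```
y= [-10.775753] x1+ [4.629923] x2+ [0.36049515] x3+ [30.468655]
y= [-10.775753] CRIM+ [4.629923] DIS+ [0.36049515] LSTAT+ [30.468655]
```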
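The print statements above show only three terms of the fitted equation. One possible way to print every term is to pair each weight with its column name, as in this plain-Python sketch. The helper name `format_equation` and the weights are hypothetical placeholders, not trained values; the column names follow the order of the df.describe() output.

```python
def format_equation(feature_names, weights, bias):
    """Build a readable 'y = w1*name1 + ... + b' string from names, weights and a bias."""
    if len(feature_names) != len(weights):
        raise ValueError("need exactly one weight per feature name")
    terms = ["{:.4f}*{}".format(w, name) for name, w in zip(feature_names, weights)]
    return "y = " + " + ".join(terms) + " + {:.4f}".format(bias)


# First three column names from the summary table; the weights are made-up placeholders
feature_names = ["CRIM", "ZN", "INDUS"]
weights = [-1.5, 0.25, 2.0]
equation = format_equation(feature_names, weights, 30.0)
print(equation)  # y = -1.5000*CRIM + 0.2500*ZN + 2.0000*INDUS + 30.0000
```

With the trained arrays, one would presumably pass all 12 feature names together with `w0temp.flatten().tolist()` and `float(b0temp)`.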
```python
plt.plot(loss_list)
```
`[<matplotlib.lines.Line2D at 0x1567f8cda58>]`

Posted by laurus on Wed, 01 Jun 2022 01:04:34 +0530