Wednesday, February 20, 2019

Using Tensorflow with Keras - Introduction

Tensorflow + Keras


Yeah, I have heard of 'em... Yay!! I am a techie!!


Well, good for you. We are not going to talk in detail about what TensorFlow and Keras are,
although I am sure many of us want at least a crash course! So, as a super quick intro:

TensorFlow: An open-source machine learning framework backed by Google (kind of an SDK for machine learning). They have done the mathematical implementations so you don't have to reinvent the wheel.

Keras: Also an open-source library (kind of an SDK), BUT it is an interface. What it does is further simplify frameworks like TensorFlow so that even people like me can code for machine learning!

For the sake of simplicity, I will stick with imitating the linear regression that we used in our previous example. With this example you will see that using Keras with TensorFlow simplifies our lives so much more!!

Let's take a sample data set.

Nice, huh!!! The data set above contains a randomized sample of values in X and Y columns. The goal of our app is to build the simplest possible neural network that can learn from this data and help us predict Y for a given X.

Now, since we are going to use linear regression, the predictions will always fall on a straight line.
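To make the "straight line" part concrete: with a single feature, linear regression just picks the slope and intercept that make the squared error as small as possible. Here is a minimal sketch in plain Python. The helper name fit_line and the toy numbers are made up for illustration; they are not taken from the data set above.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy points that lie exactly on y = 2x + 1
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

Whatever X you feed in later, the prediction is slope * X + intercept, which is why the result can only ever be a straight line.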

Alright, let's begin! I will try to keep the code in code sections so it's easy for us to go through.

Let's prepare our dataset (i.e. pre-processing of the data)

# We will use pandas to read the CSV file
import pandas as pd

file1 = "../data/input_rand1.csv"

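# In case you have more than one CSV, you can list them all here and they
# will be combined. ignore_index=True gives the combined rows fresh, unique
# index labels, which the train/test split relies on.
all_files = [file1]
dataset = pd.concat((pd.read_csv(f, delimiter=',') for f in all_files), ignore_index=True)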

# We don't need empty values
dataset = dataset.dropna(how="any", axis=0)

# Replace spaces with _ in the column headers
dataset.rename(columns=lambda x: x.replace(' ', '_'), inplace=True)

Let's divide the data into a training set and a testing set

train = dataset.sample(frac=0.8,random_state=200)
test = dataset.drop(train.index)
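If you want to convince yourself that the two lines above really give an 80/20 split with no shared rows, here is a small sketch on a made-up 10-row frame (the toy names and values are just stand-ins for the real CSV). Note that this trick assumes the index labels are unique, since drop removes rows by label.

```python
import pandas as pd

# Hypothetical 10-row frame standing in for the real CSV
toy = pd.DataFrame({"X": range(10), "Y": range(10, 20)})

toy_train = toy.sample(frac=0.8, random_state=200)  # 8 randomly chosen rows
toy_test = toy.drop(toy_train.index)                # the 2 rows left over

# No row label may appear in both sets
overlap = set(toy_train.index) & set(toy_test.index)
print(len(toy_train), len(toy_test), len(overlap))
```

Fixing random_state means the same rows are picked on every run, so results stay comparable between runs.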

Once the data set has been divided, let's identify our features and labels (in this case X -> feature and Y -> label)

X_train = train.drop("Y", axis=1)
Y_train = train["Y"]
# Also for Test set
X_test = test.drop("Y", axis=1)
Y_test = test["Y"]

So now our data is ready, let's get the big shots in the game

from tensorflow import keras
# We would also like to see how the training went in TensorBoard!
from tensorflow.keras.callbacks import TensorBoard
import time

Initialize TensorBoard and give it a location where it can store its log files

NAME = "Linear_{}".format(int(time.time()))
tb = TensorBoard(log_dir='logs/{}'.format(NAME))
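The point of putting a timestamp in NAME is that every training run gets its own sub-folder, so TensorBoard can show runs side by side instead of mixing them up. A tiny sketch of the resulting path (run_name and log_dir are my own variable names):

```python
import os
import time

# Same naming scheme as above: one sub-folder per run, keyed by start time
run_name = "Linear_{}".format(int(time.time()))
log_dir = os.path.join("logs", run_name)
print(log_dir)  # something like logs/Linear_<unix timestamp>
```

Once training has written some logs, running `tensorboard --logdir logs` in a terminal and opening the address it prints should show the loss curves for each run.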

Okay, now comes the most complicated part! We need to build the model...
If you recall, in our previous code we had to create an input_fn and all the other fancy stuff so that we could convert our datasets into tensors and then pass them to the estimator.

Well, in the case of Keras with TensorFlow, you don't need that.

So building the model is simply:

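model = keras.Sequential([
    # Hidden layer: 10 nodes, one input per feature column
    keras.layers.Dense(10, input_shape=[X_train.shape[1]]),
    # Output layer: a single predicted value per row
    keras.layers.Dense(1)
])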

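The above code basically means that you are creating a model with 2 layers. The 1st layer has 10 nodes, and its input shape is the number of feature columns.

And since we want only 1 output for every row of input that we give, the output layer has 1 node. Neither layer has an activation function, so the whole model stays linear, which is exactly what we want when imitating linear regression.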
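Here is a rough sketch of why stacking these two layers still gives a straight line. It is plain Python with made-up random weights, and it assumes a single feature column (just X); it is not how Keras stores or computes anything internally.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

HIDDEN = 10
# Made-up parameters for Dense(10): one weight and one bias per node
w1 = [random.uniform(-1, 1) for _ in range(HIDDEN)]
b1 = [random.uniform(-1, 1) for _ in range(HIDDEN)]
# Made-up parameters for Dense(1): one weight per hidden node, plus one bias
w2 = [random.uniform(-1, 1) for _ in range(HIDDEN)]
b2 = random.uniform(-1, 1)


def two_layer(x):
    """Forward pass through Dense(10) then Dense(1), with no activations."""
    hidden = [w * x + b for w, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


# The very same function, collapsed into a single slope and intercept
slope = sum(a * b for a, b in zip(w2, w1))
intercept = sum(a * b for a, b in zip(w2, b1)) + b2

for x in (-2.0, 0.0, 3.5):
    assert abs(two_layer(x) - (slope * x + intercept)) < 1e-9

# Trainable parameters: (1 * 10 + 10) + (10 * 1 + 1) = 31
n_params = len(w1) + len(b1) + len(w2) + 1
print(n_params)
```

So the 10 hidden nodes do not add any curve-fitting power here; to bend the line you would need a non-linear activation on the first layer.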

model.compile(loss='mse',
              optimizer='adam',
              metrics=['mae', 'mse'])
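In case 'mse' and 'mae' look cryptic: they are mean squared error and mean absolute error. A minimal sketch of both in plain Python (the function names and the toy numbers are mine, purely for illustration):

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Two perfect predictions and one that is off by 2
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
print(mse(y_true, y_pred), mae(y_true, y_pred))
```

Squaring punishes big misses much harder than small ones, which is why mse is used as the loss, while mae is easier to read because it is in the same units as Y.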

For this example we are going to use the Adam optimizer. In case you want more details on Mr. Adam, go here
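For the curious, here is a rough sketch of the textbook Adam update rule on a single number, in plain Python. The function name adam_minimise and the toy problem are my own, and the real Keras implementation differs in its details, so treat this as an illustration of the idea only.

```python
import math


def adam_minimise(grad, x, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7, steps=1):
    """Run `steps` textbook Adam updates on a single number (hypothetical helper)."""
    m = 0.0  # running average of the gradients
    v = 0.0  # running average of the squared gradients
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction: both averages start at 0, so scale them up early on
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def grad(x):
    """Gradient of the toy loss (x - 3) ** 2, which is smallest at x = 3."""
    return 2 * (x - 3)


x_one = adam_minimise(grad, 0.0, lr=0.1, steps=1)
x_many = adam_minimise(grad, 0.0, lr=0.1, steps=500)
print(x_one, x_many)
```

Notice that the very first step has a size of roughly lr no matter how large the gradient is, because the gradient gets divided by its own magnitude. That self-scaling is a big part of why Adam works well without much tuning.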

Now let the training begin!!!!
As mentioned, there is no need to convert the pandas data into anything special; you can pass it as is to Keras.

model.fit(X_train, Y_train, epochs=1000, validation_split = 0.2, callbacks=[tb])
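One thing worth knowing about validation_split=0.2: Keras holds back the last 20% of the rows you pass to fit for validation, so the model actually trains on less than the 80% we sampled earlier. A quick back-of-the-envelope sketch, assuming a hypothetical data set of 100 rows (the real row count is whatever your CSV has):

```python
import math


def split_sizes(total_rows, train_frac=0.8, validation_split=0.2):
    """Rough row counts for each stage of the pipeline (hypothetical helper)."""
    # Rows kept by dataset.sample(frac=train_frac)
    n_train = round(total_rows * train_frac)
    # Rows left over for the test set
    n_test = total_rows - n_train
    # Rows fit() really trains on once the validation share is held back
    n_fit = math.floor(n_train * (1.0 - validation_split))
    n_val = n_train - n_fit
    return n_fit, n_val, n_test


print(split_sizes(100))
```

So out of 100 rows, only 64 would be used to update the weights, 16 to report val_loss after each epoch, and 20 stay untouched for testing.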

Now the only thing remaining is to test it out for predictions.
For this example I am just going to run all the training inputs (i.e. X) through the model again and see what the result looks like (even though I know it should be a straight line somewhere in the middle of the graph, it's still fun!)

input_dict = train
input_dict_x = input_dict.drop("Y", axis=1)
input_dict_y = input_dict["Y"]
predict_results = model.predict(input_dict_x)
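A small gotcha: model.predict gives back a 2-D array with one row per input and one column per output node, so here the shape is (rows, 1). To compare against input_dict_y you will usually want a flat list. A minimal sketch with made-up numbers, where the nested list stands in for what predict_results.tolist() would look like:

```python
def flatten_predictions(pred_rows):
    """Turn [[p0], [p1], ...] into [p0, p1, ...] (hypothetical helper)."""
    return [row[0] for row in pred_rows]


def residuals(y_true, y_pred):
    """Actual minus predicted, one value per row."""
    return [t - p for t, p in zip(y_true, y_pred)]


# Made-up values, not real model output
fake_predictions = [[4.0], [5.0], [6.0]]
fake_actuals = [3.0, 5.0, 8.0]

flat = flatten_predictions(fake_predictions)
errors = residuals(fake_actuals, flat)
print(flat, errors)
```

With the real arrays, predict_results.flatten() does the same job as the first helper in one call.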

Let's check our updated graph!!

Behold The Graph!


As you can see, the graph has a straight line right in the middle, which we expected anyway.

In case you just want to download the code and run it, feel free to swing by my GitHub: https://github.com/abhinavasr/machinelearning  I'll try to share most of my learnings there!

Till next time!!