keras: Deep Learning in R (Article)
- Today’s tutorial will give you a short introduction to deep learning in R with the keras package:
- Among the CRAN deep learning packages you’ll find: an interface to the FCNN library that allows user-extensible ANNs; an implementation of basic machine learning methods with many layers (deep learning), including dA (Denoising Autoencoder), SdA (Stacked Denoising Autoencoder), RBM (Restricted Boltzmann Machine) and DBN (Deep Belief Nets); a package to streamline the training, fine-tuning and predicting processes for deep learning; and a package that brings flexible and efficient GPU computing and state-of-the-art deep learning to R. Tip: for a comparison of deep learning packages in R, read this blog post.
- You see, getting started with Keras is one of the easiest ways to get familiar with deep learning in Python, and that also explains why the keras and kerasR packages provide an interface to this fantastic package for R users.
- In essence, you won’t find too many differences between the R packages and the original Python package, mostly because the function names are almost all the same. The only differences that you notice are mostly in the programming languages themselves (variable assignment, library loading, …), but the most important thing to notice is how much of the original functionality has been incorporated into the R package.
In this tutorial to deep learning in R with RStudio’s keras package, you’ll learn how to build a Multi-Layer Perceptron (MLP).
As you know by now, machine learning is a subfield of Computer Science (CS). Deep learning, then, is a subfield of machine learning: a set of algorithms inspired by the structure and function of the brain, usually called Artificial Neural Networks (ANNs). Deep learning is one of the hottest trends in machine learning at the moment, and there are many problems where deep learning shines, such as robotics, image recognition and Artificial Intelligence (AI).
Do you want to know more about the original Keras or key concepts in deep learning such as perceptrons and Multi-Layer Perceptrons (MLPs)? Consider taking DataCamp’s Deep Learning in Python course or doing the Keras Tutorial: Deep Learning in Python.
Tip: find our Keras cheat sheet here.
With the rise in popularity of deep learning, CRAN has been enriched with more R deep learning packages. Below you can see an overview of these packages, taken from the Machine Learning and Statistical Learning CRAN task view. The “Percentile” column indicates the percentile as found on RDocumentation:
Both packages provide an R interface to the Python deep learning package Keras, of which you might have already heard or maybe you have even worked with it! For those of you who don’t know what the Keras package has to offer to Python users, it’s “a high-level neural networks API, written in Python and capable of running on top of either TensorFlow, Microsoft Cognitive Toolkit (CNTK) or Theano”.
Now that you have gathered some background, it’s time to get started with Keras in R for real. As you will have read in the introduction of this tutorial, you’ll first go over the setup of your workspace. Then, you’ll load in some data and, after a short data exploration and preprocessing step, you will be able to start constructing your MLP!
Let’s get on with it!
As always, the first step to getting started with any package is to set up your workspace: install and load in the library into RStudio or whichever environment you’re working in.
No worries, for this tutorial, the package will be loaded in for you!
When you have done this, you’re good to go! That’s fast, right?
Tip: for more information on the installation process, check out the package website.
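As a minimal setup sketch: the keras R package is installed from CRAN, and its `install_keras()` helper sets up the underlying Python environment with Keras and TensorFlow.

```r
# Install the keras R package and its Python backend.
install.packages("keras")
library(keras)
install_keras()  # installs Keras + TensorFlow into a Python environment
```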
Besides the built-in datasets that come with the keras package, you can load your own dataset from, for example, CSV files, or you can make some dummy data.
Whichever situation you’re in, you’ll see that you’ll be able to quickly get started with the package. This section will quickly go over the three options and explain how you can load (or create) the data that you need to get started!
For those of you who don’t have the biology knowledge that is needed to work with this data, here’s some background information: all flowers contain a sepal and a petal. The sepal encloses the petals and is typically green and leaf-like, while the petals are typically colored leaves. For the iris flowers, this is just a little bit different, as you can see in the following picture:
Remember that the data that you pass to a keras function needs to be a matrix or array.
Some things to keep in mind about these two data structures that were just mentioned:
- Matrices and arrays don’t have column names;
- Matrices are two-dimensional objects of a single data type;
- Arrays are multi-dimensional objects of a single data type.
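A short base-R sketch of what such a conversion looks like: the iris data frame is turned into a matrix and its column names are dropped, as keras expects.

```r
# Convert the iris data frame to the matrix form that keras expects.
data(iris)
iris_mat <- as.matrix(iris[, 1:4])  # the four numeric measurements
dimnames(iris_mat) <- NULL          # drop the column names
```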
Tip: Check out this video if you want a recap of the data structures in R!
Remember that you loaded the data as a data frame in the previous section. Knowing this and taking into account that you’ll need to work towards a two- or multi-dimensional object of a single data type, you should already prepare to do some preprocessing before you start building your neural network!
You can use a conversion function to convert the names of the species, that is, “setosa”, “versicolor” and “virginica”, to the numeric values 1, 2, and 3.
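One way to sketch this in base R: since the species column is a factor, `as.numeric()` maps its levels to 1, 2 and 3, and a one-hot encoding can then be built by row-indexing an identity matrix (the keras package also offers a `to_categorical()` helper for this step).

```r
# Map the factor levels "setosa", "versicolor", "virginica" to 1, 2, 3.
data(iris)
iris_y <- as.numeric(iris$Species)

# Manual one-hot encoding: row i of diag(3) is the one-hot vector for class i.
iris_onehot <- diag(3)[iris_y, ]
```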
Now take a closer look at the result of the plotting function:
You can also use the cor() function, which will give you the overall correlation between all attributes that are included in the data set:
You can pass an additional argument to indicate how you want the data to be plotted!
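A small sketch of the correlation step with base R’s cor() function, computed over the four numeric iris attributes:

```r
# Pairwise correlations between the four numeric iris attributes.
data(iris)
corr_matrix <- cor(iris[, 1:4])
print(round(corr_matrix, 2))  # 4x4 symmetric matrix with 1s on the diagonal
```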
You can further experiment with the visualization method in the DataCamp Light chunk below:
Make use of the R console to explore your data further.
To learn more about the ggvis package, which implements the interactive grammar of graphics, take a look at DataCamp’s Machine Learning in R For Beginners tutorial or take DataCamp’s ggvis course.
Before you can build your model, you also need to make sure that your data is cleaned, normalized (if applicable) and divided into training and test sets. Since the dataset comes from the UCI Machine Learning Repository, you can expect it to already be somewhat clean, but let’s double check the quality of your data anyway.
Take a quick look at your data again to briefly recap what you learned when you checked whether the import of your data was successful:
Now that you’re sure that the data is clean enough, you can start by checking if the normalization is necessary for any of the data with which you’re working for this tutorial.
If needed, you can normalize the data with the built-in keras normalize() function. Then, you’re ready to start modeling.
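As an illustration of the idea, here is a base-R min-max normalization sketch that scales each column to the [0, 1] range; note that this is an alternative to, not the same as, the keras normalize() helper.

```r
# Min-max normalization: scale each column to the [0, 1] range.
data(iris)
min_max <- function(x) (x - min(x)) / (max(x) - min(x))
iris_norm <- apply(iris[, 1:4], 2, min_max)
summary(iris_norm)
```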
A type of network that performs well on such a problem is a multi-layer perceptron. This type of neural network is often fully connected. That means that you’re looking to build a fairly simple stack of fully-connected layers to solve this problem. As for the activation functions, it’s best to use one of the most common ones for the purpose of getting familiar with Keras and neural networks: the relu activation function. This rectifier activation function is used in a hidden layer, which is generally speaking a good practice.
In addition, you also see that the softmax activation function is used in the output layer. You do this because you want to make sure that the output values are between 0 and 1 and can be used as predicted probabilities:
Note that the first layer takes an input_shape of 4 because the training data has 4 columns.
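A sketch of what such a stack of fully-connected layers could look like with the keras package (the hidden layer size of 8 is an illustrative choice, not prescribed by the tutorial):

```r
library(keras)

# A minimal MLP: one relu hidden layer, and a softmax output layer with
# 3 units, one per iris species. input_shape is 4 because the training
# data has 4 columns.
model <- keras_model_sequential()
model %>%
  layer_dense(units = 8, activation = "relu", input_shape = c(4)) %>%
  layer_dense(units = 3, activation = "softmax")
```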
You can further inspect your model with the following functions:
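A sketch of the usual inspection helpers, assuming a model built as above:

```r
summary(model)               # layer-by-layer overview with parameter counts
get_config(model)            # the model's configuration
get_layer(model, index = 1)  # inspect a single layer
model$layers                 # list of the model's layers
model$inputs                 # input tensors
model$outputs                # output tensors
```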
To monitor the accuracy during the training, you pass 'accuracy' to the metrics argument.
The optimizer and the loss are two arguments that are required if you want to compile the model.
Some of the most popular optimization algorithms are Stochastic Gradient Descent (SGD), ADAM and RMSprop. Depending on which algorithm you choose, you’ll need to tune certain parameters, such as the learning rate or momentum. The choice of loss function depends on the task that you have at hand: for example, for a regression problem, you’ll usually use the Mean Squared Error (MSE).
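Putting those two required arguments together, a compile step could be sketched like this (adam is one common optimizer choice; categorical crossentropy suits one-hot encoded targets):

```r
# Compile the model for multi-class classification.
model %>% compile(
  loss      = "categorical_crossentropy",
  optimizer = "adam",
  metrics   = "accuracy"
)
```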
You train the model for a number of epochs or iterations over all the training samples, in batches of 5 samples.
By setting the verbose argument to 1, you indicate that you want to see progress bar logging.
What you do with the code above is train the model for a specified number of epochs or exposures to the training dataset. An epoch is a single pass through the entire training set, followed by testing on the verification set. The batch size that you specify in the code above defines the number of samples that are going to be propagated through the network. By doing this, you also optimize efficiency, because you make sure that you don’t load too many input patterns into memory at the same time.
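A fit call sketch, assuming hypothetical training matrices x_train (features) and y_train (one-hot labels); the batch size of 5 matches the text, while the epoch count and validation split are illustrative choices:

```r
# Fit the model; x_train and y_train are hypothetical training data.
history <- model %>% fit(
  x_train, y_train,
  epochs = 200,            # illustrative choice
  batch_size = 5,
  validation_split = 0.2,  # hold out part of the training data
  verbose = 1              # show progress bar logging
)
```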
You can also visualize the training history with the plot() function, like you see in this particular code chunk!
Make sure to study the plot in more detail.
At first sight, it’s no surprise that this all looks a tad messy. You might not entirely know what you’re looking at, right?
The validation entries are the same metrics, loss and accuracy, computed on the test or validation data.
Alternatively, you can use the $ operator to access the data and plot it step by step.
Check out the DataCamp Light box below to see how you can do this:
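A step-by-step plotting sketch using the $ operator on the fit history (this assumes the `history` object returned by fit() earlier):

```r
# Plot training and validation loss per epoch from the fit history.
loss     <- history$metrics$loss
val_loss <- history$metrics$val_loss
plot(loss, type = "l", col = "blue", xlab = "epoch", ylab = "loss",
     ylim = range(c(loss, val_loss)))
lines(val_loss, col = "green")
legend("topright", legend = c("training", "validation"),
       col = c("blue", "green"), lty = 1)
```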
In this first plot, you plotted the loss of the model on the training and test data. Now it’s time to do the same, but this time for the accuracy of the model:
Some things to keep in mind here are the following:
What do you think of the results? At first sight, does this model that you have created make the right predictions?
You can make predictions for the test data with the predict() function, like in the code example below:
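A prediction and evaluation sketch, assuming hypothetical test matrices x_test and y_test:

```r
# Predict class probabilities and evaluate on hypothetical test data.
probs   <- model %>% predict(x_test)
classes <- max.col(probs)                 # index of the largest probability
score   <- model %>% evaluate(x_test, y_test)
print(score)                              # loss and accuracy
```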
Fine-tuning your model is probably something that you’ll be doing a lot, especially in the beginning, because not all classification and regression problems are as straightforward as the one that you saw in the first part of this tutorial. As you read above, there are already two key decisions that you’ll probably want to adjust: how many layers you’re going to use and how many “hidden units” you will choose for each layer.
In the beginning, this will really be quite a journey.
Lastly, you can also tweak the optimization parameters that you pass to the compile() function. This section will go over these three options.
There’s one more thing to cover with the keras package, and that is saving or exporting your model so that you can load it back in at another moment.
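A sketch of saving a trained model to an HDF5 file and restoring it later (the file name is an illustrative choice):

```r
# Persist the trained model to disk, then load it back in.
save_model_hdf5(model, "iris_mlp.h5")
model_reloaded <- load_model_hdf5("iris_mlp.h5")
```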