Which machine learning algorithm should I use?


By Hui Li, Principal Staff Scientist, Data Science, at SAS.

A typical question asked by a beginner, when facing a wide variety of machine learning algorithms, is “which algorithm should I use?” The answer depends on many factors.

Even an experienced data scientist cannot tell which algorithm will perform best before trying several of them. We are not advocating a one-and-done approach, but we do hope to provide some guidance on which algorithms to try first, depending on some clear factors.
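As a rough illustration of what "trying different algorithms" can look like, the sketch below scores two deliberately simple classifiers with k-fold cross-validation and ranks them by mean accuracy. The synthetic data, the two toy classifiers, and every function name here are assumptions made for illustration; none of them come from the article. In practice, a library such as scikit-learn would normally supply the models and the cross-validation loop.

```python
import random
from statistics import mean


def make_data(n=200, seed=0):
    """Synthetic two-class data: two Gaussian blobs in 2-D (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(train):
    """Fit: one mean point per class. Predict: class of the closest mean."""
    centroids = {}
    for label in {y for _, y in train}:
        points = [x for x, y in train if y == label]
        centroids[label] = tuple(mean(col) for col in zip(*points))
    return lambda x: min(centroids, key=lambda c: sq_dist(x, centroids[c]))


def one_nearest_neighbor(train):
    """Fit: memorize the training set. Predict: label of the closest point."""
    return lambda x: min(train, key=lambda p: sq_dist(x, p[0]))[1]


def cross_val_accuracy(fit, data, k=5):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [p for j, p in enumerate(data) if j % k != i]
        predict = fit(train)
        correct = sum(1 for x, y in test if predict(x) == y)
        scores.append(correct / len(test))
    return mean(scores)


def compare(candidates, data, k=5):
    """Score every candidate on the same folds so the results are comparable."""
    return {name: cross_val_accuracy(fit, data, k) for name, fit in candidates.items()}


data = make_data()
results = compare(
    {"nearest centroid": nearest_centroid, "1-NN": one_nearest_neighbor},
    data,
)
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Scoring every candidate on the same held-out folds is what makes the comparison fair. The ranking it produces is specific to this data set, which is why the algorithms have to be tried rather than picked in advance.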

The machine learning algorithm cheat sheet


The full article describes when to use each of the algorithms on the cheat sheet.


Thanks to Dr. Li for attempting this. Obviously, it’s a complex topic. I would be interested in her thoughts on how often the choice of algorithm matters. Which is more productive: improving features (variables), tuning Algorithm X, or trying Algorithm Y?

I also noticed what appeared to be several bugs in the text. For example:

“When most dependent variables are numeric, logistic regression and SVM should be the first try for classification. These models are easy to implement, their parameters easy to tune, and the performances are also pretty good. So these models are appropriate for beginners.”

I think this should say “when most independent variables…” With continuous dependent variables, you cannot even use logistic regression. (I was unable to put this comment on the source page.) 

My manager is a data steward who needs to understand the algorithm in order to better understand the data. He needs this to develop an intuition for whether the data is right.

Unlike before, we cannot treat the algorithms that process data as black boxes. The moment we do, plans go wrong, bad decisions are made, and we spend a lot of time chasing and investing in phantoms.
