AUTO ML — HyperOpt Sklearn
Designed for Large Scale Auto ML Model Optimization via Bayesian Optimization
Hi there, I've brought you something new again: another way to automate the ML process.
Let’s get started, shall we?
But before that, let me recap the definition of AutoML.
In brief, Automated Machine Learning (AutoML) refers to automating the whole machine learning process in order to find the best-suited model for prediction.
The method we will use here is HyperOpt-Sklearn. HyperOpt is an open-source library for hyperparameter optimization, and HyperOpt-Sklearn is an extension of it that wraps the popular Scikit-Learn machine learning library.
It performs an automatic search of data preparation methods, machine learning algorithms, and model hyperparameters for both classification and regression tasks.
Now, to get started, let’s first install the libraries.
Install the hyperopt library:
pip install hyperopt
pip show hyperopt
Next, we must install the HyperOpt-Sklearn library.
pip install git+https://github.com/hyperopt/hyperopt-sklearn.git
pip show hpsklearn
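If you would rather confirm the installation from Python than with pip show, a small check like the one below may help. It is only a sketch: the helper name installed_versions is made up for this example, and the distribution names are assumed to match the ones used in the pip commands above (plus scikit-learn).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names assumed from the pip commands in this article.
    names = ["hyperopt", "hpsklearn", "scikit-learn"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver if ver else 'NOT INSTALLED'}")
```

Any entry reported as NOT INSTALLED means the corresponding pip install step needs to be repeated.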
Then we will create a dummy classification dataset using make_classification and split it into training and test sets:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from hpsklearn import HyperoptEstimator
from hpsklearn import any_classifier
from hpsklearn import any_preprocessing
from hyperopt import tpe
# dataset
X, y = make_classification(n_samples=100, n_features=10, n_informative=5, n_redundant=5, random_state=5)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
Now we will define the search using HyperoptEstimator:
model = HyperoptEstimator(classifier=any_classifier('cla'), preprocessing=any_preprocessing('pre'), algo=tpe.suggest, max_evals=50, trial_timeout=30)
# perform the search
model.fit(X_train, y_train)
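classifier=any_classifier('cla') tells the estimator to search over the built-in classifiers; for a regression task, pass regressor=any_regressor('reg') instead. The same goes for preprocessing=any_preprocessing('pre'), which searches for the best option from the built-in list of preprocessing methods; alternatively, you can provide your own list of preprocessing methods. max_evals caps the number of candidate pipelines that are evaluated, and trial_timeout limits how many seconds each evaluation may take.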
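To show how the classification and regression setups differ, here is a minimal sketch of a helper that builds either one. The function name build_search and its task argument are invented for this example; the HyperoptEstimator arguments mirror the ones used above, with regressor=any_regressor('reg') assumed as the regression counterpart of the classifier argument.

```python
def build_search(task="classification", max_evals=50, trial_timeout=30):
    """Return an unfitted HyperoptEstimator for the given task (hypothetical helper)."""
    if task not in ("classification", "regression"):
        raise ValueError(f"unknown task: {task!r}")

    # Imported here so the task check above works even if hpsklearn is missing.
    from hpsklearn import HyperoptEstimator, any_classifier, any_preprocessing, any_regressor
    from hyperopt import tpe

    # Settings shared by both tasks, same values as in the article.
    common = dict(
        preprocessing=any_preprocessing("pre"),
        algo=tpe.suggest,
        max_evals=max_evals,
        trial_timeout=trial_timeout,
    )
    if task == "classification":
        return HyperoptEstimator(classifier=any_classifier("cla"), **common)
    return HyperoptEstimator(regressor=any_regressor("reg"), **common)
```

With this in place, build_search("regression").fit(X_train, y_train) would run the same kind of search on a regression dataset.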
algo selects the search algorithm. Several optimization algorithms are available, including:
- Random Search
- Tree of Parzen Estimators
- Annealing
- Tree
- Gaussian Process Tree
Tree of Parzen Estimators is a good default, however. You can learn more about it in the original paper: https://papers.nips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf
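To make the role of algo more concrete, here is a toy version of the simplest option, Random Search, written in plain Python. It does not use HyperOpt at all: the search space, the toy_loss function, and the parameter names are all invented for illustration, so treat it as a sketch of the idea and not as what the library does internally.

```python
import random


def toy_loss(config):
    """Made-up loss with its minimum at n_neighbors=7, weights='distance'."""
    penalty = 0.0 if config["weights"] == "distance" else 0.05
    return abs(config["n_neighbors"] - 7) / 10 + penalty


def random_search(loss_fn, max_evals, seed=0):
    """Sample configurations at random and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(max_evals):
        # Draw one candidate from the (hypothetical) search space.
        config = {
            "n_neighbors": rng.randint(1, 15),
            "weights": rng.choice(["uniform", "distance"]),
        }
        loss = loss_fn(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


if __name__ == "__main__":
    config, loss = random_search(toy_loss, max_evals=50)
    print(config, round(loss, 3))
```

Random Search samples blindly on every trial. Tree of Parzen Estimators instead uses the losses seen so far to decide where to sample next, which is why it tends to need fewer evaluations.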
Our search is complete, so it is time to check the accuracy of the best model found, along with its parameters.
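A minimal sketch of that check is shown below. The helper name summarize_search is made up for this example; it relies only on the score() and best_model() methods of the fitted estimator, and for a classifier the score is expected to be accuracy.

```python
def summarize_search(model, X_test, y_test):
    """Score a fitted search on held-out data and return its best pipeline."""
    # For a classifier, score() is expected to report accuracy.
    accuracy = model.score(X_test, y_test)
    # best_model() describes the winning pipeline, including its "preprocs" steps.
    best = model.best_model()
    print("Accuracy: %.3f" % accuracy)
    print(best)
    return accuracy, best
```

Calling summarize_search(model, X_test, y_test) after model.fit(...) prints the held-out accuracy followed by the selected model and its preprocessing steps.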
We can see that the best model for our dataset is KNeighbors, and the output also shows the preprocessing steps (“preprocs”) that were applied.
Pretty straightforward. Neat and clean!
What are you waiting for? Try it.
Did you enjoy it? If so, let me know. Do browse my other AutoML articles; I guarantee you will like them too. See you soon with another interesting topic.
Some of my alternative internet presences are Facebook, Instagram, Udemy, Blogger, Issuu, and more.
Also available on Quora @ https://www.quora.com/profile/Bob-Rupak-Roy