OptCV
- class OptCV(estimator, optimizer, *, scoring: Callable | str | None = None, refit: bool = True, cv=None)
Tuning an sklearn estimator via any optimizer in the hyperactive toolbox.
OptCV uses any available tuning engine from hyperactive to tune an sklearn estimator via cross-validation. It passes cross-validation results as scores to the tuning engine, which identifies the best hyperparameters.
Any available tuning engine from hyperactive can be used, for example:
- grid search: from hyperactive.opt import GridSearchSk as GridSearch (this results in the same algorithm as GridSearchCV)
- hill climbing: from hyperactive.opt import HillClimbing
- optuna parzen-tree search: from hyperactive.opt.optuna import TPEOptimizer
Configuration of the tuning engine is as per the respective documentation.
Formally, OptCV does the following:
In fit:
- Wraps the estimator, scoring, and other parameters into a SklearnCvExperiment instance, which is passed to optimizer as the experiment argument.
- Optimal parameters are then obtained from optimizer.solve, and set as the best_params_ and best_estimator_ attributes.
- If refit=True, best_estimator_ is fitted to the entire X and y.
In predict and predict-like methods, calls the respective method of the best_estimator_ if refit=True.
- Parameters:
- estimator: sklearn BaseEstimator
The estimator to be tuned.
- optimizer: hyperactive BaseOptimizer
The optimizer to be used for hyperparameter search.
- scoring: callable or str, default = accuracy_score or mean_squared_error
sklearn scoring function or metric to evaluate the model’s performance. Default is determined by the type of estimator: accuracy_score for classifiers, and mean_squared_error for regressors, as per sklearn convention through the default score method of the estimator.
- refit: bool, optional, default = True
Whether to refit the best estimator with the entire dataset. If True, the best estimator is refit with the entire dataset after the optimization process. If False, does not refit, and predict is not available.
- cv: int or cross-validation generator, default = KFold(n_splits=3, shuffle=True)
The number of folds or cross-validation strategy to be used. If int, the cross-validation used is KFold(n_splits=cv, shuffle=True).
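As a rough illustration of the cv rule described above, the sketch below resolves an int or None into a KFold splitter and uses it with scikit-learn's cross_val_score. The helper resolve_cv is hypothetical and written only for this example; it is not part of hyperactive, and the SVC estimator and iris data are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC


def resolve_cv(cv=None):
    """Hypothetical helper mirroring the documented cv rule (not hyperactive code)."""
    if cv is None:
        # documented default
        return KFold(n_splits=3, shuffle=True)
    if isinstance(cv, int):
        # an int is read as the number of shuffled folds
        return KFold(n_splits=cv, shuffle=True)
    # anything else is assumed to already be a cross-validation generator
    return cv


X, y = load_iris(return_X_y=True)
splitter = resolve_cv(5)
scores = cross_val_score(SVC(), X, y, cv=splitter)  # one score per fold
```

Because shuffle=True is used without a fixed random_state, fold assignments (and therefore scores) can differ between runs; passing an explicit KFold with a random_state is one way to make them reproducible.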
- Attributes:
classes_: Classes function.
fit_successful: Fit Successful function.
Methods
decision_function(X): Call decision_function on the estimator with the best found parameters.
fit(X, y, **fit_params): Fit the model.
get_metadata_routing(): Get metadata routing of this object.
get_params([deep]): Get parameters for this estimator.
inverse_transform([X, Xt]): Call inverse_transform on the estimator with the best found params.
predict(X): Call predict on the estimator with the best found parameters.
predict_log_proba(X): Call predict_log_proba on the estimator with the best found parameters.
predict_proba(X): Call predict_proba on the estimator with the best found parameters.
score(X[, y]): Return the score on the given data, if the estimator has been refit.
score_samples(X): Call score_samples on the estimator with the best found parameters.
set_params(**params): Set the parameters of this estimator.
transform(X): Call transform on the estimator with the best found parameters.
Mark fit successful and preserve signature.
- fit(X, y, **fit_params)
Fit the model.
- Parameters:
- X: {array-like, sparse matrix} of shape (n_samples, n_features)
Training data.
- y: array-like of shape (n_samples,) or (n_samples, n_targets)
Target values. Will be cast to X’s dtype if necessary.
- Returns:
- self: object
Fitted Estimator.
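The fit procedure described for the class (score each candidate by cross-validation, keep the best parameters, then optionally refit on all data) can be sketched with plain scikit-learn. This is only a conceptual stand-in under stated assumptions: an exhaustive loop over a small grid plays the role of the tuning engine, and nothing here uses SklearnCvExperiment or optimizer.solve. The names param_grid, cv_score, and candidates are made up for the example.

```python
from itertools import product

import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
estimator = SVC()
param_grid = {"kernel": ["linear", "rbf"], "C": [1, 10]}  # illustrative search space
cv = KFold(n_splits=3, shuffle=True, random_state=0)


def cv_score(params):
    """Mean cross-validation score of the estimator under the given parameters."""
    candidate = clone(estimator).set_params(**params)
    return float(np.mean(cross_val_score(candidate, X, y, cv=cv)))


# Stand-in for the tuning engine: enumerate every grid point and keep the best.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]
best_params = max(candidates, key=cv_score)

# Equivalent of refit=True: fit a fresh copy with the best parameters on all data.
best_estimator = clone(estimator).set_params(**best_params).fit(X, y)
```

Other tuning engines would replace the exhaustive loop with their own search strategy while scoring candidates the same way.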
- score(X, y=None, **params)
Return the score on the given data, if the estimator has been refit.
This uses the score defined by scoring where provided, and the best_estimator_.score method otherwise.
- Parameters:
- X: array-like of shape (n_samples, n_features)
Input data, where n_samples is the number of samples and n_features is the number of features.
- y: array-like of shape (n_samples, n_output) or (n_samples,), default=None
Target relative to X for classification or regression; None for unsupervised learning.
- **params: dict
Parameters to be passed to the underlying scorer(s).
- Returns:
- score: float
The score defined by scoring if provided, and the best_estimator_.score method otherwise.
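The fallback rule above (use scoring if given, otherwise the estimator's own score method) can be illustrated with scikit-learn alone. The function score_like is a hypothetical stand-in, not OptCV's implementation, and it assumes that a callable scoring follows scikit-learn's scorer signature scorer(estimator, X, y). A plain fitted SVC plays the role of best_estimator_.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import get_scorer
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
best_estimator = SVC().fit(X, y)  # stand-in for a refitted best_estimator_


def score_like(estimator, X, y, scoring=None):
    """Hypothetical sketch of the documented rule: scoring if given, else estimator.score."""
    if scoring is None:
        return estimator.score(X, y)
    # strings are resolved to scikit-learn scorers; callables are assumed to be scorers
    scorer = get_scorer(scoring) if isinstance(scoring, str) else scoring
    return scorer(estimator, X, y)


default_score = score_like(best_estimator, X, y)                   # classifier default: accuracy
accuracy = score_like(best_estimator, X, y, scoring="accuracy")    # explicit scorer, same metric
f1_macro = score_like(best_estimator, X, y, scoring="f1_macro")    # a different metric
```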
- decision_function(X)
Call decision_function on the estimator with the best found parameters.
Only available if refit=True and the underlying estimator supports decision_function.
- Parameters:
- X: indexable, length n_samples
Must fulfill the input assumptions of the underlying estimator.
- Returns:
- y_score: ndarray of shape (n_samples,) or (n_samples, n_classes) or (n_samples, n_classes * (n_classes-1) / 2)
Result of the decision function for X based on the estimator with the best found parameters.
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing: MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep: bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params: dict
Parameter names mapped to their values.
- inverse_transform(X=None, Xt=None)
Call inverse_transform on the estimator with the best found params.
Only available if the underlying estimator implements inverse_transform and refit=True.
- Parameters:
- X: indexable, length n_samples
Data in the transformed space. Must fulfill the input assumptions of the underlying estimator.
- Xt: array-like of shape (n_samples, n_features), optional
The former parameter name for the transformed data. Deprecated in scikit-learn 1.2 and removed in 1.7. Use X instead.
- Returns:
- X_original: {ndarray, sparse matrix} of shape (n_samples, n_features)
Result of the inverse_transform function for X based on the estimator with the best found parameters.
- predict(X)
Call predict on the estimator with the best found parameters.
Only available if refit=True and the underlying estimator supports predict.
- Parameters:
- X: indexable, length n_samples
Must fulfill the input assumptions of the underlying estimator.
- Returns:
- y_pred: ndarray of shape (n_samples,)
The predicted labels or values for X based on the estimator with the best found parameters.
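The delegation rule for predict (forward to best_estimator_, but only when refit=True) can be sketched as a tiny wrapper. RefitGatedWrapper is a hypothetical class written for this illustration; it performs no tuning at all, and raising AttributeError when refit=False is an assumption of the sketch rather than a statement about the error OptCV actually raises.

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.svm import SVC


class RefitGatedWrapper:
    """Hypothetical stand-in showing predict delegation gated on refit."""

    def __init__(self, estimator, refit=True):
        self.estimator = estimator
        self.refit = refit

    def fit(self, X, y):
        # A real tuner would search for parameters here; this sketch skips that step.
        if self.refit:
            self.best_estimator_ = clone(self.estimator).fit(X, y)
        return self

    def predict(self, X):
        if not self.refit:
            # assumption: unavailable methods surface as AttributeError
            raise AttributeError("predict is only available if refit=True")
        return self.best_estimator_.predict(X)


X, y = load_iris(return_X_y=True)

with_refit = RefitGatedWrapper(SVC(), refit=True).fit(X, y)
labels = with_refit.predict(X)  # delegated to the fitted copy

without_refit = RefitGatedWrapper(SVC(), refit=False).fit(X, y)
try:
    without_refit.predict(X)
    predict_available = True
except AttributeError:
    predict_available = False
```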
- predict_log_proba(X)
Call predict_log_proba on the estimator with the best found parameters.
Only available if refit=True and the underlying estimator supports predict_log_proba.
- Parameters:
- X: indexable, length n_samples
Must fulfill the input assumptions of the underlying estimator.
- Returns:
- y_pred: ndarray of shape (n_samples,) or (n_samples, n_classes)
Predicted class log-probabilities for X based on the estimator with the best found parameters. The order of the classes corresponds to that in the fitted attribute classes_.
- predict_proba(X)
Call predict_proba on the estimator with the best found parameters.
Only available if refit=True and the underlying estimator supports predict_proba.
- Parameters:
- X: indexable, length n_samples
Must fulfill the input assumptions of the underlying estimator.
- Returns:
- y_pred: ndarray of shape (n_samples,) or (n_samples, n_classes)
Predicted class probabilities for X based on the estimator with the best found parameters. The order of the classes corresponds to that in the fitted attribute classes_.
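To make the column ordering concrete, the sketch below uses a plain scikit-learn classifier as a stand-in for best_estimator_ and maps probability columns back to labels through classes_. The choice of LogisticRegression and the iris data are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)  # stand-in for best_estimator_

proba = clf.predict_proba(X)  # shape (n_samples, n_classes)

# Column j of proba belongs to clf.classes_[j], so the most probable
# label for each sample is recovered by indexing classes_ with the argmax.
top_class = clf.classes_[np.argmax(proba, axis=1)]
```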
- score_samples(X)
Call score_samples on the estimator with the best found parameters.
Only available if refit=True and the underlying estimator supports score_samples.
- Parameters:
- X: iterable
Data to predict on. Must fulfill input requirements of the underlying estimator.
- Returns:
- y_score: ndarray of shape (n_samples,)
Score per sample for X based on the estimator with the best found parameters (e.g. log-likelihood, anomaly score).
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params: dict
Estimator parameters.
- Returns:
- self: estimator instance
Estimator instance.
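The <component>__<parameter> naming can be seen on an ordinary scikit-learn Pipeline. The step names "scale" and "svc" below are arbitrary labels chosen for this example.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# "svc__C" addresses parameter C of the step named "svc".
returned = pipe.set_params(svc__C=10, scale__with_mean=False)

svc_c = pipe.named_steps["svc"].C
scale_with_mean = pipe.named_steps["scale"].with_mean
```

set_params returns the estimator itself, so calls can be chained with fit.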
- transform(X)
Call transform on the estimator with the best found parameters.
Only available if the underlying estimator supports transform and refit=True.
- Parameters:
- X: indexable, length n_samples
Must fulfill the input assumptions of the underlying estimator.
- Returns:
- Xt: {ndarray, sparse matrix} of shape (n_samples, n_features)
X transformed in the new space based on the estimator with the best found parameters.