OptCV

class OptCV(estimator, optimizer, *, scoring: Callable | str | None = None, refit: bool = True, cv=None)

Tune an sklearn estimator with any optimizer from the hyperactive toolbox.

OptCV uses any available tuning engine from hyperactive to tune an sklearn estimator via cross-validation.

It passes cross-validation results as scores to the tuning engine, which identifies the best hyperparameters.

Any available tuning engine from hyperactive can be used, for example:

grid search - from hyperactive.opt import GridSearchSk as GridSearch; this results in the same algorithm as sklearn's GridSearchCV
hill climbing - from hyperactive.opt import HillClimbing
Optuna tree-structured Parzen estimator (TPE) search - from hyperactive.opt.optuna import TPEOptimizer

Configuration of the tuning engine is as per the respective documentation.

Formally, OptCV does the following:

In fit:

Wraps the estimator, scoring, and other parameters into a SklearnCvExperiment instance, which is passed to optimizer as the experiment argument.
Obtains the optimal parameters from optimizer.solve and stores them as best_params_; best_estimator_ is the estimator configured with these parameters.
If refit=True, best_estimator_ is fitted to the entire X and y.

In predict and predict-like methods, calls the respective method of the best_estimator_ if refit=True.
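
The steps above can be illustrated with plain scikit-learn. This is a sketch of the same logic and not hyperactive's implementation: the candidate list and the cv_score helper are hypothetical stand-ins for the optimizer and for SklearnCvExperiment.

```python
from itertools import product

import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
estimator = SVC()
cv = KFold(n_splits=3, shuffle=True, random_state=0)

# Hypothetical candidate list; a real optimizer proposes candidates itself.
grid = {"kernel": ["linear", "rbf"], "C": [1, 10]}
candidates = [dict(zip(grid, values)) for values in product(*grid.values())]


def cv_score(params):
    """Mean cross-validation score of the estimator with the given parameters."""
    model = clone(estimator).set_params(**params)
    return float(np.mean(cross_val_score(model, X, y, cv=cv)))


# "solve" step: keep the candidate with the best cross-validation score.
best_params = max(candidates, key=cv_score)

# refit=True step: fit the best configuration on the entire X and y.
best_estimator = clone(estimator).set_params(**best_params).fit(X, y)

# predict-like methods are forwarded to the refitted estimator.
y_pred = best_estimator.predict(X)
```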

Parameters:

estimator: sklearn BaseEstimator: The estimator to be tuned.
optimizer: hyperactive BaseOptimizer: The optimizer to be used for hyperparameter search.
scoring: callable or str, default = accuracy_score or mean_squared_error: sklearn scoring function or metric to evaluate the model’s performance. Default is determined by the type of estimator: accuracy_score for classifiers, and mean_squared_error for regressors, as per sklearn convention through the default score method of the estimator.
refit: bool, optional, default = True: Whether to refit the best estimator with the entire dataset. If True, the best estimator is refit with the entire dataset after the optimization process. If False, does not refit, and predict is not available.
cv: int or cross-validation generator, default = KFold(n_splits=3, shuffle=True): The number of folds or cross-validation strategy to be used. If int, the cross-validation used is KFold(n_splits=cv, shuffle=True).
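
To make the integer form of cv concrete, the sketch below builds the splitter that cv=5 corresponds to according to the description above. It uses scikit-learn only; random_state is added purely so the sketch is reproducible and is not part of the documented default.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)

# Per the description above, cv=5 corresponds to KFold(n_splits=5, shuffle=True).
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Size of each held-out fold.
test_fold_sizes = [len(test_idx) for _, test_idx in cv.split(X)]

# Every sample appears in exactly one held-out fold.
all_test_idx = sorted(int(i) for _, test_idx in cv.split(X) for i in test_idx)
```

Passing a splitter object instead of an int gives control over shuffling, seeding, or stratification.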

Attributes:

classes_: Classes function.
fit_successful: Fit Successful function.

Methods

`decision_function`(X)	Call decision_function on the estimator with the best found parameters.
`fit`(X, y, **fit_params)	Fit the model.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`inverse_transform`([X, Xt])	Call inverse_transform on the estimator with the best found params.
`predict`(X)	Call predict on the estimator with the best found parameters.
`predict_log_proba`(X)	Call predict_log_proba on the estimator with the best found parameters.
`predict_proba`(X)	Call predict_proba on the estimator with the best found parameters.
`score`(X[, y])	Return the score on the given data, if the estimator has been refit.
`score_samples`(X)	Call score_samples on the estimator with the best found parameters.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X)	Call transform on the estimator with the best found parameters.
`verify_fit`()	Mark fit successful and preserve signature.

fit(X, y, **fit_params)

Fit the model.

Parameters:

X: {array-like, sparse matrix} of shape (n_samples, n_features): Training data.
y: array-like of shape (n_samples,) or (n_samples, n_targets): Target values. Will be cast to X’s dtype if necessary.

Returns:

self: object: Fitted Estimator.

score(X, y=None, **params)

Return the score on the given data, if the estimator has been refit.

This uses the score defined by scoring where provided, and the best_estimator_.score method otherwise.

Parameters:

X: array-like of shape (n_samples, n_features): Input data, where n_samples is the number of samples and n_features is the number of features.
y: array-like of shape (n_samples, n_output) or (n_samples,), default=None: Target relative to X for classification or regression; None for unsupervised learning.
**params: dict: Parameters to be passed to the underlying scorer(s).

Returns:

score: float: The score defined by scoring if provided, and the best_estimator_.score method otherwise.
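
The fallback rule described here matches what scikit-learn's check_scoring does, so it can be sketched without hyperactive. The score_sketch helper is hypothetical and only illustrates the rule; a fitted LogisticRegression stands in for best_estimator_.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import check_scoring

X, y = load_iris(return_X_y=True)
best_estimator = LogisticRegression(max_iter=1000).fit(X, y)


def score_sketch(estimator, X, y, scoring=None):
    """Hypothetical helper: use `scoring` if given, else estimator.score."""
    scorer = check_scoring(estimator, scoring=scoring)
    return scorer(estimator, X, y)


# No scoring given: falls back to the estimator's own score method.
default_score = score_sketch(best_estimator, X, y)

# Scoring given by name: the named scorer is used instead.
macro_f1 = score_sketch(best_estimator, X, y, scoring="f1_macro")
```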

property fit_successful: Fit Successful function.

property classes_: Classes function.

decision_function(X)

Call decision_function on the estimator with the best found parameters.

Only available if refit=True and the underlying estimator supports decision_function.

Parameters:

X: indexable, length n_samples: Must fulfill the input assumptions of the underlying estimator.

Returns:

y_score: ndarray of shape (n_samples,) or (n_samples, n_classes) or (n_samples, n_classes * (n_classes-1) / 2): Result of the decision function for X based on the estimator with the best found parameters.
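
Which of the three shapes is returned depends on the underlying estimator. The sketch below shows all three with a plain scikit-learn SVC standing in for best_estimator_; the synthetic four-class dataset is an assumption chosen so that the one-vs-one and one-vs-rest shapes differ.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data with n_classes = 4.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=6, n_classes=4, random_state=0
)

# One-vs-rest: one column per class, (n_samples, n_classes).
ovr_shape = SVC(decision_function_shape="ovr").fit(X, y).decision_function(X).shape

# One-vs-one: one column per class pair, (n_samples, n_classes * (n_classes-1) / 2).
ovo_shape = SVC(decision_function_shape="ovo").fit(X, y).decision_function(X).shape

# Binary problem: a single score per sample, (n_samples,).
y_binary = (y == 0).astype(int)
binary_shape = SVC().fit(X, y_binary).decision_function(X).shape
```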

get_metadata_routing()

Get metadata routing of this object.

See the scikit-learn User Guide on metadata routing for how the routing mechanism works.

Returns:

routing: MetadataRequest: A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep: bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params: dict: Parameter names mapped to their values.

inverse_transform(X=None, Xt=None)

Call inverse_transform on the estimator with the best found params.

Only available if the underlying estimator implements inverse_transform and refit=True.

Parameters:

X: indexable, length n_samples: Data in the transformed space. Must fulfill the input assumptions of the underlying estimator.
Xt: array-like of shape (n_samples, n_features), optional: The former parameter name for the transformed data. Deprecated in scikit-learn 1.2 and removed in 1.7. Use X instead.

Returns:

X_original: {ndarray, sparse matrix} of shape (n_samples, n_features): Result of the inverse_transform function for X based on the estimator with the best found parameters.

predict(X)

Call predict on the estimator with the best found parameters.

Only available if refit=True and the underlying estimator supports predict.

Parameters:

X: indexable, length n_samples: Must fulfill the input assumptions of the underlying estimator.

Returns:

y_pred: ndarray of shape (n_samples,): The predicted labels or values for X based on the estimator with the best found parameters.

predict_log_proba(X)

Call predict_log_proba on the estimator with the best found parameters.

Only available if refit=True and the underlying estimator supports predict_log_proba.

Parameters:

X: indexable, length n_samples: Must fulfill the input assumptions of the underlying estimator.

Returns:

y_pred: ndarray of shape (n_samples,) or (n_samples, n_classes): Predicted class log-probabilities for X based on the estimator with the best found parameters. The order of the classes corresponds to that in the fitted attribute classes_.

predict_proba(X)

Call predict_proba on the estimator with the best found parameters.

Only available if refit=True and the underlying estimator supports predict_proba.

Parameters:

X: indexable, length n_samples: Must fulfill the input assumptions of the underlying estimator.

Returns:

y_pred: ndarray of shape (n_samples,) or (n_samples, n_classes): Predicted class probabilities for X based on the estimator with the best found parameters. The order of the classes corresponds to that in the fitted attribute classes_.
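
The column ordering can be checked directly on a fitted classifier. In this sketch a plain scikit-learn LogisticRegression stands in for best_estimator_, and the string labels are introduced only to make the ordering visible.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
# String labels make the column-to-class mapping easy to read.
labels = np.array(["setosa", "versicolor", "virginica"])[y]

clf = LogisticRegression(max_iter=1000).fit(X, labels)
proba = clf.predict_proba(X)

# Column j of proba holds the probability of class clf.classes_[j],
# so the most likely class is recovered by indexing classes_ with argmax.
most_likely = clf.classes_[np.argmax(proba, axis=1)]
```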

score_samples(X)

Call score_samples on the estimator with the best found parameters.

Only available if refit=True and the underlying estimator supports score_samples.

Parameters:

X: iterable: Data to predict on. Must fulfill input requirements of the underlying estimator.

Returns:

y_score: ndarray of shape (n_samples,): Score per sample for X based on the estimator with the best found parameters (e.g. log-likelihood, anomaly score).

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params: dict: Estimator parameters.

Returns:

self: estimator instance: Estimator instance.
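
The <component>__<parameter> convention is easiest to see on a scikit-learn Pipeline. The sketch below uses scikit-learn only; the step names scaler and svc are arbitrary choices made for illustration.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Update parameters of nested components in one call.
pipe.set_params(svc__C=10, scaler__with_mean=False)

# deep=True also lists the parameters of the contained estimators.
deep_params = pipe.get_params(deep=True)

# deep=False lists only the pipeline's own parameters.
shallow_params = pipe.get_params(deep=False)
```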

transform(X)

Call transform on the estimator with the best found parameters.

Only available if the underlying estimator supports transform and refit=True.

Parameters:

X: indexable, length n_samples: Must fulfill the input assumptions of the underlying estimator.

Returns:

Xt: {ndarray, sparse matrix} of shape (n_samples, n_features): X transformed in the new space based on the estimator with the best found parameters.

verify_fit(): Mark fit successful and preserve signature.