Search Spaces#

The search space defines all configurations the optimizer can explore. Too narrow and you miss good solutions; too broad and the optimizer wastes iterations. In Hyperactive, search spaces are dictionaries mapping parameter names to lists of possible values.

What is a Search Space?#

Concretely, each key is a parameter name and each value is the list of candidates for that parameter. The full space is every combination of one candidate per parameter.

search_space = {
    "learning_rate": [0.001, 0.01, 0.1],      # 3 values
    "n_estimators": [50, 100, 200, 500],      # 4 values
    "max_depth": [3, 5, 10, None],            # 4 values
}
# Total: 3 × 4 × 4 = 48 combinations
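To make the count concrete, the sketch below enumerates the same grid with `itertools.product` from the standard library. It only illustrates what the space contains; it makes no claim about how Hyperactive samples the space internally.

```python
from itertools import product

search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "n_estimators": [50, 100, 200, 500],
    "max_depth": [3, 5, 10, None],
}

# Each combination is one point of the Cartesian product of the value lists.
names = list(search_space)
combinations = [
    dict(zip(names, values)) for values in product(*search_space.values())
]

print(len(combinations))  # 48
print(combinations[0])    # {'learning_rate': 0.001, 'n_estimators': 50, 'max_depth': 3}
```

Materializing every combination is only practical for small spaces like this one; for larger spaces, compute the size from the list lengths instead.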

Parameter Types#

Hyperactive supports any value that can be stored in a Python list:

Categorical

Discrete choices like strings or objects.

"kernel": ["linear", "rbf", "poly"]
"optimizer": [Adam, SGD, RMSprop]

Integer

Discrete numeric values.

"n_estimators": [50, 100, 200]
"hidden_size": list(range(32, 257, 32))

Continuous

Float values (discretized into steps).

"dropout": np.linspace(0, 0.5, 11).tolist()
"learning_rate": np.logspace(-4, -1, 20).tolist()
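It can help to print what these expressions expand to before running a search. The sketch below is a small illustration using only NumPy and built-ins; rounding the `linspace` output is an optional convenience assumed here, since the raw values carry floating-point noise such as `0.15000000000000002`.

```python
import numpy as np

# Integer range: the stop value is exclusive, so use 257 to include 256.
hidden_size = list(range(32, 257, 32))
print(hidden_size)  # [32, 64, 96, 128, 160, 192, 224, 256]

# Continuous range discretized into 11 steps, rounded for readable logs.
dropout = np.linspace(0, 0.5, 11).round(2).tolist()
print(dropout)  # [0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5]
```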

Linear vs Logarithmic Spacing#

The spacing between values matters. Choose based on how the parameter affects your objective.

Linear Spacing

Use when equal differences have equal effects.

# Dropout: 0.1 → 0.2 has a similar effect to 0.4 → 0.5
"dropout": np.linspace(0.0, 0.5, 11).tolist()
# [0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5]

Log Spacing

Use when the parameter spans orders of magnitude.

# Learning rate: 0.001 → 0.01 is as significant as 0.01 → 0.1
"learning_rate": np.logspace(-4, -1, 10).tolist()
# ≈ [0.0001, 0.000215, 0.000464, 0.001, 0.00215, 0.00464, 0.01, 0.0215, 0.0464, 0.1]

Common parameters that benefit from log spacing:

Learning rates (1e-5 to 1e-1)
Regularization strength (1e-6 to 1e1)
Batch sizes (powers of 2)
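One way to see the difference between the two spacings is to count how many candidates land among the small values. The sketch below is an illustration with NumPy only; the range 1e-4 to 1e-1 and the 0.01 threshold are chosen for the example.

```python
import numpy as np

linear = np.linspace(1e-4, 1e-1, 20)
log = np.logspace(-4, -1, 20)

# How many of the 20 candidates fall below 0.01,
# i.e. inside the two smallest of the three decades?
below_linear = int((linear < 1e-2).sum())
below_log = int((log < 1e-2).sum())
print(below_linear)  # 2
print(below_log)     # 13

# Batch sizes as powers of 2 are a log-spaced integer grid.
batch_sizes = [2 ** k for k in range(4, 9)]
print(batch_sizes)   # [16, 32, 64, 128, 256]
```

With linear spacing, 18 of the 20 candidates sit in the largest decade, so the optimizer sees almost nothing of the small learning rates.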

Granularity and Size#

More values per parameter means more combinations to explore.

The Multiplication Effect

With 3 parameters, each having 10 values: 10 × 10 × 10 = 1,000 combinations

With 3 parameters, each having 100 values: 100 × 100 × 100 = 1,000,000 combinations

Calculate your search space size:

import math

# Multiply the number of candidate values of every parameter
total = math.prod(len(values) for values in search_space.values())
print(f"Total combinations: {total:,}")

Size recommendations:

Size	Approach	Recommended Optimizers
< 1,000	Exhaustive or random search	`GridSearch`, `RandomSearch`
1,000 - 100,000	Smart sampling	`BayesianOptimizer`, `TPE`
100,000 - 10M	Population or local methods	`ParticleSwarm`, `HillClimbing`
> 10M	Reduce space or use iterative refinement	Start coarse, then refine

Tip

Start with a coarse search space (fewer values per parameter), then refine around the best region found.
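A coarse-to-fine pass might look like the sketch below. `refine_log_grid` is a hypothetical helper, not part of Hyperactive, and `best_lr` stands in for whatever value a first search returned. `np.geomspace` is NumPy's function for log-spaced values between two endpoints.

```python
import numpy as np

def refine_log_grid(best, factor=3.0, num=9):
    """Return `num` log-spaced values from best / factor to best * factor.

    With an odd `num`, the middle value is (up to rounding) `best` itself.
    """
    return np.geomspace(best / factor, best * factor, num).tolist()

# Round 1: coarse grid, one value per order of magnitude.
coarse = np.logspace(-4, -1, 4).tolist()  # ≈ [1e-4, 1e-3, 1e-2, 1e-1]

# Assumed result of the first search, for illustration only.
best_lr = 1e-2

# Round 2: finer grid centred on the best value found so far.
fine = refine_log_grid(best_lr)
print(len(fine))  # 9
print(min(fine) < best_lr < max(fine))  # True
```

The `factor` controls how far the refined grid reaches around the best value. A factor close to the ratio between neighbouring coarse values keeps the second round from skipping the gap between them.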

Common Patterns#

Ready-to-use search spaces for common ML models:

Random Forest

rf_space = {
    "n_estimators": [50, 100, 200, 500],
    "max_depth": [None, 5, 10, 20, 30],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2", None],
}

Gradient Boosting

import numpy as np

gb_space = {
    "n_estimators": [50, 100, 200, 500],
    "learning_rate": np.logspace(-3, 0, 10).tolist(),
    "max_depth": [3, 5, 7, 9],
    "subsample": np.linspace(0.6, 1.0, 5).tolist(),
}

SVM

import numpy as np

svm_space = {
    "C": np.logspace(-2, 2, 10).tolist(),
    "gamma": np.logspace(-4, -1, 10).tolist(),
    "kernel": ["rbf", "poly", "sigmoid"],
}

Neural Network

import numpy as np

nn_space = {
    "hidden_layers": [1, 2, 3],
    "hidden_size": [32, 64, 128, 256],
    "learning_rate": np.logspace(-4, -2, 20).tolist(),
    "dropout": np.linspace(0.0, 0.5, 6).tolist(),
    "batch_size": [16, 32, 64, 128],
    "activation": ["relu", "tanh", "elu"],
}
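To relate these patterns to the size recommendations, the sketch below multiplies the number of candidates per parameter in each space. The counts are copied by hand from the four dictionaries above so that the snippet runs without NumPy.

```python
import math

# Number of candidate values per parameter, in the order listed above.
value_counts = {
    "rf_space": [4, 5, 3, 3, 3],
    "gb_space": [4, 10, 4, 5],
    "svm_space": [10, 10, 3],
    "nn_space": [3, 4, 20, 6, 4, 3],
}

sizes = {name: math.prod(counts) for name, counts in value_counts.items()}
for name, size in sizes.items():
    print(f"{name}: {size:,} combinations")
# rf_space: 540 combinations
# gb_space: 800 combinations
# svm_space: 300 combinations
# nn_space: 17,280 combinations
```

The first three spaces stay under 1,000 combinations, while the neural network space falls into the 1,000 to 100,000 range.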

Parameter Dependencies#

Sometimes parameters have constraints or dependencies. Handle these in your experiment function:

import numpy as np

def experiment(params):
    # Constraint: min_samples_split >= min_samples_leaf
    if params["min_samples_split"] < params["min_samples_leaf"]:
        return -np.inf  # Invalid configuration

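    # Constraint: degree only matters for the poly kernel, so for every
    # other kernel accept just the default degree (3) and skip the duplicates
    if params["kernel"] != "poly" and params["degree"] != 3:
        return -np.inf

    # Valid configuration (evaluate_model is a placeholder for your own scoring code)
    return evaluate_model(params)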

Note

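Returning -np.inf gives an invalid combination the worst possible score, so it can never be reported as the best result. Each invalid point still costs one iteration when it is sampled. Optimizers that use earlier scores to guide the search tend to move away from these regions, while grid or random search will keep hitting them.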
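Before a search, it can be worth checking how much of the grid such constraints rule out. The sketch below applies the same two rules to a small space; the value lists and the `is_valid` helper are made up for this illustration.

```python
from itertools import product

# Illustrative value lists, assumed for this example.
space = {
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "kernel": ["rbf", "poly"],
    "degree": [2, 3, 4],
}

def is_valid(params):
    """Mirror the two constraints from the experiment function."""
    if params["min_samples_split"] < params["min_samples_leaf"]:
        return False
    if params["kernel"] != "poly" and params["degree"] != 3:
        return False
    return True

names = list(space)
grid = [dict(zip(names, combo)) for combo in product(*space.values())]
valid = [params for params in grid if is_valid(params)]

print(f"{len(valid)} of {len(grid)} combinations are valid")
# 32 of 54 combinations are valid
```

If a large share of the grid is invalid, restructuring the search space (for example, one space per kernel) may waste fewer iterations than rejecting points inside the experiment.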

Common Mistakes#

Overly Large Spaces

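# Bad: three parameters with 1,000 values each give 1000³ = 1 billion combinations
"param": np.linspace(0, 1, 1000).tolist()

# Better: three parameters with 50 values each give 50³ = 125,000 combinations
"param": np.linspace(0, 1, 50).tolist()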

Wrong Spacing

# Bad: only 2 of the 20 values fall below 0.01
"lr": np.linspace(0.0001, 0.1, 20).tolist()

# Good: roughly the same number of values in each decade
"lr": np.logspace(-4, -1, 20).tolist()

Missing Values

# Bad: might miss optimal region
"max_depth": [2, 3, 4]

# Better: include None and wider range
"max_depth": [None, 3, 5, 10, 20, 50]

Too Fine Initially

# Bad for an initial search: 100 values for a single parameter
"lr": np.logspace(-4, -1, 100).tolist()

# Better: start coarse, refine later
"lr": np.logspace(-4, -1, 10).tolist()

Quick Reference#

Parameter Type	Example	When to Use
Categorical	`["rbf", "linear", "poly"]`	Distinct choices
Integer range	`list(range(10, 101, 10))`	Discrete numeric parameters
Linear float	`np.linspace(0, 1, 20).tolist()`	Uniform parameters (dropout, momentum)
Log float	`np.logspace(-4, -1, 20).tolist()`	Multi-magnitude parameters (learning rate)
Boolean	`[True, False]`	Toggle features