featuristic.selection.GeneticFeatureSelector

class featuristic.selection.GeneticFeatureSelector(objective_function: Callable, population_size: int = 50, max_generations: int = 100, tournament_size: int = 10, crossover_proba: float = 0.9, mutation_proba: float = 0.1, early_termination_iters: int = 15, n_jobs: int = -1, pbar: bool = True, verbose: bool = False)

The Genetic Feature Selector class uses a genetic algorithm to select the subset of features that minimises a given objective function. It starts by building a population of individuals, each a random selection of the available features. The population is then evolved over a number of generations using genetic operators such as mutation and crossover, searching for the combination of features that gives the lowest objective function value.
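The evolution loop described above can be sketched in plain Python. This is an illustrative toy, not the library's implementation: the names `evolve_feature_mask`, `tournament_pick`, `toy_cost` and `TARGET` are hypothetical, each individual is assumed to be a 0/1 mask over the features, crossover is assumed to be single-point, and mutation is assumed to flip each bit independently. The keyword arguments mirror the constructor parameters only to show what role each one plays.

```python
import random


def tournament_pick(population, costs, tournament_size, rng):
    """Return a copy of the lowest-cost individual among randomly drawn contenders."""
    k = min(tournament_size, len(population))
    contenders = rng.sample(range(len(population)), k)
    winner = min(contenders, key=lambda i: costs[i])
    return population[winner][:]


def evolve_feature_mask(cost, n_features, population_size=50, max_generations=100,
                        tournament_size=10, crossover_proba=0.9, mutation_proba=0.1,
                        early_termination_iters=15, seed=0):
    """Search for the 0/1 feature mask with the lowest cost (toy sketch)."""
    rng = random.Random(seed)
    # Initial population: naive random selections of the available features.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(population_size)]
    best_mask, best_cost, stale = None, float("inf"), 0

    for _ in range(max_generations):
        costs = [cost(mask) for mask in population]
        i_best = min(range(population_size), key=lambda i: costs[i])
        if costs[i_best] < best_cost:
            best_mask, best_cost, stale = population[i_best][:], costs[i_best], 0
        else:
            # Assumed early-termination rule: stop after this many
            # consecutive generations without a new best cost.
            stale += 1
            if stale >= early_termination_iters:
                break

        children = []
        while len(children) < population_size:
            a = tournament_pick(population, costs, tournament_size, rng)
            b = tournament_pick(population, costs, tournament_size, rng)
            # Single-point crossover swaps the tails of the two parents.
            if n_features > 1 and rng.random() < crossover_proba:
                cut = rng.randrange(1, n_features)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # Mutation flips each bit independently with probability mutation_proba.
            for child in (a, b):
                children.append([1 - bit if rng.random() < mutation_proba else bit
                                 for bit in child])
        population = children[:population_size]

    return best_mask, best_cost


# Toy cost: the number of positions where the mask disagrees with a known target.
TARGET = [1, 0, 1, 0, 0, 0]


def toy_cost(mask):
    return float(sum(m != t for m, t in zip(mask, TARGET)))


best_mask, best_cost = evolve_feature_mask(toy_cost, n_features=len(TARGET))
print(best_mask, best_cost)
```

In a real run the cost of a mask would come from calling the objective function on the columns that the mask keeps, which is typically the expensive step and the natural place for parallel evaluation.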

__init__(objective_function: Callable, population_size: int = 50, max_generations: int = 100, tournament_size: int = 10, crossover_proba: float = 0.9, mutation_proba: float = 0.1, early_termination_iters: int = 15, n_jobs: int = -1, pbar: bool = True, verbose: bool = False) -> None

Initialize the genetic algorithm.

Parameters:
  • objective_function (callable) – The cost function to minimize. It must take X and y as input and return a float, where a smaller value is better. To maximize a metric instead, multiply the output of your objective_function by -1.

  • population_size (int) – The number of individuals in the population.

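  • max_generations (int) – The maximum number of generations to evolve the population for.

  • tournament_size (int) – The number of individuals that compete in each selection tournament.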

  • crossover_proba (float) – The probability of crossover.

  • mutation_proba (float) – The probability of mutation.

  • early_termination_iters (int) – The number of generations to wait without improvement before terminating early.

  • n_jobs (int) – The number of parallel jobs to run. If -1, all available cores are used; otherwise the minimum of n_jobs and cpu_count is used.

  • pbar (bool) – Whether to show a progress bar.

  • verbose (bool) – Whether to print progress.
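As a rough illustration of the objective_function contract above, the sketch below defines one cost that is already "smaller is better" and one metric that has to be negated. The helper names (`_holdout_fit`, `holdout_mse`, `negated_holdout_r2`, `run_selector`) are hypothetical, and the sketch assumes the X handed to the objective can be converted to a numeric NumPy array. The least-squares holdout fit is just a stand-in for whatever model you actually care about.

```python
import numpy as np


def _holdout_fit(X, y, test_fraction=0.25):
    """Fit least squares on the head of the data and predict the held-out tail."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_test = max(1, int(len(y) * test_fraction))
    # Prepend an intercept column to the selected features.
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A[:-n_test], y[:-n_test], rcond=None)
    return y[-n_test:], A[-n_test:] @ coef


def holdout_mse(X, y):
    """Mean squared error: already a cost, so it is returned unchanged."""
    y_true, y_pred = _holdout_fit(X, y)
    return float(np.mean((y_true - y_pred) ** 2))


def negated_holdout_r2(X, y):
    """R^2 is 'larger is better', so it is multiplied by -1 before returning."""
    y_true, y_pred = _holdout_fit(X, y)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # assumes y is not constant
    return float(-(1.0 - ss_res / ss_tot))


def run_selector(X, y):
    """Hypothetical wrapper: pass an objective to the selector and fit it."""
    # Imported lazily so the objective functions above can be tried without
    # featuristic installed.
    from featuristic.selection import GeneticFeatureSelector

    selector = GeneticFeatureSelector(
        holdout_mse,
        population_size=50,
        max_generations=100,
        early_termination_iters=15,
        n_jobs=-1,
    )
    return selector.fit_transform(X, y)
```

A holdout or cross-validated score is used here rather than an in-sample one because an in-sample fit never gets worse when features are added, which would push the search towards keeping every feature.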

Methods

__init__(objective_function[, ...])

Initialize the genetic algorithm.

fit(X, y)

Determine the optimal feature selection using a genetic algorithm.

fit_transform(X, y)

Fit the genetic algorithm and return the selected features.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

plot_history([ax])

Plot the history of the fitness function.

set_output(*[, transform])

Set output container.

set_params(**params)

Set the parameters of this estimator.

transform(X[, y])

Transform the input features to the selected features.