numpy 遗传算法选择:轮盘赌与随机通用抽样
numpy genetic algorithm selection: roulette wheel vs. stochastic universal sampling
我正在 numpy 中实现一个遗传算法,我正在尝试弄清楚如何通过轮盘赌和随机通用采样正确地实现选择。我在 Whosebug 或其他地方看到的示例使用 python 循环而不是向量化的 numpy 代码。
例如,这里是两种算法在DEAP中的实现。
def selRoulette(individuals, k, fit_attr="fitness"):
"""Select *k* individuals from the input *individuals* using *k*
spins of a roulette. The selection is made by looking only at the first
objective of each individual. The list returned contains references to
the input *individuals*.
:param individuals: A list of individuals to select from.
:param k: The number of individuals to select.
:param fit_attr: The attribute of individuals to use as selection criterion
:returns: A list of selected individuals.
This function uses the :func:`~random.random` function from the python base
"""
s_inds = sorted(individuals, key=attrgetter(fit_attr), reverse=True)
sum_fits = sum(getattr(ind, fit_attr).values[0] for ind in individuals)
chosen = []
for i in xrange(k):
u = random.random() * sum_fits
sum_ = 0
for ind in s_inds:
sum_ += getattr(ind, fit_attr).values[0]
if sum_ > u:
chosen.append(ind)
break
return chosen
def selStochasticUniversalSampling(individuals, k, fit_attr="fitness"):
"""Select the *k* individuals among the input *individuals*.
The selection is made by using a single random value to sample all of the
individuals by choosing them at evenly spaced intervals. The list returned
contains references to the input *individuals*.
:param individuals: A list of individuals to select from.
:param k: The number of individuals to select.
:param fit_attr: The attribute of individuals to use as selection criterion
:return: A list of selected individuals.
"""
s_inds = sorted(individuals, key=attrgetter(fit_attr), reverse=True)
sum_fits = sum(getattr(ind, fit_attr).values[0] for ind in individuals)
distance = sum_fits / float(k)
start = random.uniform(0, distance)
points = [start + i*distance for i in xrange(k)]
chosen = []
for p in points:
i = 0
sum_ = getattr(s_inds[i], fit_attr).values[0]
while sum_ < p:
i += 1
sum_ += getattr(s_inds[i], fit_attr).values[0]
chosen.append(s_inds[i])
return chosen
下面是我实现的轮盘,好像是加权采样带替换,但是替换参数我不太清楚
# population is a 2D array of integers
# population_fitness is a 1D array of float of same length as population
weights = population_fitness / population_fitness.sum()
selected = population[np.random.choice(len(population), size=n, replace=True, p=weights)]
这是我对 SUS 选择的实现。我是否正确,当在 numpy 中实现时,我唯一需要更改的是采样没有替换,或者我是否也应该删除权重?
weights = population_fitness / population_fitness.sum()
selected = population[np.random.choice(len(population), size=n, replace=False, p=weights)]
感谢任何建议!
两种策略都可能select同一个人多次,所以替换不是重点。
我不知道np.random.choice
内部是如何实现的,但无论如何函数的契约中并没有指定实现方法(所以它随时可能改变)。下面我给出了我使用 numpy 实现的两种 selection 策略。
请务必先测试这些功能,然后再将它们用于任何严肃的事情。
编辑: 不需要按适应度排序;我不知道我在想什么。
import numpy as np
formatters = {
'int': lambda x: '%4d' % x,
'float': lambda x: '%.02f' % x
}
def print_report(population, fitness, wheel, selectors, selected_individuals):
with np.printoptions(formatter=formatters):
print(' Population:', population)
print(' Fitness:', fitness)
print(' Roulette wheel:', wheel) # fitness cumulative sum
print(' Sampled values:', selectors) # roulette "extractions"
print('Selected individuals:', selected_individuals)
# This should be equivalent to np.choice(population, size, weights=fitness)
def roulette_wheel_selection(rng: np.random.Generator,
population: np.ndarray,
fitness: np.ndarray,
size: int) -> np.ndarray:
""" :Authors: Gianluca Gippetto """
if size > len(population):
raise ValueError
fitness_cumsum = fitness.cumsum() # the "roulette wheel"
fitness_sum = fitness_cumsum[-1] # sum of all fitness values (size of the wheel)
sampled_values = rng.random(size) * fitness_sum
# For each sampled value, get the corresponding roulette wheel slot
selected = np.searchsorted(fitness_cumsum, sampled_values)
print_report(population, fitness, fitness_cumsum, sampled_values, selected)
return selected
def sus(rng: np.random.Generator,
population: np.ndarray,
fitness: np.ndarray,
size: int) -> np.ndarray:
""" https://en.wikipedia.org/wiki/Stochastic_universal_sampling
:Authors: Gianluca Gippetto """
if size > len(population):
raise ValueError
fitness_cumsum = fitness.cumsum()
fitness_sum = fitness_cumsum[-1] # the "roulette wheel"
step = fitness_sum / size # we'll move by this amount in the wheel
start = rng.random() * step # sample a start point in [0, step)
# get N evenly-spaced points in the wheel
selectors = np.arange(start, fitness_sum, step)
selected = np.searchsorted(fitness_cumsum, selectors)
print_report(population, fitness, fitness_cumsum, selectors, selected)
return selected
if __name__ == "__main__":
from numpy.random import default_rng
n = 10
sample_size = 5
rng = default_rng()
# Random population data.
# I'm sorting by fitness just for making it easier to read the report
population = np.arange(n)
fitness = np.sort(
np.abs(rng.normal(size=len(population)))
)
print('Roulette wheel sampling:')
roulette_wheel_selection(rng, population, fitness, sample_size)
print()
print('SUS:')
sus(rng, population, fitness, sample_size)
输出:
Roulette wheel sampling:
Population: [ 0 1 2 3 4 5 6 7 8 9]
Fitness: [0.34 0.35 0.47 0.61 0.62 0.67 0.73 0.84 1.12 1.93]
Roulette wheel: [0.34 0.69 1.16 1.77 2.39 3.06 3.79 4.64 5.75 7.69]
Sampled values: [0.93 3.93 5.32 7.10 4.11]
Selected individuals: [ 2 7 8 9 7]
SUS:
Population: [ 0 1 2 3 4 5 6 7 8 9]
Fitness: [0.34 0.35 0.47 0.61 0.62 0.67 0.73 0.84 1.12 1.93]
Roulette wheel: [0.34 0.69 1.16 1.77 2.39 3.06 3.79 4.64 5.75 7.69]
Sampled values: [1.25 2.79 4.33 5.86 7.40]
Selected individuals: [ 3 5 7 9 9]
我正在 numpy 中实现一个遗传算法,我正在尝试弄清楚如何通过轮盘赌和随机通用采样正确地实现选择。我在 Whosebug 或其他地方看到的示例使用 python 循环而不是向量化的 numpy 代码。
例如,这里是两种算法在DEAP中的实现。
def selRoulette(individuals, k, fit_attr="fitness"):
"""Select *k* individuals from the input *individuals* using *k*
spins of a roulette. The selection is made by looking only at the first
objective of each individual. The list returned contains references to
the input *individuals*.
:param individuals: A list of individuals to select from.
:param k: The number of individuals to select.
:param fit_attr: The attribute of individuals to use as selection criterion
:returns: A list of selected individuals.
This function uses the :func:`~random.random` function from the python base
"""
s_inds = sorted(individuals, key=attrgetter(fit_attr), reverse=True)
sum_fits = sum(getattr(ind, fit_attr).values[0] for ind in individuals)
chosen = []
for i in xrange(k):
u = random.random() * sum_fits
sum_ = 0
for ind in s_inds:
sum_ += getattr(ind, fit_attr).values[0]
if sum_ > u:
chosen.append(ind)
break
return chosen
def selStochasticUniversalSampling(individuals, k, fit_attr="fitness"):
"""Select the *k* individuals among the input *individuals*.
The selection is made by using a single random value to sample all of the
individuals by choosing them at evenly spaced intervals. The list returned
contains references to the input *individuals*.
:param individuals: A list of individuals to select from.
:param k: The number of individuals to select.
:param fit_attr: The attribute of individuals to use as selection criterion
:return: A list of selected individuals.
"""
s_inds = sorted(individuals, key=attrgetter(fit_attr), reverse=True)
sum_fits = sum(getattr(ind, fit_attr).values[0] for ind in individuals)
distance = sum_fits / float(k)
start = random.uniform(0, distance)
points = [start + i*distance for i in xrange(k)]
chosen = []
for p in points:
i = 0
sum_ = getattr(s_inds[i], fit_attr).values[0]
while sum_ < p:
i += 1
sum_ += getattr(s_inds[i], fit_attr).values[0]
chosen.append(s_inds[i])
return chosen
下面是我实现的轮盘,好像是加权采样带替换,但是替换参数我不太清楚
# population is a 2D array of integers
# population_fitness is a 1D array of float of same length as population
weights = population_fitness / population_fitness.sum()
selected = population[np.random.choice(len(population), size=n, replace=True, p=weights)]
这是我对 SUS 选择的实现。我是否正确,当在 numpy 中实现时,我唯一需要更改的是采样没有替换,或者我是否也应该删除权重?
weights = population_fitness / population_fitness.sum()
selected = population[np.random.choice(len(population), size=n, replace=False, p=weights)]
感谢任何建议!
两种策略都可能select同一个人多次,所以替换不是重点。
我不知道np.random.choice
内部是如何实现的,但无论如何函数的契约中并没有指定实现方法(所以它随时可能改变)。下面我给出了我使用 numpy 实现的两种 selection 策略。
请务必先测试这些功能,然后再将它们用于任何严肃的事情。
编辑: 不需要按适应度排序;我不知道我在想什么。
import numpy as np
formatters = {
'int': lambda x: '%4d' % x,
'float': lambda x: '%.02f' % x
}
def print_report(population, fitness, wheel, selectors, selected_individuals):
with np.printoptions(formatter=formatters):
print(' Population:', population)
print(' Fitness:', fitness)
print(' Roulette wheel:', wheel) # fitness cumulative sum
print(' Sampled values:', selectors) # roulette "extractions"
print('Selected individuals:', selected_individuals)
# This should be equivalent to np.choice(population, size, weights=fitness)
def roulette_wheel_selection(rng: np.random.Generator,
population: np.ndarray,
fitness: np.ndarray,
size: int) -> np.ndarray:
""" :Authors: Gianluca Gippetto """
if size > len(population):
raise ValueError
fitness_cumsum = fitness.cumsum() # the "roulette wheel"
fitness_sum = fitness_cumsum[-1] # sum of all fitness values (size of the wheel)
sampled_values = rng.random(size) * fitness_sum
# For each sampled value, get the corresponding roulette wheel slot
selected = np.searchsorted(fitness_cumsum, sampled_values)
print_report(population, fitness, fitness_cumsum, sampled_values, selected)
return selected
def sus(rng: np.random.Generator,
population: np.ndarray,
fitness: np.ndarray,
size: int) -> np.ndarray:
""" https://en.wikipedia.org/wiki/Stochastic_universal_sampling
:Authors: Gianluca Gippetto """
if size > len(population):
raise ValueError
fitness_cumsum = fitness.cumsum()
fitness_sum = fitness_cumsum[-1] # the "roulette wheel"
step = fitness_sum / size # we'll move by this amount in the wheel
start = rng.random() * step # sample a start point in [0, step)
# get N evenly-spaced points in the wheel
selectors = np.arange(start, fitness_sum, step)
selected = np.searchsorted(fitness_cumsum, selectors)
print_report(population, fitness, fitness_cumsum, selectors, selected)
return selected
if __name__ == "__main__":
from numpy.random import default_rng
n = 10
sample_size = 5
rng = default_rng()
# Random population data.
# I'm sorting by fitness just for making it easier to read the report
population = np.arange(n)
fitness = np.sort(
np.abs(rng.normal(size=len(population)))
)
print('Roulette wheel sampling:')
roulette_wheel_selection(rng, population, fitness, sample_size)
print()
print('SUS:')
sus(rng, population, fitness, sample_size)
输出:
Roulette wheel sampling:
Population: [ 0 1 2 3 4 5 6 7 8 9]
Fitness: [0.34 0.35 0.47 0.61 0.62 0.67 0.73 0.84 1.12 1.93]
Roulette wheel: [0.34 0.69 1.16 1.77 2.39 3.06 3.79 4.64 5.75 7.69]
Sampled values: [0.93 3.93 5.32 7.10 4.11]
Selected individuals: [ 2 7 8 9 7]
SUS:
Population: [ 0 1 2 3 4 5 6 7 8 9]
Fitness: [0.34 0.35 0.47 0.61 0.62 0.67 0.73 0.84 1.12 1.93]
Roulette wheel: [0.34 0.69 1.16 1.77 2.39 3.06 3.79 4.64 5.75 7.69]
Sampled values: [1.25 2.79 4.33 5.86 7.40]
Selected individuals: [ 3 5 7 9 9]