
Generation after generation, humans have adapted to fit their environment better. We began as primates living in an eat-or-be-eaten world. Over time, we became who we are today, a reflection of modern society. Through the process of evolution we become more intelligent, better able to work with our environment and accomplish what we need.
The concept of learning through evolution can also be applied to Artificial Intelligence. We can train AIs to perform certain tasks using NEAT, NeuroEvolution of Augmenting Topologies. In a nutshell, NEAT is an algorithm that takes a batch of AIs (genomes) and has them attempt a given task. The best-performing AIs "breed" to create the next generation. This process continues until we have a generation that is capable of completing the task.

NEAT is amazing because it eliminates the need for the pre-existing data normally required to train an AI. Using the power of NEAT and OpenAI's Gym Retro, I trained an AI to play Sonic the Hedgehog for the SEGA Genesis. Let's learn how!
A NEAT neural network (Python implementation)
GitHub repository: Vedant-Gupta523/sonicNEAT
Note: All of the code in this article and in the repository above is a slightly modified version of Lucas Thompson's Sonic AI Bot Using Open-AI and NEAT YouTube tutorials and code.
Understanding OpenAI Gym
If you are not already familiar with OpenAI Gym, check out the terminology below. These terms will be used frequently throughout the article, and the short sketch after the list shows how they fit together.
agent - The AI player. In this case it will be Sonic.
environment - The agent's complete surroundings; the game environment.
action - Something the agent has the option of doing (i.e. move left, move right, jump, do nothing).
step - Performing one action.
state - A frame of the environment; the current situation the AI is in.
observation - What the AI observes from the environment.
fitness - How well our AI is performing.
done - When the AI has completed its task or can't continue any further.
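As a minimal sketch of how these pieces fit together (assuming Gym Retro and the Sonic ROM are already installed, which the next sections cover), an agent that takes random actions looks like this:

import retro

env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")
ob = env.reset()  # initial observation: the first frame of the environment
done = False
while not done:
    action = env.action_space.sample()          # the agent picks a random action
    ob, reward, done, info = env.step(action)   # one step: act, observe, collect reward
env.close()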
Installing dependencies
Below are the GitHub links for OpenAI Gym Retro and NEAT-Python, with installation instructions.
OpenAI: https://github.com/openai/retro
NEAT: https://github.com/CodeReclaimers/neat-python
Pip install supporting libraries such as cv2 (opencv-python) and numpy; pickle ships with Python's standard library.
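For example, assuming the standard PyPI package names:

pip install gym-retro neat-python opencv-python numpy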
Importing libraries and setting up the environment
To start, we need to import all of the modules we will use:
import retro
import numpy as np
import cv2
import neat
import pickle
We will also define our environment, made up of the game and the state:
env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")
To train an AI to play Sonic the Hedgehog, you will need the game's ROM (game file). The simplest way to get it is to buy the game on Steam for $5. You can also find free downloads of the ROM online; however, that is illegal, so don't do it.
In the OpenAI repository, under retro/retro/data/stable/, you will find a folder for Sonic the Hedgehog Genesis. Place the game's ROM there and make sure it is named rom.md. This folder also contains .state files. You can pick one and set the state parameter equal to it. I chose GreenHillZone.Act1 since it is the first level of the game.
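After placing the ROM, the folder should look roughly like this (only one of the several .state files it ships with is shown):

retro/retro/data/stable/SonicTheHedgehog-Genesis/
    rom.md                      <- your ROM, renamed
    data.json
    scenario.json
    GreenHillZone.Act1.state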
Understanding data.json and scenario.json
In the Sonic the Hedgehog folder you will find these two files:
data.json
{ "info": { "act": "address": 16776721, "type": ", "level_end_bonus": "address": 16775126, "type": ", "lives": "address": 16776722, "type": ", "rings": { "address": 16776736, "type": ">u2" }, "score": { "address": 16776742, "type": ">u4" }, "screen_x": { "address": 16774912, "type": ">u2" }, "screen_x_end": { "address": 16774954, "type": ">u2" }, "screen_y": { "address": 16774916, "type": ">u2" }, "x": { "address": 16764936, "type": ">i2" }, "y": { "address": 16764940, "type": ">u2" }, "zone": u1" } }
scenario.json
{ "done": { "variables": { "lives": { "op": "zero" } } }, "reward": { "variables": { "x": { "reward": 10.0 } } } }
Both files contain important information related to the game and its training.
As it sounds, the data.json file contains information on different game-specific variables (i.e. Sonic's x-position, the number of lives he has, etc.).
The scenario.json file allows us to perform actions in sync with the values of those data variables. For example, we can reward Sonic 10.0 every time his x-position increases. We could also set our done condition to true when Sonic's lives hit 0.
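To illustrate the format, here is a hypothetical variation (not used in this project) that would additionally reward collecting rings, using the same variable/coefficient structure:

{
  "done": {
    "variables": {
      "lives": { "op": "zero" }
    }
  },
  "reward": {
    "variables": {
      "x": { "reward": 10.0 },
      "rings": { "reward": 1.0 }
    }
  }
}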
Understanding NEAT feedforward configuration
The config-feedforward file can be found in my GitHub repository linked above. It acts like a settings menu to set up our training. To point out a few simple settings:
fitness_threshold = 10000   # How fit we want Sonic to become
pop_size = 20               # How many Sonics per generation
num_inputs = 1120           # Number of inputs into our model: the 224x320 screen scaled down by 8 is 28x40 = 1120 pixels
num_outputs = 12            # 12 buttons on the Genesis controller
There are tons of settings you can experiment with to see how they affect your AI's training! To learn more about NEAT and the different settings in the feedforward configuration, I would highly recommend reading the NEAT-Python documentation.
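For context, config-feedforward is a plain INI-style file organized into sections; a trimmed excerpt (using the values above; the full file in the repository sets many more options) looks roughly like this:

[NEAT]
fitness_criterion     = max
fitness_threshold     = 10000
pop_size              = 20
reset_on_extinction   = False

[DefaultGenome]
num_inputs            = 1120
num_hidden            = 0
num_outputs           = 12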
Putting it all together: Creating the Training File
Setting up configuration
Our feedforward configuration is defined and stored in the variable config.
config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction, neat.DefaultSpeciesSet, neat.DefaultStagnation, 'config-feedforward')
Creating a function to evaluate each genome
We start by creating the function eval_genomes, which will evaluate our genomes (a genome can be thought of as one Sonic in a population of Sonics). For each genome, we reset the environment and sample a random action:
def eval_genomes(genomes, config):
    for genome_id, genome in genomes:
        ob = env.reset()
        ac = env.action_space.sample()
We will also record the game environment's height, width, and number of color channels, and scale the height and width down by a factor of 8:
inx, iny, inc = env.observation_space.shape
inx = int(inx/8)
iny = int(iny/8)
We create a recurrent neural network (RNN) using the NEAT library and input the genome and our chosen configuration.
net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)
Finally, we define a few variables: current_max_fitness (the highest fitness in the current population), fitness_current (the current fitness of the genome), frame (the frame count), counter (the number of steps our agent has taken without improving its fitness), xpos (the x-position of Sonic), and done (whether or not the run for this genome is finished).
current_max_fitness = 0
fitness_current = 0
frame = 0
counter = 0
xpos = 0
done = False
While we have not met our done condition, we need to run the environment, increment our frame counter, and downscale and reshape our observation into the form our network expects (still for each genome):
while not done:
    env.render()
    frame += 1
    ob = cv2.resize(ob, (inx, iny))
    ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)
    ob = np.reshape(ob, (inx, iny))
We will take our observation and put it in a one-dimensional array, so that our RNN can understand it. We receive our output by feeding this array to our RNN.
imgarray = np.ndarray.flatten(ob)
nnOutput = net.activate(imgarray)
Using the output from the RNN our AI takes a step. From this step we can extract fresh information: a new observation, a reward, whether or not we have reached our done requirement, and information on variables in our data.json (info).
ob, rew, done, info = env.step(nnOutput)
At this point we need to evaluate our genome’s fitness and whether or not it has met the done requirement.
We look at our "x" variable from data.json and check whether it has surpassed the length of the level. If it has, we increase our fitness by our fitness threshold of 10,000, signifying we are done.
xpos = info['x']
if xpos >= 10000:
    fitness_current += 10000
    done = True
Otherwise, we will increase our current fitness by the reward we earned from performing the step. We also check if we have a new highest fitness and adjust the value of our current_max_fitness accordingly.
fitness_current += rew
if fitness_current > current_max_fitness:
    current_max_fitness = fitness_current
    counter = 0
else:
    counter += 1
Lastly, we check whether we are done or whether our genome has gone 250 steps without improving its fitness. If so, we print information about the genome that was simulated and record its final fitness. Otherwise, we keep looping until one of the two conditions is satisfied.
if done or counter == 250:
    done = True
    print(genome_id, fitness_current)
    genome.fitness = fitness_current
Defining the population, printing training stats, and more
The absolute last thing we need to do is define our population, print out statistics from our training, save checkpoints (in case you want to pause and resume training), and pickle our winning genome.
p = neat.Population(config)
p.add_reporter(neat.StdOutReporter(True))
stats = neat.StatisticsReporter()
p.add_reporter(stats)
p.add_reporter(neat.Checkpointer(1))

winner = p.run(eval_genomes)

with open('winner.pkl', 'wb') as output:
    pickle.dump(winner, output, 1)
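If you pause training, neat-python can later restore the population from one of the saved checkpoints; a minimal sketch (the checkpoint filename is just an example of the ones the Checkpointer writes):

p = neat.Checkpointer.restore_checkpoint('neat-checkpoint-4')
winner = p.run(eval_genomes)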
All that’s left is the matter of running the program and watching Sonic slowly learn how to beat the level!


To see all of the code put together check out the Training.py file in my GitHub repository.
Bonus: Parallel Training
If you have a multi-core CPU, you can run multiple training simulations at once, greatly increasing the rate at which you can train your AI! Although I will not go through the specifics in this article, I highly suggest you check out the sonicTraning.py implementation in my GitHub repository; a rough sketch of the idea follows.
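One common way to parallelize with neat-python is its built-in ParallelEvaluator, sketched below under the assumption of a hypothetical eval_genome function (a single-genome variant of eval_genomes that returns a fitness; see the repository for the actual implementation):

import multiprocessing
import neat

def eval_genome(genome, config):
    # Hypothetical: run one Sonic simulation for this genome (as in
    # eval_genomes above) and return its fitness; 0.0 is a placeholder.
    return 0.0

pe = neat.ParallelEvaluator(multiprocessing.cpu_count(), eval_genome)
winner = p.run(pe.evaluate)   # p is the neat.Population defined earlier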
Conclusion
That’s all there is to it! With a few adjustments, this framework is applicable to any game for the NES, SNES, SEGA Genesis, and more. If you have any questions or you just want to say hello, feel free to email me at vedantgupta523[at]gmail[dot]com ?
Also, be sure to check out Lucas Thompson's Sonic AI Bot Using Open-AI and NEAT YouTube tutorials and code to see what originally inspired this article.
Key Takeaways
- Neuroevolution of Augmenting Topologies (NEAT) is an algorithm used to train AI to perform certain tasks. It is modeled after genetic evolution.
- NEAT eliminates the need for pre-existing data when training AI.
- By combining OpenAI Gym Retro and NEAT in Python, you can train an AI to play any game.