A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environments.
          reinforcement-learning          nethack          mcts          flax          model-based-rl          jax          model-based-reinforcement-learning          muzero          minihack          rlax      
    - 
            Updated
            Sep 19, 2022 
- Python