Implementation of PPO (Proximal Policy Optimization) using PyTorch.
The red line represents the goal (reward threshold) of the environment, as specified by OpenAI Gym.
Note that not all of these goals are reached, but the implementation achieves results similar to Figure 3 of the original paper,
and better than the results reported in Benchmarks for Spinning Up Implementations.
Note that no seed is specified, so you may get different results;
however, in my experience this code achieves similar results across different seeds,
so you should get a reasonable result after trying a few seeds (or even without specifying one).
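
For reference, the heart of PPO is the clipped surrogate objective. Below is a minimal PyTorch sketch of that loss; the function name, tensor layout, and the clip/value coefficients are illustrative assumptions, not necessarily how main.py computes it.

```python
# A minimal sketch of the PPO clipped surrogate loss (Schulman et al., 2017).
# Names and coefficients are assumptions for illustration, not main.py's exact code.
import torch

def ppo_loss(new_log_probs, old_log_probs, advantages, values, returns,
             clip_eps=0.2, vf_coef=0.5):
    # Probability ratio r_t(theta) = pi_theta(a|s) / pi_theta_old(a|s).
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Clipped surrogate objective: take the pessimistic (minimum) of the two terms.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    policy_loss = -torch.min(unclipped, clipped).mean()
    # Value-function loss: mean squared error against the empirical returns.
    value_loss = (values - returns).pow(2).mean()
    return policy_loss + vf_coef * value_loss
```
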
python main.py --env-name "Pendulum-v0" --learning-rate 0.0003 --learn-interval 1000 --batch-size 200 --total-steps 300000 --num-process 3
*Plots: reward and running reward | multiple running rewards*
python main.py --env-name "HalfCheetah-v3" --total-steps 5000000 --learn-interval 2000 --learning-rate 0.0007 --batch-size 2000
*Plots: reward and running reward | multiple running rewards*
python main.py --env-name "Swimmer-v3" --total-steps 1000000 --learn-interval 2000 --learning-rate 0.0005 --batch-size 1000 --std-decay
*Plots: reward and running reward | multiple running rewards*
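
The `--std-decay` flag used here (and for Hopper-v3 and Walker2d-v3 below) suggests the policy's exploration noise is annealed over training. One possible reading is sketched below, with an assumed linear schedule and assumed initial/final values; main.py may implement it differently.

```python
# A possible interpretation of --std-decay: linearly anneal the Gaussian policy's
# action standard deviation over training. Schedule and bounds are assumptions.
import torch
from torch.distributions import Normal

def decayed_std(step, total_steps, init_std=1.0, final_std=0.1):
    """Linearly interpolate the action std from init_std down to final_std."""
    frac = min(step / total_steps, 1.0)
    return init_std + frac * (final_std - init_std)

# Usage: sample an action with the current (decayed) exploration noise.
mean = torch.zeros(2)                                   # placeholder policy mean
std = decayed_std(step=500_000, total_steps=1_000_000)  # -> 0.55
action = Normal(mean, std).sample()
```
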
python main.py --env-name "Hopper-v3" --total-steps 5000000 --learn-interval 2000 --learning-rate 0.0005 --batch-size 1000 --std-decay
*Plots: reward and running reward | multiple running rewards*
python main.py --env-name "Walker2d-v3" --total-steps 5000000 --learn-interval 2000 --learning-rate 0.0005 --batch-size 1000 --std-decay
*Plots: reward and running reward | multiple running rewards*
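
For completeness, the flags used in the commands above could be wired up roughly as follows. This is only a sketch of a plausible argument parser; main.py's actual defaults and help strings may differ.

```python
# A rough sketch of an argument parser covering the flags used in the commands above.
# Defaults are copied from the Pendulum-v0 example; main.py's real defaults may differ.
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description="PPO training (PyTorch)")
    parser.add_argument("--env-name", type=str, default="Pendulum-v0")
    parser.add_argument("--learning-rate", type=float, default=3e-4)
    parser.add_argument("--learn-interval", type=int, default=1000,
                        help="environment steps collected between PPO updates (assumed meaning)")
    parser.add_argument("--batch-size", type=int, default=200)
    parser.add_argument("--total-steps", type=int, default=300_000)
    parser.add_argument("--num-process", type=int, default=1)
    parser.add_argument("--std-decay", action="store_true",
                        help="decay the policy's action std over training (assumed meaning)")
    return parser.parse_args()

if __name__ == "__main__":
    print(parse_args())
```
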
- discrete action