This uses stable-baselines3 btw :3
Here's the Godot addon I used.
Going into invisible walls punishes them (-reward), as does falling down the gap.
Jumping over the gap, however, rewards twice: once halfway through the jump and once again on the other side.
Coins also reward.
The end goal also gives a HUGE reward.
Since the coins kind of lead towards the goal, they're encouraged to collect them on the way, and then go to the goal.
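Roughly, the reward setup above looks something like this. It's only a Python sketch for illustration: the real rewards live in the Godot scene, and every number and event name here is made up, since I'm not listing the exact values.

```python
# Hypothetical reward values and event names, for illustration only.
REWARDS = {
    "hit_invisible_wall": -1.0,  # punished
    "fell_in_gap": -1.0,         # punished
    "jump_halfway": 0.5,         # first jump reward, mid-air over the gap
    "jump_landed": 0.5,          # second jump reward, on the other side
    "coin": 1.0,                 # each coin collected
    "goal": 10.0,                # the HUGE reward at the end
}


def episode_return(events):
    """Sum the reward for a sequence of event names."""
    return sum(REWARDS[event] for event in events)


# A clean run: coin, jump the gap, another coin, then the goal.
clean_run = ["coin", "jump_halfway", "jump_landed", "coin", "goal"]
print(episode_return(clean_run))
```

The point is just that the goal dwarfs everything else, and the coins add up along the way.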
This works sometimes, and sometimes it doesn't, probably due to the short training time (15 min at 4x speed), but I'm satisfied with the result, so ye :)
I could probably make things better by punishing a little if they haven't collected all the coins, and also punishing a tiny bit for each jump so they don't just spam jump and only jump when they really NEED to. But it's too late for that now :3
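If I did add those two punishments, it might look something like this sketch. The penalty sizes are pure guesses ("a little" and "a tiny bit" aren't tuned numbers), and the function name is hypothetical.

```python
# Hypothetical penalty sizes; these were never actually tried or tuned.
MISSED_COIN_PENALTY = -0.2  # per coin left behind when the episode ends
JUMP_PENALTY = -0.01        # per jump, to discourage spam jumping


def extra_penalties(num_jumps, coins_collected, total_coins):
    """Total extra punishment for one episode."""
    missed = total_coins - coins_collected
    return num_jumps * JUMP_PENALTY + missed * MISSED_COIN_PENALTY


# A jump spammer that skipped coins vs. a careful run that got them all.
spammy = extra_penalties(num_jumps=50, coins_collected=3, total_coins=5)
careful = extra_penalties(num_jumps=2, coins_collected=5, total_coins=5)
print(spammy, careful)
```

The jump penalty has to stay much smaller than the jump rewards, otherwise they'd probably stop jumping over the gap at all.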
These little guys tricked me several times lol, mainly by abusing rewards.
For example, at one point they just kept falling into the hole rather than collecting coins, because it gave them more reward in a shorter amount of time: they collided with both the halfway-through-the-jump trigger and the punishment at the bottom, which netted them a slightly positive reward VERY fast.
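Here's that exploit as quick arithmetic, in a Python sketch. All the numbers are invented to match the description (halfway reward a bit bigger than the fall punishment, and the fall loop much faster than walking to a coin), not the real values.

```python
# Hypothetical numbers, chosen so the fall loop is slightly net positive.
JUMP_HALFWAY_REWARD = 1.0
FALL_PENALTY = -0.8
COIN_REWARD = 1.0


def reward_per_second(total_reward, seconds):
    """Average reward gained per second of play."""
    return total_reward / seconds


# Exploit: jump, touch the halfway trigger, fall in the hole. Very quick.
exploit = reward_per_second(JUMP_HALFWAY_REWARD + FALL_PENALTY, seconds=1.0)

# Intended: walk over to a coin and grab it. More reward, but much slower.
intended = reward_per_second(COIN_REWARD, seconds=8.0)

print(exploit, intended)
print("falling pays better:", exploit > intended)
```

With numbers like these, one way to close the loophole would be making the fall punishment bigger than the halfway reward, so the loop comes out negative.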
I'm still so mad that I can't save these trained models ughhhh. I don't even know why, and I'm too lazy to try and fix it or ask the developers of the addon for help :3