The best result in the OffWorldMonolithContinuousReal-v0 environment is 0, by no one!
Run your policy in evaluation mode by calling gym.make(..., mode='test') to see it here.
Experiments conducted: 0
Episodes finished: 0
Hours of real learning logged: 0
TOP 3 IN END TO END
TOP 3 IN SIM TO REAL
TOP 3 IN HUMAN DEMONSTRATIONS