hughdbrown (13)
#1
Instead of finding the length of a list, generating a random number within that length, and then indexing into the list to select a random element:


import random

class RandomDecisionPolicy(DecisionPolicy):
    def __init__(self, actions):
        self.actions = actions

    def select_action(self, current_state):
        # randint is inclusive at both ends, hence the "- 1"
        action = self.actions[random.randint(0, len(self.actions) - 1)]
        return action


just use random.choice:

import random

class RandomDecisionPolicy(DecisionPolicy):
    def __init__(self, actions):
        self.actions = actions

    def select_action(self, current_state):
        # choice returns one element of the sequence, picked uniformly at random
        return random.choice(self.actions)
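
For anyone who wants to try this outside the book's notebook, here is a minimal sketch. The DecisionPolicy base class below is a stand-in I made up so the snippet runs on its own (the book's real base class may differ), and the action names are placeholders.

```python
import random


class DecisionPolicy:
    """Hypothetical stand-in for the book's base class (an assumption)."""

    def select_action(self, current_state):
        raise NotImplementedError


class RandomDecisionPolicy(DecisionPolicy):
    def __init__(self, actions):
        self.actions = actions

    def select_action(self, current_state):
        # random.choice picks one element uniformly at random;
        # a purely random policy ignores current_state.
        return random.choice(self.actions)


# Placeholder action names, chosen only for illustration.
policy = RandomDecisionPolicy(["Buy", "Sell", "Hold"])
picks = [policy.select_action(current_state=None) for _ in range(5)]
print(picks)
```

Every pick is one of the listed actions; which ones you get varies from run to run.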


Among other reasons, it is clearer (in my view) and generates much shorter bytecode (5 instructions instead of 12):
>>> import dis
>>> import random
>>> def x(foo):
...     return foo[random.randint(0, len(foo) - 1)]
...
>>> def z(foo):
...     return random.choice(foo)
...
>>> dis.dis(x)
  2           0 LOAD_FAST                0 (foo)
              3 LOAD_GLOBAL              0 (random)
              6 LOAD_ATTR                1 (randint)
              9 LOAD_CONST               1 (0)
             12 LOAD_GLOBAL              2 (len)
             15 LOAD_FAST                0 (foo)
             18 CALL_FUNCTION            1
             21 LOAD_CONST               2 (1)
             24 BINARY_SUBTRACT
             25 CALL_FUNCTION            2
             28 BINARY_SUBSCR
             29 RETURN_VALUE

>>> dis.dis(z)
  2           0 LOAD_GLOBAL              0 (random)
              3 LOAD_ATTR                1 (choice)
              6 LOAD_FAST                0 (foo)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE
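
The listing above appears to come from an older interpreter, and the exact opcodes change between Python versions. If you want to check the comparison on your own interpreter, something like this sketch should work on Python 3.4 or later (dis.get_instructions was added in 3.4). The helper name instruction_count is my own, not part of the standard library.

```python
import dis
import random


def x(foo):
    return foo[random.randint(0, len(foo) - 1)]


def z(foo):
    return random.choice(foo)


def instruction_count(func):
    # Hypothetical helper: number of bytecode instructions in func.
    return sum(1 for _ in dis.get_instructions(func))


count_x = instruction_count(x)
count_z = instruction_count(z)
print("randint + index:", count_x)
print("random.choice:  ", count_z)
```

The absolute counts will differ from the listing above on newer interpreters, but the random.choice version should still come out shorter.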
Nishant Shukla (44)
#2
Great idea! I've updated the code in the book as well as on GitHub: https://github.com/BinRoot/TensorFlow-Book/blob/master/ch08_rl/Concept01_rl.ipynb