shiqi

Study GIS, apply to world

Cross product

In reinforcement learning, the "cross product" State×Action refers to the Cartesian product of the state space and the action space (not the vector cross product): the set of all possible state-action pairs.

Suppose the state space S contains the states s1, s2, ..., sn, and the action space A contains the actions a1, a2, ..., am. Then the Cartesian product S×A is the set of all pairs (si, aj), one for every combination of a state and an action, for a total of n × m elements.

For example, if we have a state space {s1, s2} and an action space {a1, a2, a3}, then their Cartesian product is {(s1, a1), (s1, a2), (s1, a3), (s2, a1), (s2, a2), (s2, a3)}, which has 2 × 3 = 6 elements.
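
The example above can be reproduced with a short Python sketch. The names `states`, `actions`, and `state_action_pairs` are illustrative placeholders, and `itertools.product` is just one convenient way to enumerate the pairs.

```python
from itertools import product

# Illustrative state and action labels (placeholders, not tied to any environment).
states = ["s1", "s2"]
actions = ["a1", "a2", "a3"]

# Cartesian product: every state paired with every action.
state_action_pairs = list(product(states, actions))

print(state_action_pairs)
# The number of pairs equals n × m.
print(len(state_action_pairs) == len(states) * len(actions))
```

The pairs come out grouped by state, in the same order as the set written above.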

In reinforcement learning, the State×Action product is commonly used to enumerate the candidate actions for each state, or to index a Q table, which stores one Q value for each state-action pair.
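
As a minimal sketch of the Q table idea, the table can be held in a dictionary keyed by (state, action) pairs. Everything here is an assumption made for illustration: the labels, the initial value of 0.0, the hand-written update, and the helper `greedy_action`. It is not a full Q-learning implementation.

```python
from itertools import product

# Illustrative state and action labels (placeholders).
states = ["s1", "s2"]
actions = ["a1", "a2", "a3"]

# One entry per state-action pair, initialised to 0.0
# (a common but arbitrary starting value).
q_table = {(s, a): 0.0 for s, a in product(states, actions)}

# Hypothetical value assigned by hand, only to show indexing by a pair.
q_table[("s1", "a2")] = 1.5

def greedy_action(state):
    """Return the action with the highest Q value in the given state.

    Ties are broken in favour of the action listed first in `actions`.
    """
    return max(actions, key=lambda a: q_table[(state, a)])

print(len(q_table))
print(greedy_action("s1"))
```

For large or continuous spaces a table over every pair becomes impractical, which is why this layout is usually limited to small, finite problems.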
