This section defines dataset classes that wrap buffers of NumPy arrays as PyTorch Datasets.
PolicyGradientRLDataset(data) :: Dataset
A dataset for policy gradient RL algorithms.
It returns a tuple of (state, action, advantage, reward, action_logp) at each index.
Args:
- data (NumPy array): Batch of interaction data to train on.
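The indexing behavior described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name `SketchPolicyGradientDataset` is hypothetical, and the layout of `data` (a sequence of five equal-length arrays ordered as states, actions, advantages, rewards, action log-probabilities) is an assumption, since the source does not specify it.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class SketchPolicyGradientDataset(Dataset):
    """Hypothetical stand-in for PolicyGradientRLDataset.

    Assumes `data` is a sequence of five equal-length arrays in the order
    (states, actions, advantages, rewards, action_logps).
    """

    def __init__(self, data):
        # Convert each NumPy field to a float32 tensor once, up front.
        self.data = [torch.as_tensor(np.asarray(x), dtype=torch.float32) for x in data]

    def __len__(self):
        # Every field has one entry per interaction step.
        return len(self.data[0])

    def __getitem__(self, idx):
        # One (state, action, advantage, reward, action_logp) tuple per index.
        return tuple(field[idx] for field in self.data)


# Made-up buffer contents, for illustration only.
n, obs_dim = 8, 4
buffer = [
    np.random.randn(n, obs_dim),      # states
    np.random.randint(0, 2, size=n),  # actions
    np.random.randn(n),               # advantages
    np.random.randn(n),               # rewards
    np.random.randn(n),               # action log-probabilities
]
dataset = SketchPolicyGradientDataset(buffer)
loader = DataLoader(dataset, batch_size=4, shuffle=True)
state, action, advantage, reward, action_logp = next(iter(loader))
```

Wrapping the buffer this way lets a standard `DataLoader` handle shuffling and minibatching, so each batch unpacks into the same five fields as a single index.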
QPolicyGradientRLDataset(data) :: Dataset
A dataset for Q policy gradient algorithms.
It returns a tuple of (state, next_state, action, reward, done) at each index.
Args:
- data (NumPy array): Data to train on.
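A transition-style dataset of this kind might look like the sketch below. The class name `SketchQDataset` is hypothetical, and it assumes `data` unpacks into five equal-length arrays (states, next states, actions, rewards, done flags) with discrete integer actions; the actual storage layout is not given in the source.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class SketchQDataset(Dataset):
    """Hypothetical stand-in for QPolicyGradientRLDataset.

    Assumes `data` unpacks into (states, next_states, actions, rewards, dones).
    """

    def __init__(self, data):
        states, next_states, actions, rewards, dones = data
        self.states = torch.as_tensor(states, dtype=torch.float32)
        self.next_states = torch.as_tensor(next_states, dtype=torch.float32)
        # Assumption: actions are discrete indices.
        self.actions = torch.as_tensor(actions, dtype=torch.int64)
        self.rewards = torch.as_tensor(rewards, dtype=torch.float32)
        self.dones = torch.as_tensor(dones, dtype=torch.float32)

    def __len__(self):
        return self.states.shape[0]

    def __getitem__(self, idx):
        # One (state, next_state, action, reward, done) transition per index.
        return (
            self.states[idx],
            self.next_states[idx],
            self.actions[idx],
            self.rewards[idx],
            self.dones[idx],
        )


# Made-up trajectory: n transitions built from n + 1 consecutive states.
rng = np.random.default_rng(0)
n, obs_dim = 6, 3
trajectory = rng.normal(size=(n + 1, obs_dim))
transitions = (
    trajectory[:-1],               # states
    trajectory[1:],                # next states
    rng.integers(0, 2, size=n),    # actions
    rng.normal(size=n),            # rewards
    np.array([0, 0, 0, 0, 0, 1]),  # done flag set on the final step only
)
q_dataset = SketchQDataset(transitions)
q_loader = DataLoader(q_dataset, batch_size=3, shuffle=False)
batch_sizes = [s.shape[0] for s, s_next, a, r, d in q_loader]
```

Storing the next state and done flag alongside each transition keeps every sample self-contained, so minibatches can be drawn in any order without reference to neighboring indices.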