PyTorch Dataloader for HDF5 data — Vict0rsch
vict0rs.ch › 2021/06/15 › pytorch-h5 · Jun 15, 2021 · The solution is to lazy-load the files: load them the first time they are needed and store them after the first call:

    import torch
    from torch.utils.data import Dataset
    import h5py

    class H5Dataset(Dataset):
        def __init__(self, h5_paths, limit=-1):
            self.limit = limit
            self.h5_paths = h5_paths
            self._archives = [h5py. ...
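The snippet above is cut off mid-line; below is a minimal sketch of the lazy-loading pattern it describes. The LazyH5Dataset name, the "data" key, and the assumption that every file stores a 2-D array with the same number of rows are illustrative placeholders, not taken from the post; only the idea of opening the h5py.File handles on first access and caching them mirrors the quoted text.

    import h5py
    import torch
    from torch.utils.data import Dataset


    class LazyH5Dataset(Dataset):
        """Opens its HDF5 files on first use instead of in __init__."""

        def __init__(self, h5_paths, limit=-1):
            self.h5_paths = h5_paths
            self.limit = limit
            self._archives = None  # file handles are not opened yet

        @property
        def archives(self):
            # lazy load: open the files the first time they are needed, then reuse them
            if self._archives is None:
                self._archives = [h5py.File(path, "r") for path in self.h5_paths]
            return self._archives

        def __getitem__(self, index):
            # placeholder layout: each file stores a 2-D array under the "data" key,
            # and every file holds the same number of rows
            per_file = len(self.archives[0]["data"])
            archive = self.archives[index // per_file]
            row = archive["data"][index % per_file]
            return torch.from_numpy(row)

        def __len__(self):
            total = sum(len(a["data"]) for a in self.archives)
            return total if self.limit < 0 else min(total, self.limit)

Deferring the h5py.File calls also means that, when a DataLoader uses num_workers > 0, each worker process ends up opening its own handles rather than sharing HDF5 file objects created in the parent process.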
Any tricks to lazily load dataset when creating Iterator ...
github.com › pytorch › text · Nov 14, 2017 · The original trick for a lazy iterator was to make the Dataset a Python generator (implemented however you want) and make sure to use an Iterator without (global) shuffling or sorting (for instance, BucketIterator with sort=False and shuffle=False). Then what will happen is that the BucketIterator or user equivalent will prefetch some number ...
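The comment above refers to torchtext's legacy Dataset/BucketIterator API. As a rough modern stand-in for the same idea, a stream of examples read lazily, in order, with no global shuffle or sort, a torch.utils.data.IterableDataset can be used; the sketch below is that substitute, not the torchtext code from the thread, and the corpus path and whitespace tokenization are placeholders.

    from torch.utils.data import IterableDataset, DataLoader


    class LazyTextDataset(IterableDataset):
        """Yields one tokenized example at a time; nothing is loaded up front."""

        def __init__(self, path):
            self.path = path

        def __iter__(self):
            with open(self.path, encoding="utf-8") as f:
                for line in f:
                    yield line.strip().split()  # placeholder tokenization


    # batch_size=None disables automatic batching, so examples stream in file order
    loader = DataLoader(LazyTextDataset("corpus.txt"), batch_size=None)
    for tokens in loader:
        print(tokens[:5])

With workers enabled, the DataLoader prefetches upcoming examples in the background, which is the behaviour the quoted comment relies on.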