Adding New Dataset

You can write a new dataset class that inherits from BaseDataset and overrides load_data_list(self), as CIFAR10 and ImageNet do. Typically, this method returns a list in which each sample is a dict containing the necessary data information, e.g., img_path and gt_label.

Suppose we want to implement a Filelist dataset that reads an annotation file list for both training and testing. Each line of the annotation list contains an image filename and its integer class label, separated by a space:

000001.jpg 0
000002.jpg 1
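
As a rough illustration of this format, the standalone sketch below parses such lines into the list of dicts that load_data_list is expected to return. The function name parse_filelist and the img_prefix argument are hypothetical names chosen for this example, not part of mmpretrain, and the directory data/train is a placeholder.

```python
import os.path as osp


def parse_filelist(lines, img_prefix=''):
    """Turn 'filename label' lines into a list of sample dicts."""
    data_list = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last space so the label is always the final field.
        filename, gt_label = line.rsplit(' ', 1)
        data_list.append({
            'img_path': osp.join(img_prefix, filename),
            'gt_label': int(gt_label),
        })
    return data_list


# 'data/train' is a placeholder image directory.
samples = parse_filelist(['000001.jpg 0', '000002.jpg 1'],
                         img_prefix='data/train')
print(samples)
```

Each resulting dict carries the image path and an integer label, which is the same shape of sample that the dataset class in the next step builds.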

1. Create Dataset Class

We can create a new dataset in mmpretrain/datasets/filelist.py to load the data.

import os.path as osp

from mmpretrain.registry import DATASETS
from .base_dataset import BaseDataset


@DATASETS.register_module()
class Filelist(BaseDataset):

    def load_data_list(self):
        assert isinstance(self.ann_file, str), 'ann_file must be a str'

        data_list = []
        with open(self.ann_file) as f:
            # Each non-empty line has the form "<filename> <label>".
            samples = [x.strip().split(' ') for x in f if x.strip()]
        for filename, gt_label in samples:
            # Join the image prefix and the filename into a full path.
            img_path = osp.join(self.img_prefix, filename)
            info = {'img_path': img_path, 'gt_label': int(gt_label)}
            data_list.append(info)
        return data_list

2. Add to the package

Then import the new dataset class in mmpretrain/datasets/__init__.py and add it to __all__:

from .base_dataset import BaseDataset
...
from .filelist import Filelist

__all__ = [
    'BaseDataset', ..., 'Filelist'
]
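
Once the class is registered, it can typically be selected by name from a config. The fragment below is only a sketch: the paths are placeholders, and it assumes the dataset accepts the data_root, ann_file, data_prefix and pipeline arguments that BaseDataset is understood to take.

```python
# Hypothetical config fragment; adjust paths to your own data layout.
train_dataloader = dict(
    batch_size=32,
    dataset=dict(
        type='Filelist',              # name registered with DATASETS
        data_root='data/my_dataset',  # placeholder dataset root
        ann_file='meta/train.txt',    # placeholder annotation list
        data_prefix='train',          # placeholder image sub-directory
        pipeline=[],                  # fill in your transform pipeline
    ),
)
```

The type string must match the class name passed to the registry, which here is Filelist.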