VisualQuestionAnsweringInferencer

class mmpretrain.apis.VisualQuestionAnsweringInferencer(model, pretrained=True, device=None, device_map=None, offload_folder=None, **kwargs)

The inferencer for visual question answering.

Parameters:

model (BaseModel | str | Config) – A model name, a path to a config file, or a BaseModel object. Available model names are returned by VisualQuestionAnsweringInferencer.list_models() and are also listed on the Model Zoo statistics page.
pretrained (str | bool, optional) – Path to the checkpoint. If no path is given, the inferencer tries to find a pre-defined weight for the model you specified (this only works if the model is a model name). Defaults to True, as shown in the signature above.
device (str, optional) – Device to run inference on. If None, an available device is selected automatically. Defaults to None.
**kwargs – Other keyword arguments used to initialize the model (these only take effect if the model is a model name).

Example

>>> from mmpretrain import VisualQuestionAnsweringInferencer
>>> inferencer = VisualQuestionAnsweringInferencer('ofa-base_3rdparty-zeroshot_vqa')
>>> inferencer('demo/cat-dog.png', "What's the animal next to the dog?")[0]
{'question': "What's the animal next to the dog?", 'pred_answer': 'cat'}
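The constructor parameters described above can be combined as in the following minimal sketch. The helper names `build_kwargs` and `build_inferencer` are hypothetical and not part of mmpretrain, and any checkpoint path passed in is a placeholder. The sketch assumes that `pretrained` accepts a checkpoint path and that `device` accepts a device string, as the parameter list states. The mmpretrain import is deferred so the helpers can be loaded without the library installed.

```python
def build_kwargs(checkpoint=None, device=None):
    """Collect only the constructor arguments the caller actually set.

    Hypothetical helper: leaving a key out keeps the library default.
    """
    kwargs = {}
    if checkpoint is not None:
        kwargs['pretrained'] = checkpoint  # path to a local checkpoint
    if device is not None:
        kwargs['device'] = device  # e.g. 'cpu' or 'cuda'
    return kwargs


def build_inferencer(model, checkpoint=None, device=None):
    """Create the inferencer; mmpretrain is imported only when called."""
    from mmpretrain import VisualQuestionAnsweringInferencer
    return VisualQuestionAnsweringInferencer(
        model, **build_kwargs(checkpoint, device))


# Inspect the keyword arguments without loading any model.
default_kwargs = build_kwargs()
cpu_kwargs = build_kwargs(device='cpu')
```

Calling `build_inferencer('ofa-base_3rdparty-zeroshot_vqa', device='cpu')` would then construct the inferencer from the example above on the CPU, provided mmpretrain and the model weights are available.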

__call__(images, questions, return_datasamples=False, batch_size=1, objects=None, **kwargs)

Call the inferencer.

Parameters:

images (str | array | list) – The image path or array, or a list of images.
questions (str | list) – The question for the corresponding image, or a list of questions with one entry per image.
return_datasamples (bool) – Whether to return results as DataSample. Defaults to False.
batch_size (int) – Batch size. Defaults to 1.
objects (List[List[str]], optional) – Some algorithms, such as OFA fine-tuned VQA models, require an extra list of object descriptions for every image. Defaults to None.
resize (int, optional) – Resize the short edge of the image to the specified length before visualization. Defaults to None.
show (bool) – Whether to display the visualization result in a window. Defaults to False.
wait_time (float) – The display time (s). Defaults to 0, which means “forever”.
show_dir (str, optional) – If not None, save the visualization results in the specified directory. Defaults to None.

Returns:

The inference results.

Return type:

list
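As an illustration of calling the inferencer with several images at once, here is a minimal sketch that pairs each image with its question and collects the answers. It assumes each result is a dict with `question` and `pred_answer` keys, as in the example output earlier on this page. `ask_batch` is a hypothetical helper, and `fake_inferencer` is a stand-in callable used only so the sketch runs without downloading weights; a real `VisualQuestionAnsweringInferencer` instance would be passed in its place.

```python
def ask_batch(inferencer, images, questions, batch_size=1):
    """Run paired images and questions through an inferencer-like callable."""
    if len(images) != len(questions):
        raise ValueError('images and questions must have the same length')
    results = inferencer(images, questions, batch_size=batch_size)
    # Each result is assumed to be a dict with 'question' and 'pred_answer'.
    return [(r['question'], r['pred_answer']) for r in results]


def fake_inferencer(images, questions, batch_size=1, **kwargs):
    """Hypothetical stand-in that returns a fixed answer per question."""
    return [{'question': q, 'pred_answer': 'cat'} for q in questions]


pairs = ask_batch(
    fake_inferencer,
    ['demo/cat-dog.png', 'demo/cat-dog.png'],
    ["What's the animal next to the dog?", 'Which animal is on the left?'],
    batch_size=2,
)
for question, answer in pairs:
    print(f'{question} -> {answer}')
```

Checking the list lengths up front is a design choice of the sketch: it surfaces a mismatch between images and questions before any inference work starts.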

static list_models(pattern=None)

List all available model names.

Parameters:

pattern (str | None) – A wildcard pattern to match model names.

Returns:

A list of model names.

Return type:

List[str]
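To show how a wildcard pattern narrows a list of model names, the sketch below filters a small list with Python's `fnmatch` module. It assumes shell-style wildcard semantics, which is an assumption about `list_models` rather than a statement of how it is implemented. `filter_model_names` is a hypothetical helper, and every name in the list other than `ofa-base_3rdparty-zeroshot_vqa` is made up for illustration.

```python
import fnmatch


def filter_model_names(names, pattern=None):
    """Return sorted names, optionally filtered by a shell-style wildcard."""
    if pattern is None:
        return sorted(names)
    return sorted(fnmatch.filter(names, pattern))


# Only the first name comes from this page; the others are hypothetical.
names = [
    'ofa-base_3rdparty-zeroshot_vqa',
    'hypothetical-small_vqa',
    'hypothetical-large_caption',
]

vqa_names = filter_model_names(names, '*vqa')
ofa_names = filter_model_names(names, 'ofa*')
print(vqa_names)
print(ofa_names)
```

With mmpretrain installed, the corresponding call would be `VisualQuestionAnsweringInferencer.list_models('ofa*')`.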