SAM
Abstract
We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billion masks on 11M licensed and privacy-respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks. We evaluate its capabilities on numerous tasks and find that its zero-shot performance is impressive – often competitive with or even superior to prior fully supervised results. We are releasing the Segment Anything Model (SAM) and corresponding dataset (SA-1B) of 1B masks and 11M images at https://segment-anything.com to foster research into foundation models for computer vision.

Usage
```python
import torch
from mmpretrain import get_model

model = get_model('vit-base-p16_sam-pre_3rdparty_sa1b-1024px', pretrained=True)
inputs = torch.rand(1, 3, 1024, 1024)

# Forward the whole model.
out = model(inputs)
print(type(out))

# Extract backbone features only.
feats = model.extract_feat(inputs)
print(type(feats))
```
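
The snippet above uses random input; the sketch below shows one way to feed a real image instead. Note that the file path `demo.jpg`, the plain resize, and the absence of mean/std normalization are illustrative assumptions rather than the model's official preprocessing; the authoritative pipeline is defined in the model's config file.

```python
import torch
from PIL import Image
from torchvision import transforms
from mmpretrain import get_model

# Minimal sketch: run the SAM ViT-base backbone on a real image.
# 'demo.jpg' is a placeholder path; the resize and missing normalization
# are illustrative assumptions, so check the config for the exact pipeline.
preprocess = transforms.Compose([
    transforms.Resize((1024, 1024)),  # SAM backbones expect 1024x1024 input
    transforms.ToTensor(),            # PIL image -> CHW float tensor in [0, 1]
])

model = get_model('vit-base-p16_sam-pre_3rdparty_sa1b-1024px', pretrained=True)
model.eval()

image = preprocess(Image.open('demo.jpg').convert('RGB')).unsqueeze(0)
with torch.no_grad():
    feats = model.extract_feat(image)

# extract_feat returns a tuple of feature tensors; print their shapes.
for feat in feats:
    print(feat.shape)
```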
Models and results
Pretrained models
| Model | Params (M) | Flops (G) | Config | Download |
| --- | --- | --- | --- | --- |
| `vit-base-p16_sam-pre_3rdparty_sa1b-1024px`\* | 89.67 | 486.00 | | |
| `vit-large-p16_sam-pre_3rdparty_sa1b-1024px`\* | 308.00 | 1494.00 | | |
| `vit-huge-p16_sam-pre_3rdparty_sa1b-1024px`\* | 637.00 | 2982.00 | | |

Models with \* are converted from the official repo. The config files of these models are only for inference. We haven't reproduced the training results.
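
Any checkpoint name from the table can be passed to `get_model` in the same way as the base variant above. The loop below is a small sketch that instantiates all three variants (using the names listed in the table) and prints their parameter counts as a rough sanity check against the Params column; set `pretrained=True` if you also want to download the weights.

```python
from mmpretrain import get_model

# Instantiate each SAM variant by its checkpoint name from the table above.
# pretrained=False builds the architecture without downloading weights.
for name in [
    'vit-base-p16_sam-pre_3rdparty_sa1b-1024px',
    'vit-large-p16_sam-pre_3rdparty_sa1b-1024px',
    'vit-huge-p16_sam-pre_3rdparty_sa1b-1024px',
]:
    model = get_model(name, pretrained=False)
    n_params = sum(p.numel() for p in model.parameters()) / 1e6
    print(f'{name}: {n_params:.2f}M parameters')
```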
Citation
```bibtex
@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}
```