Shortcuts

SAM

摘要

We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billionmasks on 11M licensed and privacy respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks. We evaluate its capabilities on numerous tasks and find that its zero-shot performance is impressive – often competitive with or even superior to prior fully supervised results. We are releasing the Segment Anything Model (SAM) and corresponding dataset (SA-1B) of 1B masks and 11M images at https://segment-anything.com to foster research into foundation models for computer vision.

使用方式

import torch
from mmpretrain import get_model

model = get_model('vit-base-p16_sam-pre_3rdparty_sa1b-1024px', pretrained=True)
inputs = torch.rand(1, 3, 1024, 1024)
out = model(inputs)
print(type(out))
# To extract features.
feats = model.extract_feat(inputs)
print(type(feats))

Models and results

Pretrained models

模型

Params (M)

Flops (G)

配置文件

下载

vit-base-p16_sam-pre_3rdparty_sa1b-1024px*

89.67

486.00

config

model

vit-large-p16_sam-pre_3rdparty_sa1b-1024px*

308.00

1494.00

config

model

vit-huge-p16_sam-pre_3rdparty_sa1b-1024px*

637.00

2982.00

config

model

Models with * are converted from the official repo. The config files of these models are only for inference. We haven’t reproduce the training results.

引用

@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}
Read the Docs v: latest
Versions
latest
stable
mmcls-1.x
mmcls-0.x
dev
Downloads
epub
On Read the Docs
Project Home
Builds

Free document hosting provided by Read the Docs.