Top Computer Vision Models

Explore state-of-the-art computer vision model architectures, immediately usable for training with your custom dataset.

Deploy select models (e.g. YOLOv8, CLIP) using the Roboflow Hosted API, or on your own hardware using Roboflow Inference.

Instance Segmentation
Segment Anything (SAM) is an image segmentation model developed by Meta Research, capable of doing zero-shot segmentation. Learn more »
Keypoint Detection
The YOLOv8 pose estimation model allows you to detect keypoints in an image. Learn more »
Object Detection
Top FPS: 74
Architecture: YOLO
YOLO-World is a zero-shot object detection model. Learn more »
Object Detection
Architecture: YOLO, CNN
YOLOv8 is a state-of-the-art object detection and image segmentation model created by Ultralytics, the developers of YOLOv5. Learn more »
Instance Segmentation
The state-of-the-art YOLOv8 model comes with support for instance segmentation tasks. Learn more »
Object Detection
Grounding DINO is a zero-shot object detection model made by combining a Transformer-based DINO detector and grounded pre-training. Learn more »
Object Detection
Architecture: YOLO
YOLOv9 is an object detection model architecture released on February 21st, 2024. Learn more »
Object Detection
Model Size: 14.1 MB
Parameters: 7.2 million
Top FPS: 140
Architecture: CNN, YOLO
A very fast and easy to use PyTorch model that achieves state of the art (or near state of the art) results. Learn more »
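As a rough sanity check on the figures listed for this model, the size and parameter count can be related by simple arithmetic. The sketch below assumes that "MB" means 10^6 bytes and that the file consists mostly of weights; the function name is hypothetical. The result is roughly two bytes per parameter, which would be consistent with 16-bit weights, though the listing does not state the storage format.

```python
def bytes_per_parameter(size_mb, num_params, bytes_per_mb=1_000_000):
    """Average storage per parameter implied by a file size and a parameter count."""
    if num_params <= 0:
        raise ValueError("num_params must be positive")
    return size_mb * bytes_per_mb / num_params

# Figures from the listing: 14.1 MB and 7.2 million parameters.
print(round(bytes_per_parameter(14.1, 7_200_000), 2))  # 1.96
```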
Object Detection
Parameters: 7.5 million
Architecture: CNN, YOLO
YOLOv5-OBB is a variant of YOLOv5 that supports oriented bounding boxes. This model is designed to yield predictions that better fit objects that are positioned at an angle. Learn more »
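To make the idea of an oriented bounding box concrete, the following minimal sketch converts a box given as center, size, and rotation angle into its four corner points. The `(cx, cy, w, h, angle)` parameterization and the function name `obb_corners` are assumptions for illustration; they are not taken from the YOLOv5-OBB code, which may use a different convention.

```python
import math

def obb_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a w-by-h box rotated by angle_deg around (cx, cy)."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, relative to its center.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset, then translate by the center.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]

corners = obb_corners(cx=100.0, cy=50.0, w=40.0, h=20.0, angle_deg=90.0)
print([(round(x, 1), round(y, 1)) for x, y in corners])
```

With an angle of zero the function returns an ordinary axis-aligned box, so the angle is the only extra quantity an oriented detector has to predict under this parameterization.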
Instance Segmentation
Architecture: CNN, YOLO
YOLOv5 Instance Segmentation is a version of YOLOv5 that can be used for instance segmentation tasks. Learn more »
Classification
YOLOv5 Classification is a version of the YOLOv5 model used in single-label and multi-label image classification. Learn more »
Multimodal Model
CogVLM shows strong performance in Visual Question Answering (VQA) and other vision tasks. Learn more »
Object Detection
Detectron2 is a model zoo of its own for computer vision models written in PyTorch. Learn more »
Instance Segmentation
Mask RCNN is a convolutional neural network for instance segmentation. Learn more »
Object Detection
YOLOv4 has emerged as the best real time object detection model. YOLOv4 carries forward many of the research contributions of the YOLO family of models along with new modeling and data augmentation techniques. This implementation is in Darknet. Learn more »
Classification
CLIP (Contrastive Language-Image Pre-Training) is a multimodal zero-shot image classifier that achieves impressive results in a wide range of domains with no fine-tuning. It applies the recent advancements in large-scale transformers like GPT-3 to the vision arena. Learn more »
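Zero-shot classification with a CLIP-style model is commonly done by embedding the image and one text prompt per candidate label, then comparing them by cosine similarity. The sketch below illustrates only that comparison step. The embeddings are hand-written toy vectors rather than real CLIP outputs, the function names are hypothetical, and the scale value is illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def zero_shot_classify(image_emb, text_embs, scale=100.0):
    """Softmax over scaled cosine similarities between one image and each label prompt."""
    logits = {label: scale * cosine(image_emb, emb) for label, emb in text_embs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - peak) for label, v in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Toy embeddings (hypothetical values standing in for encoder outputs).
image_emb = [0.9, 0.1, 0.2]
text_embs = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a cat": [0.1, 0.9, 0.3],
}
probs = zero_shot_classify(image_emb, text_embs)
print(max(probs, key=probs.get))  # a photo of a dog
```

Because the labels enter only as text prompts, changing the set of classes means changing the prompt list, with no retraining.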
Object Detection
Architecture: Transformers
Detection Transformer (DETR) is an end-to-end object detection model implemented using the Transformer architecture. Learn more »
Instance Segmentation
YOLOv7 Instance Segmentation lets you perform segmentation tasks with the YOLOv7 model. Learn more »
Object Detection
One of the most accurate object detection algorithms but requires a lot of power at inference time. A good choice if you can do processing asynchronously on a server. Learn more »
Object Detection
Model Size: 68.7 MB
Parameters: 9 million
Architecture: CNN, YOLO
YOLOX is a high-performance object detection model. Learn more »
Object Detection
Model Size: 75.6 MB
Top FPS: 161
Architecture: YOLO, CNN
YOLOv7 is a state of the art object detection model. Learn more »
Classification
Architecture: CNN
EfficientNet is from a family of image classification models from GoogleAI that train comparatively quickly on small amounts of data, making the most of limited datasets. Learn more »
Object Detection
Though it is no longer the most accurate object detection algorithm, YOLO v3 is still a very good choice when you need real-time detection while maintaining excellent accuracy. Keras implementation. Learn more »
Object Detection
Architecture: YOLO
Though it is no longer the most accurate object detection algorithm, YOLO v3 is still a very good choice when you need real-time detection while maintaining excellent accuracy. PyTorch version. Learn more »
Classification
The Vision Transformer applies the Transformer encoder architecture used in natural language processing models such as BERT to images, treating image patches as tokens. Learn more »
Instance Segmentation
A simple, fully convolutional model for real-time instance segmentation. Learn more »
Object Detection
Top FPS: 520
Architecture: CNN, YOLO
MT-YOLOv6 is a YOLO based model released in 2022. Learn more »
Object Detection
Architecture: YOLO
YOLOv4 has emerged as the best real time object detection model. YOLOv4 carries forward many of the research contributions of the YOLO family of models along with new modeling and data augmentation techniques. This implementation is in PyTorch. Learn more »
Keypoint Detection
YOLO-NAS Pose is a keypoint detection model developed by Deci AI. Learn more »
Instance Segmentation
FastSAM is an image segmentation model trained using 2% of the data in the Segment Anything Model SA-1B dataset. Learn more »
Object Detection
Architecture: YOLO
YOLO-NAS is an object detection model developed by Deci that reports state-of-the-art performance relative to YOLOv5, YOLOv7, and YOLOv8. Learn more »
Object Detection
Scaled YOLOv4 is an extension of the YOLOv4 research implemented in the YOLOv5 PyTorch framework. Learn more »
Object Detection
This architecture provides good realtime results on limited compute. It's designed to run in realtime (30 frames per second) even on mobile devices. Learn more »
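A real-time target of 30 frames per second implies a fixed time budget for each frame. The following small sketch converts a frame rate into that budget and checks whether a measured per-frame latency fits within it. The function names and the example latencies are hypothetical.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps

def meets_realtime(latency_ms, target_fps=30.0):
    """True if a per-frame latency fits within the budget of the target frame rate."""
    return latency_ms <= frame_budget_ms(target_fps)

print(round(frame_budget_ms(30), 1))    # 33.3
print(meets_realtime(latency_ms=25.0))  # True
print(meets_realtime(latency_ms=40.0))  # False
```

The same conversion applies to the "Top FPS" figures in this listing: a model quoted at 140 FPS spends roughly 7 ms per frame on the hardware it was measured on.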
Multimodal Model
Qwen-VL is an LMM developed by Alibaba Cloud. Qwen-VL accepts images, text, and bounding boxes as inputs. The model can output text and bounding boxes. Qwen-VL naturally supports English, Chinese, and multilingual conversation. Learn more »
Object Detection
Model Size: 202.0 MB
Parameters: 12,786,711 (S2D)
Top FPS: 106
Architecture: CNN, YOLO
YOLOR (You Only Learn One Representation) is an object detection model that uses both implicit and explicit knowledge to make predictions. Learn more »
Instance Segmentation
Detic is an open source segmentation model developed by Meta Research and released in 2022. Learn more »
Object Detection
Parameters: 77 million
Top FPS: 8
A scalable, state of the art object detection model, implemented here within the TensorFlow 2 Object Detection API. Learn more »
Object Detection
Parameters: 3.9 million
Top FPS: 97
EfficientDet achieves the best performance in the fewest training epochs among object detection model architectures, making it a highly scalable architecture especially when operating with limited compute. Learn more »
Semantic Segmentation
Top FPS: 51
Architecture: Transformers
SegFormer is a computer vision framework used in semantic segmentation tasks, implemented with transformers. Learn more »
Classification
An image classification model built using YOLOv8. Learn more »
Classification
Architecture: CLIP
MetaCLIP is a zero-shot classification and embedding model developed by Meta AI. Learn more »
Object Detection
Architecture: Transformer, YOLO
YOLOS looks at patches of an image to form "patch tokens", which are used in place of the traditional wordpiece tokens in NLP. Learn more »
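The patch-token idea can be sketched without any model at all: split the image into non-overlapping square patches and flatten each one into a vector. The example below does this in plain Python for a tiny single-channel image. The function name is hypothetical, and in an actual transformer each flattened patch would additionally be passed through a learned linear projection, which is omitted here.

```python
def image_to_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch row by row into one vector.
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches

# A 4x4 single-channel "image" with pixel values 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = image_to_patches(image, patch_size=2)
print(len(patches))  # 4
print(patches[0])    # [0, 1, 4, 5]
```

The resulting list plays the role that a sequence of wordpiece tokens plays in a text model: one vector per position, in a fixed scan order.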
Instance Segmentation
Parameters: 219 million
Architecture: Transformers
OneFormer is a state-of-the-art multi-task image segmentation framework that is implemented using transformers. Learn more »
Classification
Parameters: 460,000
A fast, simple convolutional neural network that gets the job done for many tasks, including classification. Learn more »
Classification
MobileNet is a GoogleAI model well-suited for on-device, real-time classification (distinct from MobileNetSSD, Single Shot Detector). This implementation leverages transfer learning from ImageNet to your dataset. Learn more »
Classification
A fast, simple convolutional neural network that gets the job done for many tasks, including classification. Learn more »
Object Detection
Top FPS: 34
Architecture: ResNet-D, YOLO
The tiny and fast version of YOLOv4: good for training and deployment on limited compute resources, and for getting a feel for your dataset. Learn more »
Object Detection
ByteTrack is a multi-object tracking computer vision model. Learn more »
Object Detection
RTMDet is an efficient real-time object detector, with self-reported metrics outperforming the YOLO series. It achieves 52.8% AP on COCO with 300+ FPS on an NVIDIA 3090 GPU, making it one of the fastest and most accurate object detectors available as of writing this post. Learn more »
Object Detection
DINOv2 is a self-supervised method for training computer vision models developed by Meta Research and released in April 2023. Learn more »
Object Detection
DocTR is an Optical Character Recognition tool powered by deep learning. Learn more »
Object Detection
L2CS-Net is a gaze estimation model that enables you to calculate where someone is looking and in what direction someone is looking. Learn more »
Object Detection
Kosmos-2 is a multimodal language model capable of object detection and grounding text in images. Learn more »
Object Detection
LLaVA is an open source multimodal language model that you can use for visual question answering; it also has limited support for object detection. Learn more »
Object Detection
OWLv2 is a transformer-based object detection model developed by Google Research. OWLv2 is the successor to OWL ViT. Learn more »
Object Detection
OWL-ViT is a transformer-based object detection model developed by Google Research. Learn more »
Classification
BLIPv2 is a multimodal model developed by Salesforce Research. Learn more »
Object Detection
Grounding DINO is a state-of-the-art zero-shot object detection model, developed by IDEA Research. Learn more »
Object Detection
GPT-4 with Vision is a multimodal language model developed by OpenAI. Learn more »
Object Detection
CoDet is an open vocabulary zero-shot object detection model. Learn more »
Object Detection
VLPart, developed by Meta Research, is an object detection and segmentation model that works with an open vocabulary. Learn more »
Multimodal Model
BakLLaVA is an LMM developed by LAION, Ontocord, and Skunkworks AI. BakLLaVA uses a Mistral 7B base augmented with the LLaVA 1.5 architecture. Learn more »
Classification
SigLIP is an image embedding model defined in the "Sigmoid Loss for Language Image Pre-Training" paper. Learn more »
Classification
MobileCLIP is an image embedding model developed by Apple and introduced in the "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" paper. Learn more »
