This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Updated Feb 21, 2023 · Python
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
CVNets: A library for training computer vision networks
Object detection with multi-level representations generated from deep high-resolution representation learning (HRNetV2h). This is an official implementation for our TPAMI paper "Deep High-Resolution Representation Learning for Visual Recognition". https://arxiv.org/abs/1908.07919
This is an official implementation for "Contextual Transformer Networks for Visual Recognition".
This repository contains the source code of our work on designing efficient CNNs for computer vision
VarifocalNet: An IoU-aware Dense Object Detector
SWA Object Detection
Video Platform for Action Recognition and Object Detection in PyTorch
The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias" and [IJCV'22] "ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond"
Boundary-preserving Mask R-CNN (ECCV 2020)
High-resolution Networks for the Fully Convolutional One-Stage Object Detection (FCOS) algorithm
Semantic Propositional Image Caption Evaluation
Official ImageNet Model repository
A TensorFlow implementation of MobileNetV3-CenterNet that can be easily deployed on Android (MNN) and iOS (CoreML).
Generate captions for images using a CNN-RNN model that is trained on the Microsoft Common Objects in COntext (MS COCO) dataset
Adds the SPICE metric to the coco-caption evaluation server code
Visually informed embedding of word (VIEW) is a tool for transferring multimodal background knowledge to NLP algorithms.