Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
Updated May 24, 2023 (Python)
Vector search for humans.
Open Source Routing Engine for OpenStreetMap
ModelScope: bring the notion of Model-as-a-Service to life.
An Open Toolkit for Knowledge Graph Extraction and Construction published at EMNLP2022 System Demonstrations.
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Chinese and English bilingual multimodal conversational language model
Recent Transformer-based CV and related works.
[pip install medmnist] 18 MNIST-like Datasets for 2D and 3D Biomedical Image Classification
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
FarmVibes.AI: Multi-Modal GeoSpatial ML Models for Agriculture and Sustainability
A collection of industry classics and cutting-edge papers in the field of recommendation/advertising/search.
Official implementation of the paper "You Only Need Adversarial Supervision for Semantic Image Synthesis" (ICLR 2021)
[ECCV 2020 Spotlight] A Simple and Versatile Framework for Image-to-Image Translation
Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)
Efficient Retrieval Augmentation and Generation Framework
This repository collects a variety of multi-modal transformer architectures, including image transformers, video transformers, image-language transformers, video-language transformers, and self-supervised learning models. It also collects useful tutorials and tools in these related domains.
Robust robotic localization and mapping, together with www.NavAbility.io. Reach out to info@navability.io for help.
Code for the paper "Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion" (IJCAI 2021)