# multimodal
Here are 154 public repositories matching this topic...
A curated list of Multimodal Related Research.
-
Updated
Jul 29, 2021 - Python
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
-
Updated
Nov 10, 2021
CVPR 2019: "Pluralistic Image Completion"
-
Updated
Aug 29, 2021 - Python
The data structure for unstructured data
deep-learning
data-structures
cross-modal
unstructured-data
multimodal
nested-data
neural-search
docarray
-
Updated
Feb 25, 2022 - Python
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
-
Updated
Feb 21, 2022 - Python
OpenAI's DALL-E for large-scale training in mesh-tensorflow.
transformers
artificial-intelligence
autoregressive
text-to-image
variational-autoencoder
multimodal
-
Updated
Feb 12, 2022 - Python
Platform for Situated Intelligence
streaming
framework
pipelines
artificial-intelligence
stream-processing
perception
component-library
human-robot-interaction
multimodal-interactions
multimodal
-
Updated
Feb 17, 2022 - C#
A CLI tool/python module for generating images from text using guided diffusion and CLIP from OpenAI.
deep-learning
artificial-intelligence
openai
image-generation
multimodality
text-to-image
diffusion
multimodal
text-to-image-synthesis
openai-clip
-
Updated
Feb 8, 2022 - Python
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
search
retrieval
ranking
clip
multimodality
multimodal-learning
multimodal
activitynet
retrieval-model
msvd
msrvtt
video-text-retrieval
lsmdc
didemo
video-clip-retrieval
-
Updated
Dec 2, 2021 - Python
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
image-captioning
visual-question-answering
multimodal
text-to-image-synthesis
pretraining
referring-expression-comprehension
vision-and-language-pre-training
-
Updated
Feb 25, 2022 - Python
Easily compute clip embeddings and build a clip retrieval system with them
-
Updated
Feb 26, 2022 - Jupyter Notebook
KDD Cup 2020 Challenges for Modern E-Commerce Platform: Multimodalities Recall first place
-
Updated
Jul 22, 2020 - Jupyter Notebook
FairyTailor: Multimodal Generative Framework for Storytelling
-
Updated
Apr 25, 2021 - Python
EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection (ECCV 2020)
-
Updated
Aug 25, 2020 - Python
CVPR 2021: "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE"
tensorflow
attention
generative-adversarial-networks
inpainting
multimodal
vq-vae
autoregressive-neural-networks
-
Updated
Jul 11, 2021 - Python
Flexible time series feature extraction & processing
python
processing
data-science
time-series
pandas
feature-extraction
multivariate
feature-engineering
multimodal
window-stride
-
Updated
Feb 17, 2022 - Python
Fusing Histology and Genomics via Deep Learning - IEEE TMI
genomics
fusion
transcriptomics
pathology
multimodal
histopathology
computational-pathogenomics
pathomic
multimodal-network
mahmoodlab
-
Updated
Nov 2, 2021 - Jupyter Notebook
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
deep-learning
cnn
pytorch
multi-modal
image-registration
affine-transformation
stn
image-to-image-translation
multimodal
deformable-transformation
multi-modal-learning
cvpr2020
registartion
multimodal-image-registration
-
Updated
Aug 2, 2020 - Python
-
Updated
Feb 9, 2022 - Python
-
Updated
Oct 6, 2020
Baseline for the 5th Baidu and Xi'an Jiaotong University Big Data Competition: Urban Region Function Classification
-
Updated
Dec 16, 2021 - Jupyter Notebook
Neural Machine Translation with universal Visual Representation (ICLR 2020)
-
Updated
Jul 1, 2020 - Python
Robust multimodal integration method implemented in PyTorch and TensorFlow
-
Updated
Mar 5, 2021 - Python
Tensorflow implementation of "Deep Multimodal Subspace Clustering Networks"
-
Updated
May 10, 2019 - Python
Graph Distillation for Action Detection
-
Updated
Jul 15, 2019 - Python
RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words (CVPR2021)
-
Updated
Jan 21, 2022 - Python
Multi-modal speech separation task data generation script on LRS3 data set.
-
Updated
Jul 29, 2020 - MATLAB
ADvISER is a flexible framework to encourage task-oriented dialog system research & development
machine-learning
framework
reinforcement-learning
toolkit
dialogue
dialogue-systems
task-oriented-dialogue
multimodal
-
Updated
Feb 11, 2022 - Python


File "/home/ubuntu/vqa/GMN/mmf/mmf/datasets/builders/visual_genome/dataset.py", line 44, in __init__
scene_graph_file = self._get_absolute_path(scene_graph_file)
AttributeError: 'VisualGenomeDataset' object has no attribute '_get_absolute_path'
Command that I ran in the shell:
CUDA_VISIBLE_DEVICES="0" mmf_run config=projects/gmn/configs/visual_genome/defaults.yaml model=gm