The Wayback Machine - https://web.archive.org/web/20221122113751/https://github.com/topics/dataset
Here are 7,275 public repositories matching this topic...
A collective list of free APIs
Updated
Nov 22, 2022
Python
Faker is a Python package that generates fake data for you.
Updated
Nov 21, 2022
Python
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Updated
Nov 22, 2022
Python
An MNIST-like fashion product database, with benchmarks.
Updated
Jun 13, 2022
Python
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
Updated
Nov 22, 2022
Open Policy Agent
Large Scale Chinese Corpus for NLP
Open source annotation tool for machine learning practitioners.
Updated
Nov 22, 2022
Python
Curated list of Machine Learning, NLP, Vision, Recommender Systems Project Ideas
Deep learning with satellite & aerial imagery
Documentation on how to access and use the Quick, Draw! Dataset.
This repository contains compatibility data for Web technologies as displayed on MDN
Updated
Nov 22, 2022
JSON
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Updated
Nov 22, 2022
Python
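Chinese personal-name corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Usable for Chinese word segmentation and personal-name entity recognition.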
Data loaders and abstractions for text and NLP
Updated
Nov 21, 2022
Python
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Updated
Nov 21, 2022
Python
We are building an open database of COVID-19 cases with chest X-ray or CT images.
Updated
Aug 18, 2022
Jupyter Notebook
A curated list of awesome JSON datasets that don't require authentication.
Updated
Mar 28, 2022
JavaScript
Extract data from a wide range of Internet sources into a pandas DataFrame.
Updated
Oct 23, 2022
Python
A synthetic data generator for text recognition
Updated
Nov 8, 2022
Python
Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!
Updated
Apr 22, 2021
Python