The Wayback Machine - https://web.archive.org/web/20230816144745/https://github.com/topics/corpus
Here are 794 public repositories matching this topic...
Large Scale Chinese Corpus for NLP
A collection of small corpuses of interesting data for the creation of bots and similar stuff.
Updated
Jul 24, 2023
JavaScript
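Chinese personal name corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, English names. Useful for Chinese word segmentation and person-name entity recognition.
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard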
Updated
Jun 2, 2023
Python
Updated
Nov 21, 2022
Python
Deep Learning and deep reinforcement learning research papers and some codes
Final Weibo Crawler Scrap Anything From Weibo, comments, weibo contents, followers, anything. The Terminator
Updated
Oct 25, 2019
Python
Awesome Chatbot Projects, Corpus, Papers, Tutorials. Chinese Chatbot =>:
Updated
Jul 21, 2022
Python
Corpora for training Chinese and English dialogue systems. Datasets for Training Chatbot System
Updated
Sep 23, 2020
Python
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Updated
Aug 12, 2023
Python
A multilingual dialog corpus
Updated
Jan 1, 2023
Python
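Company name corpus. Organization name corpus. Company short names, abbreviations, brand terms, enterprise names. Useful for Chinese word segmentation and organization-name entity recognition.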
Updated
Jul 6, 2023
Python
Chatbot in 200 lines of code using TensorLayer
Updated
Oct 5, 2021
Python
Collections of Chinese NLP corpus
Updated
Dec 28, 2020
Python
An R package for the Quantitative Analysis of Textual Data
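Large-scale Pre-training Corpus for Chinese, 100G
A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models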
Updated
Jul 8, 2020
Python
Updated
Jul 14, 2023
Python
Updated
Mar 7, 2023
Python