The Wayback Machine - https://web.archive.org/web/20251023015947/https://github.com/shibing624
Organizations

@NLPchina

shibing624/README.md

Hi there, I'm XuMing 👋


Pinned

  1. pycorrector Public

    pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box.

    Python, 6.2k stars, 1.2k forks

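  2. text2vec Public

    text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.

    Python, 4.9k stars, 420 forks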

  3. MedicalGPT Public

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

    Python, 4.2k stars, 611 forks

  4. agentica Public

    Agentica: Effortlessly Build Intelligent, Reflective, and Collaborative Multimodal AI Agents!

    Python, 214 stars, 25 forks

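  5. ChatPDF Public

    RAG for Local LLM, chat with PDF/doc/txt files, ChatPDF. A pure native RAG implementation built on a local LLM, an embedding model, and a reranker model. It supports GraphRAG and does not require installing any third-party agent library.

    Python, 810 stars, 139 forks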

  6. imgocr Public

    Python 3 package for Chinese/English OCR that uses the paddleocr-v5 ONNX model (~20 MB) for ultra-fast inference. Inference runs on the ppocr-v5-onnx model, which the author describes as open-source SOTA for Chinese and English OCR.

    Python, 106 stars, 14 forks