huggingface/text-generation-inference
Issues
#3342 · FlashAttention CUDA "no kernel image" crash on RTX 5060 Ti · Open · opened by pauli31 on Dec 9, 2025
#3335 · 'Qwen2Model' object has no attribute 'model' · Open · opened by Sunhill666 on Oct 10, 2025
#3333 · How to use prefix caching · Open · opened by Noha-Magdy on Sep 27, 2025
#3331 · Feature request: Apple MPS flash attention for GGUF · Open · opened by qdrddr on Sep 20, 2025
#3328 · please use transformers latest supper gpt-oss please · Open · opened by wang824892540 on Sep 12, 2025
#3321 · Gemma3: CUDA error: an illegal memory access was encountered. · Open · opened by Behnamhb on Sep 4, 2025
#3320 · ghcr.io/huggingface/text-generation-inference:3.3.5 doesn't exist · Open · opened by chuijh on Sep 3, 2025
#3318 · Infinite tool call loop: `HuggingFaceModel` and `text-generation-inference` · Open · opened by baughmann on Aug 31, 2025
#3317 · Endpoint failed to start due to ShardFailed on hugging face inference endpoint · Open · opened by relativity-codes on Aug 29, 2025
#3316 · It seems no one is maintaining this project. · Open · opened by mengxiaosen-patsnap on Aug 29, 2025
#3309 · Support for gpt-oss-120b and gpt-oss-20b model. · Open · opened by imran3180 on Aug 5, 2025
#3306 · ModuleNotFoundError: No module named 'punica_sgmv' · Open · opened by xxz7909 on Aug 1, 2025