ggml-org/llama.cpp
Commits
Branch: gg/kv-cache-use-set-rows
Commit History
Commits on Jun 23, 2025
36f8e20  kv-cache : utilize ggml_set_rows broadcast (ggerganov committed)
332f073  cont : support non-continuous slots (ggerganov committed)
39d0b1e  cont : kv-cells cp/set for non-cont slots (ggerganov committed)
f875d6c  cont : migrate to using set of indices instead of slot head (ggerganov committed)
db2bb37  cont : gate the ggml_set_rows usage with env var (ggerganov committed)
79dac3c  kv-cache : use ggml_set_rows (ggerganov committed)
1f647b5  ggml : fix supports_op (rgerganov authored and ggerganov committed)
eba9757  ggml : simplify forward_dup_f32 (rgerganov authored and ggerganov committed)
c0cfc2f  metal : add ggml_set_rows implementation (ggerganov committed)
828e5d2  tests : add ggml_set_rows (ggerganov committed)
e73690a  ggml : ggml_set_rows update comment + better index name (ggerganov committed)
e897097  ggml : support GGML_TYPE_F32 ".from_float" trait (ggerganov committed)
630c84a  ggml : ggml_set_rows support quantized dst (ggerganov committed)
df71c80  ggml : ggml_set_rows support broadcast (ggerganov committed)
313a444  ggml : add ggml_is_contiguous_rows (ggerganov committed)
695b6b7  ggml : add repeat impl for i64 (ggerganov committed)
f2cd962  use I64 for indices (rgerganov authored and ggerganov committed)
c1a581a  ggml : add ggml_set_rows (rgerganov authored and ggerganov committed)
7b50d58  kv-cells : fix tracking of seq_pos (#14339) (ggerganov authored)
3a9457d  vulkan: update windows SDK in CI (#14334) (jeffbolznv authored)
Commits on Jun 22, 2025
fa4a9f2  quantize : handle user-defined pruning of whole layers (blocks) (#13037) (EAddario authored)
238005c  gguf-py : fix SpecialVocab parsing when post_processor is null (#14330) (CISC authored)
66aba7a  run : avoid double tokenization (#14327) (retr0reg authored)
f1f5e82  examples : fix is_first logic for tokenization (#14329) (ggerganov authored)
af3373f  HIP: enable vec fattn on RDNA4 (#14323) (IMbackK authored)
5d5c066  mtmd : fix Pixtral OOM with large images by capping image_size to 1024 (#14326) (yuiseki authored)
40bfa04  common : use std::string_view now that we target c++17 (#14319) (CISC authored)
aa064b2  CUDA: add mean operation (#14313) (am17an authored)
Commits on Jun 21, 2025
aa0ef5c  gguf-py : fix Qwen3-Embedding eos token (#14314) (CISC authored)
bb16041  Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (#13792) (mtavenrath authored)
58cba76  gguf-py : fix TemplateProcessing pair when bos/eos is missing (#14312) (CISC authored)
67ae531  metal : fix thread-safety (#14300) (ggerganov authored)
692e3cd  memory : rename interface to llama_memory_context_i (#14296) (ggerganov authored)
b23fa0b  convert : fix Llama 4 conversion (#14311) (danielhanchen authored)
Commits on Jun 20, 2025
06cbedf  sync : ggml (ggerganov committed)