ggml-org / llama.cpp Public

Commit History

Commits on Jun 23, 2025

kv-cache : utilize ggml_set_rows broadcast
ggerganov
committed
cont : support non-continuous slots
ggerganov
committed
cont : kv-cells cp/set for non-cont slots
ggerganov
committed
cont : migrate to using set of indices instead of slot head
ggerganov
committed
cont : gate the ggml_set_rows usage with env var
ggerganov
committed
kv-cache : use ggml_set_rows
ggerganov
committed
ggml : fix supports_op
rgerganov
authored and
ggerganov
committed
ggml : simplify forward_dup_f32
rgerganov
authored and
ggerganov
committed
metal : add ggml_set_rows implementation
ggerganov
committed
tests : add ggml_set_rows
ggerganov
committed
ggml : ggml_set_rows update comment + better index name
ggerganov
committed
ggml : support GGML_TYPE_F32 ".from_float" trait
ggerganov
committed
ggml : ggml_set_rows support quantized dst
ggerganov
committed
ggml : ggml_set_rows support broadcast
ggerganov
committed
ggml : add ggml_is_contiguous_rows
ggerganov
committed
ggml : add repeat impl for i64
ggerganov
committed
use I64 for indices
rgerganov
authored and
ggerganov
committed
ggml : add ggml_set_rows
rgerganov
authored and
ggerganov
committed
kv-cells : fix tracking of seq_pos (#14339)
ggerganov
authored
vulkan: update windows SDK in CI (#14334)
jeffbolznv
authored

Commits on Jun 22, 2025

quantize : handle user-defined pruning of whole layers (blocks) (#13037)
EAddario
authored
gguf-py : fix SpecialVocab parsing when post_processor is null (#14330)
CISC
authored
run : avoid double tokenization (#14327)
retr0reg
authored
examples : fix is_first logic for tokenization (#14329)
ggerganov
authored
HIP: enable vec fattn on RDNA4 (#14323)
IMbackK
authored
mtmd : fix Pixtral OOM with large images by capping image_size to 1024 (#14326)
yuiseki
authored
common : use std::string_view now that we target c++17 (#14319)
CISC
authored
CUDA: add mean operation (#14313)
am17an
authored

Commits on Jun 21, 2025

gguf-py : fix Qwen3-Embedding eos token (#14314)
CISC
authored
Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (#13792)
mtavenrath
authored
gguf-py : fix TemplateProcessing pair when bos/eos is missing (#14312)
CISC
authored
metal : fix thread-safety (#14300)
ggerganov
authored
memory : rename interface to llama_memory_context_i (#14296)
ggerganov
authored
convert : fix Llama 4 conversion (#14311)
danielhanchen
authored

Commits on Jun 20, 2025

sync : ggml
ggerganov
committed
