🦙
I like big .vimrc and I cannot lie
- Sofia, Bulgaria (UTC +02:00)
- https://ggerganov.com
- @ggerganov
Pinned
- wave-share: Serverless, peer-to-peer, local file sharing through sound
3,728 contributions in the last year
Contribution activity
February 2024
Created 141 commits in 4 repositories
Created 1 repository
- ggerganov/bert.cpp (C++) on Feb 3

Created a pull request in ggerganov/llama.cpp that received 26 comments
- ggml : add ALiBi support for ggml_soft_max_ext (+344 −353 lines changed, 26 comments)
  ref: #3470 CPU Metal CUDA Vulkan (cc @0cc4m) SYCL (cc @abhilash1910) Kompute (cc @cebtenzzre) Missing backends currently generate compile w…
Opened 32 other pull requests in 5 repositories

ggerganov/llama.cpp (2 open, 20 merged, 1 closed)
- llama : switch to floating-point token positions (Feb 23)
- py : minor fixes (Feb 22)
- ggml : always define ggml_fp16_t as uint16_t (Feb 22)
- gemma : use more bits for the token_embd.weight tensor (Feb 21)
- py : add Gemma conversion from HF models (Feb 21)
- sync : ggml (Feb 21)
- sync : ggml (Feb 19)
- cmake : pass -Werror through -Xcompiler (Feb 19)
- ci : fix wikitext url + compile warnings (Feb 18)
- llama : rename n_ctx to kv_size (Feb 18)
- nix: update flake.lock (Feb 18)
- cmake : try to fix Android build (Feb 16)
- cmake : fix VULKAN and ROCm builds (Feb 16)
- scripts : add helpers script for bench comparing commits (Feb 16)
- scripts : add hf.sh helper script (Feb 15)
- bert : add tests + fix quantization (Feb 13)
- tests : multi-thread the tokenizer tests (Feb 13)
- tests : disable moe test (Feb 13)
- swift : package no longer use ggml dependency (Feb 12)
- sync : ggml (Feb 11)
- nix: update flake.lock (Feb 11)
- py : handle byte tokens in get_token_type (Feb 5)
- nix: update flake.lock (Feb 4)
ggerganov/whisper.cpp (3 merged, 1 closed)
- ggml : 32-bit arm compat (Feb 22)
- swift : package no longer use ggml dependency (Feb 12)
- whisper : fix external encoder (Feb 12)
- whisper : fix usage of extenral encoders (e.g. CoreML) (Feb 12)

ggerganov/ggml (3 merged)
- sync : llama.cpp (Feb 21)
- sync : llama.cpp (Feb 19)
- examples : remove old stuff (Feb 10)

iamlemec/bert.cpp (1 merged)
- bert : various improvements (Feb 3)

Pints-App/llama.cpp (1 merged)
- cuda : fix flash_attn kernel to produce same results as CPU (Feb 1)
Reviewed 137 pull requests in 4 repositories

ggerganov/llama.cpp (25 pull requests)
- mpt : do not duplicate token_embd.weight on disk (Feb 22)
- Add Gemma chat template (Feb 22)
- server: init functional tests (Feb 22)
- Server: fallback to chatml, add AlphaMonarch chat template (Feb 22)
- server: clarify some params in the docs (Feb 22)
- [SYCL] Add support for soft_max ALiBi (Feb 22)
- MPT: add optional bias parameters (Feb 22)
- Add docs for llama_chat_apply_template (Feb 21)
- llama : fix session save/load with quantized KV (Feb 21)
- gemma : allow offloading the output tensor (Feb 21)
- readme: add LocalAI to the availables UI (Feb 21)
- server: health: fix race condition on slots data using tasks queue (Feb 21)
- llava: add --skip-unknown to 1.6 convert.py (Feb 21)
- Add gemma model (Feb 21)
- Attempt to fix pre-tokenizer (Feb 21)
- examples : do not assume BOS when shifting context (Feb 21)
- support llava 1.6 image embedding dimension in server (Feb 20)
- llava: add explicit instructions for llava-1.6 (Feb 20)
- IQ4_NL: 4-bit non-linear quants with blocks of 32 (Feb 20)
- Server: use llama_chat_apply_template (Feb 20)
- Add maid to UI list (Feb 20)
- metal : add build system support for embedded metal library (Feb 20)
- server: health endpoint configurable failure on no slot (Feb 20)
- examples : support minLength and maxLength in JSON schema grammar converter (Feb 19)
- Rm obsolete warning build options sycl cmake (Feb 19)
- Some pull request reviews not shown.
ggerganov/whisper.cpp (17 pull requests)
- Add SYCL logic in whisper (Feb 23)
- openvino : fix convert-whisper-to-openvino.py for v2023.0.0 (#1870) (Feb 22)
- main : fix file existence check in main.cpp (Feb 22)
- examples: refine the Android sample code (Feb 22)
- talk and talk-llama: Pass text_to_speak as a file (Feb 19)
- main : check if input files exist before proceeding (Feb 19)
- clean up common code in examples (Feb 19)
- Fix the decoding issues series: BPE Tokenizer (Feb 19)
- whisper : fix usage of extenral encoders (e.g. CoreML) (Feb 12)
- added audio_ctx argument to main and server examples (Feb 11)
- Embed Metal library source into compiled binary (Feb 11)
- server: Allow CORS request with authorization headers (Feb 9)
- Expose CUDA device setting in public API (Feb 9)
- Add macOS deployment target option to Makefile (Feb 9)
- Fix the decoding issues (Feb 6)
- WIP Very rough cut of streaming from stdin. (Feb 5)
- whisper.android: How to build with CLBlast (Feb 5)
ggerganov/ggml (7 pull requests)
- Introduce backend GUIDs (Feb 22)
- refactored compute forward to not pass in the src tensors each time (Feb 21)
- fix the conv_2d batch mode (Feb 20)
- ggml-alloc : allocate all leafs as if they were inputs (Feb 12)
- ggml-alloc v3 (Feb 11)
- A way to use abort_callback with the cpu backend (Feb 9)
- [WIP, don't merge] unity.cpp -> ggml master (Feb 2)

Pints-App/llama.cpp (1 pull request)
- WIP: Flash Attention implementation (forward + backward) (Feb 11)
Created an issue in ggerganov/ggml that received 2 comments
- ggml : simplify the ggml_compute_forward_ calls
  There is no need to explicitly list all the source tensors - they can be accessed through dst when needed:
  ggml/src/ggml.c Lines 14783 to 1481… (2 comments)
Opened 2 other issues in 2 repositories

ggerganov/llama.cpp (1 open)
- llama : update the convert-llama2c-to-ggml example (Feb 20)

ggerganov/ggml (1 open)
- ggml : add Magika inference (Feb 16)
Answered 1 discussion in 1 repository

ggerganov/llama.cpp
- Saving state (Feb 11)








