Skip to content

ggml : do not output unprintable characters on GGUF load failure #14381

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 25, 2025

Conversation

CISC
Copy link
Collaborator

@CISC CISC commented Jun 25, 2025

On failure to load a GGUF the first four characters of the file will be output to the console, leading to all kinds of trouble depending on the characters.

This is especially important to resolve before starting to support GGUF for other kinds of data files, see #9400 (review)

@CISC CISC requested a review from JohannesGaessler as a code owner June 25, 2025 19:20
@CISC CISC requested review from ggerganov and removed request for JohannesGaessler June 25, 2025 19:20
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Jun 25, 2025
@JohannesGaessler
Copy link
Collaborator

It would maybe be nice to also add a test case to tests/test-gguf.cpp but I think this is a fairly minor issue.

@CISC
Copy link
Collaborator Author

CISC commented Jun 25, 2025

It would maybe be nice to also add a test case to tests/test-gguf.cpp but I think this is a fairly minor issue.

Test for what though?

@CISC
Copy link
Collaborator Author

CISC commented Jun 25, 2025

It would maybe be nice to also add a test case to tests/test-gguf.cpp but I think this is a fairly minor issue.

Test for what though?

Could capture stderr I guess?

@JohannesGaessler
Copy link
Collaborator

I don't know, would depend on what kind of issues you were able to provoke without the change. You could maybe assert that with unprintable characters in the gguf magic the string ???? is printed instead but I don't know how tedious that would be to implement vs. the gain in test coverage.

@CISC
Copy link
Collaborator Author

CISC commented Jun 25, 2025

I tried, but since GGML_LOG writes to stderr by default it got too messy (would require a logging callback or freopen), I don't think it's worth it.

@JohannesGaessler
Copy link
Collaborator

That's fine, as long as it's not possible to provoke major issues such as a crash it's not that important anyways.

@CISC CISC merged commit b193d53 into master Jun 25, 2025
48 checks passed
@CISC CISC deleted the cisc/fix-unprintable-magic branch June 25, 2025 21:26
@CISC
Copy link
Collaborator Author

CISC commented Jun 25, 2025

That's fine, as long as it's not possible to provoke major issues such as a crash it's not that important anyways.

No, at worst it will mess up your terminal on the right character sequence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning
3 participants