
[Jinja] GPT4All-Chat 3.5.1 breaks TheBloke/OpenHermes-2.5-Mistral-7B-GGUF #3281

Closed
UserHub1973 opened this issue Dec 13, 2024 · 2 comments
Labels: bug-unconfirmed, chat, gpt4all-chat issues

UserHub1973 commented Dec 13, 2024

So, of the 12 models I use regularly, none of them work as of the last update. They all worked wonderfully under 3.4.2, and all of them are broken under 3.5.1.

Is every non-Nomic model being rendered useless and inoperable going to be addressed as a FLAW, or is it a new feature that I am supposed to be happy and joyous about at all times?

Basically, is there a plan to go back to some kind of 'jinja'-based 'default prompt' that just works with any side-loaded model? Lawyer talk and nanny gaslighting aside, the expected real-world behavior was that side-loaded models just worked once they were put into the model folder. It was compatible with many, many models, at any quant we wanted from 4 to 16, and with almost any model. Thousands, in fact.

That was the functionality up until 3.5, AFTER ALL. Truth be told, 99% of all Hugging Face models just worked out of the box with very little or NO fiddling. Will all those THOUSANDS of LLM models on HUGGINGFACE, the ones that made your product popular because they side-loaded EASILY, have to stop advertising as working with your product?

TheBloke models are not compatible with the software all of a sudden?

TheBloke/OpenHermes-2.5-Mistral-7B-GGUF
Ran the script ... it won't just find and install the correct template?

Nope. This is a lot of lost productivity.

TheBloke and folks like him ... MADE YOU ...

Those 'side-loaded' models are why many folks use you. We ourselves only use this because it gave us the EASE and FREEDOM to use any model WE CHOSE, on our machine, under OUR TERMS. 99 out of 100 AI models, across 5 or 6 architectures, used to work perfectly with the defaults out of the box.

If that goes: your software gets uninstalled. I don't use your curated models daily at all, so whether they work matters not to me in the least. My freedom to use the model I want, like I could last week, does.

UserHub1973 added the bug-unconfirmed, chat, gpt4all-chat issues labels Dec 13, 2024
ThiloteE (Collaborator) commented Dec 13, 2024

I respectfully disagree. Many models were not working with prior versions of GPT4All-Chat (v3.4.2 and earlier), and users were forced to painfully change the chat template in the model's tokenizer_config.json (in particular the bos_token and eos_token), then quantize the model to GGUF to make it work properly, and then side-load it or open a pull request to add it to models3.json to make it available to a broader userbase.

The current de facto standard language for writing chat templates is some dialect of Jinja, and GPT4All was not able to properly parse complex templates before. That was mostly manageable when the number of models out in the wild was small and chat templates were simple, but with the advent of tool-calling capabilities and models trained to recognize and respond to multiple BOS/EOS tokens, GPT4All's existing parsing scheme couldn't keep up. It also doesn't help that some finetuners don't know or don't care about chat templates and simply copy-paste the one from the base model, and the people who quantize models usually don't change the template either. Many model authors have not kept compatibility with GPT4All, so now GPT4All has to adapt to keep up with model authors.

I expect that the switch to a Jinja parser, once the initial problems have been straightened out and fixed, will in the mid to long term allow users to use MORE models than before, get rid of the workload related to quantizing, and apparently also allow some other shenanigans, such as editing a prompt.
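For reference, a model's chat template lives in the chat_template field of its tokenizer_config.json and is written in Jinja. OpenHermes-2.5-Mistral-7B uses the ChatML format; its template looks roughly like the following (a single line in the actual file, reformatted here for readability):

```jinja
{# ChatML-style chat template, roughly as shipped in tokenizer_config.json #}
{%- for message in messages -%}
{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor -%}
{%- if add_generation_prompt -%}
{{ '<|im_start|>assistant\n' }}
{%- endif -%}
```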

Here is also a statement by gonzochess75 from Discord:

[02:46]gonzochess75: We needed to enhance our templates for many new features that are coming. Our choices were to make our own templating language, which no model would support out of the box (so every model would need a custom config), or to use jinja, which many models already have built-in templates for.
[02:47]gonzochess75: The ones that everyone is reporting as broken are models downloaded directly from hugging face and/or sideloaded. Not the models that we have hand-curated ourselves and deliberately listed first.
[02:48]gonzochess75: The majority of users, who are not technically inclined, will use those curated models that are known to work. They won't even necessarily know what huggingface is.
[02:50]gonzochess75: The current users feeling the most pain are the more technically literate users who are capable of downloading, or even knowing about, custom or lesser-known models. These are precisely the users we would expect to be most equipped to deal with these templates, understand the complexity involved, and step up to help each other with working templates.
[02:50]gonzochess75: Many users are doing exactly that.
[02:51]gonzochess75: In the meantime we - @cebtenzzre and myself - are working to fix the parsing bugs we have identified in the third-party dependency - jinja2cpp - that is causing problems
[02:53]gonzochess75: I’m more than happy to explain the rationale behind this change.
[02:53]gonzochess75: It is the right direction and the right way to go and the pros will outweigh the cons.
[02:54]gonzochess75: At the same time, I understand the pain some users are going through with the change, and I will continue to work to resolve and mitigate that pain, as well as explain why it is necessary to anyone who cares to try to understand.
[03:37]furrykef: But why break compatibility?
[03:39]furrykef: (And why call it 3.5 instead of 4.0 if it's going to break so much stuff?)
[05:47]gonzochess75: Maintaining compatibility would drastically increase code complexity, increase UI complexity for settings, lead to confusion for users, cause compounding technical debt, delay and impede development of new features geared around tool calling and agentic features, and delay support for new models and the new features they might support.
[05:48]gonzochess75: I can explain each and every one of these points in more detail if asked and would be happy to.
[05:51]gonzochess75: Chat templates are getting ever more complex as model creators add new features. The chat templates are written directly by the model authors to show how the model is intended to be used. For better or worse, the de facto standard language for writing these templates is jinja.
[05:53]gonzochess75: Unfortunately, the model authors often use syntax that is not supported by jinja2cpp - our new third-party dependency - or problems occur due to bugs in that dependency. We’re working to figure out how to mitigate and solve these problems.
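To give a flavor of what that means (an illustrative sketch, not an example confirmed in this thread): modern chat templates often lean on Jinja features, such as namespace objects, sequence slicing, filters, and inline tests, that a partial Jinja implementation may not handle:

```jinja
{# Hypothetical fragments; each construct is standard Jinja2, but partial
   C++ implementations have historically been hit-or-miss with them. #}
{%- set ns = namespace(has_system=false) -%}          {# namespace object #}
{%- for message in messages -%}
  {%- if message['role'] == 'system' -%}{%- set ns.has_system = true -%}{%- endif -%}
{%- endfor -%}
{%- for message in messages[1:] -%}                   {# sequence slicing #}
  {{ message['content'] | trim }}                     {# filter #}
{%- endfor -%}
{%- if add_generation_prompt is defined -%}           {# inline "is" test #}
{{ '<|assistant|>\n' }}
{%- endif -%}
```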

ThiloteE changed the title from "New Update breaks a lot of models. It seems harmful to freedom and productivity." to "[Jinja] GPT4All-Chat 3.5.1 breaks TheBloke/OpenHermes-2.5-Mistral-7B-GGUF" Dec 13, 2024
ThiloteE (Collaborator) commented Dec 13, 2024

But I agree that maybe there should have been more testing, and fixing of a wider variety of models, before pushing the update to users.

nomic-ai deleted a comment from SINAPSA-IC Dec 13, 2024
nomic-ai deleted a comment from SINAPSA-IC Dec 13, 2024
nomic-ai deleted a comment from UserHub1973 Dec 13, 2024
manyoso closed this as completed Dec 13, 2024
nomic-ai deleted a comment from UserHub1973 Dec 13, 2024