Hello everyone,
I’m currently using the latest version of GPT4All, and I must say it’s one of the easiest and most intuitive tools I’ve found for running local LLMs. However, a few questions have come up while using it:
GPT4All offers an API server feature. Does this mean I can use my OpenAI API key to connect to OpenAI’s hosted models? I noticed the API server listens on port 4891, but when I open http://localhost:4891/ in a browser, it reports that the connection cannot be established.
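For reference, this is the kind of request I was expecting to be able to send to the local server, rather than just opening the root URL in a browser. It’s only a sketch, assuming the server exposes an OpenAI-style /v1/chat/completions route on port 4891; the model name is a placeholder for whatever model is loaded in the app:

```python
# Sketch of a request against GPT4All's local API server, assuming an
# OpenAI-compatible /v1/chat/completions endpoint on port 4891.
import requests

response = requests.post(
    "http://localhost:4891/v1/chat/completions",
    json={
        "model": "Llama 3 8B Instruct",  # placeholder: name of the model loaded in GPT4All
        "messages": [{"role": "user", "content": "Hello, can you hear me locally?"}],
        "max_tokens": 128,
    },
    timeout=60,
)
print(response.status_code)
print(response.json())
```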
In the settings under model configuration, I see options for context length and max token length. Are these adjustable, or are they fixed at 2048 and 4096, respectively? Or do these values change based on the GGUF model being loaded?
Additionally, I’m unclear about what context length actually means. Does it indicate that the model will only consider 2048 tokens at a time? And is the max token length always capped at 4096?
If I’m using a model designed for handling long documents, such as Qwen-2.5, how should I configure the context length and max token length settings?
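To clarify what I mean by the interaction between the model’s own limit and the setting in the app, this is roughly how I picture it working outside of GPT4All. Just a sketch using llama-cpp-python, assuming that n_ctx=0 falls back to the trained context length stored in the GGUF metadata; the file path is a placeholder:

```python
# Sketch of my understanding of "context length" at model load time,
# using llama-cpp-python. Assumption: n_ctx=0 means "use the trained
# context length from the GGUF metadata"; the path is a placeholder.
from llama_cpp import Llama

MODEL_PATH = "qwen2.5-7b-instruct-q4_0.gguf"  # placeholder path

# Load with the model's own (trained) context length.
llm_full = Llama(model_path=MODEL_PATH, n_ctx=0, verbose=False)
print("context window in use:", llm_full.n_ctx())

# Load with an explicitly smaller window; prompt + generated tokens
# would then have to fit within 2048 tokens.
llm_small = Llama(model_path=MODEL_PATH, n_ctx=2048, verbose=False)
print("context window in use:", llm_small.n_ctx())
```

If GPT4All’s context length setting behaves the same way, I assume a long-document model would only use its full window if I raise that value accordingly, but I’d appreciate confirmation.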
Thank you!