Currently Locust supports a simple format that takes in a list of prompts; output lengths are not part of the input format. This was done to make it easy to use different benchmarking datasets. LPG only works with the single dataset format, which enables it to send requests with varying max output lengths.

LPG currently loads the dataset directly into the container and, at runtime, filters out any prompts whose input length or output length exceeds the maximum:
ai-on-gke/benchmarks/benchmark/tools/profile-generator/container/benchmark_serving.py, line 53 in 9ff340f
When sending the request, LPG uses the prompt's output length as the max output length:
ai-on-gke/benchmarks/benchmark/tools/profile-generator/container/benchmark_serving.py, line 145 in 9ff340f
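For orientation, the behavior described above amounts to roughly the following. This is only an illustrative sketch, not the code at the referenced lines; the tuple layout, function names, and the max_tokens field are assumptions.

```python
import random
from typing import List, Tuple


# Assumed dataset entry layout: (prompt, input_len, output_len).
def filter_and_sample(
    dataset: List[Tuple[str, int, int]],
    num_requests: int,
    max_input_len: int,
    max_output_len: int,
) -> List[Tuple[str, int]]:
    """Drop prompts whose input or expected output exceeds the limits, then sample."""
    eligible = [
        (prompt, output_len)
        for prompt, input_len, output_len in dataset
        if input_len <= max_input_len and output_len <= max_output_len
    ]
    return random.sample(eligible, min(num_requests, len(eligible)))


def build_request(prompt: str, output_len: int) -> dict:
    # The prompt's own output length becomes this request's max output length.
    return {"prompt": prompt, "max_tokens": output_len}
```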
Locust requires these updates to match the LPG request behavior:

- In load_data.py, update the filtering to take output_len into account, and save both the prompt and the output_len in the local dataset.
- In tasks.py, load the prompt + output_len in the load_dataset function and use the output_len in the request's max_output_len field (both changes are sketched after this list).
- (Priority TBD) Ensure continued support for the simple list-of-prompts format, i.e. backwards compatibility with the old Locust request behavior, e.g. by gating the new behavior behind a flag.
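A minimal sketch of what the load_data.py and tasks.py changes (plus an optional compatibility gate) could look like. It assumes a JSON local dataset with pre-computed token counts; the limits, the USE_OUTPUT_LEN environment variable, and helper names other than load_dataset are hypothetical. Only the max_output_len request field and the file/function names mentioned in the list above come from the issue.

```python
import json
import os
from typing import List, Tuple

MAX_INPUT_LEN = 1024    # assumed limit, normally taken from the benchmark config
MAX_OUTPUT_LEN = 1024   # assumed limit


# load_data.py side: filter on output_len as well, and keep it next to the prompt.
def filter_and_save(entries: List[dict], path: str) -> None:
    kept: List[Tuple[str, int]] = []
    for e in entries:
        if e["input_len"] > MAX_INPUT_LEN:
            continue
        if e["output_len"] > MAX_OUTPUT_LEN:  # new: also respect the output length
            continue
        kept.append((e["prompt"], e["output_len"]))
    with open(path, "w") as f:
        json.dump([{"prompt": p, "output_len": n} for p, n in kept], f)


# tasks.py side: read both fields and feed output_len into each request.
def load_dataset(path: str) -> List[Tuple[str, int]]:
    with open(path) as f:
        data = json.load(f)
    if data and isinstance(data[0], str):
        # Backwards compatibility: a plain list of prompts, no per-prompt output length.
        return [(prompt, MAX_OUTPUT_LEN) for prompt in data]
    return [(d["prompt"], d["output_len"]) for d in data]


def build_payload(prompt: str, output_len: int) -> dict:
    # Hypothetical gate so the old behavior (a fixed max) stays available.
    if os.environ.get("USE_OUTPUT_LEN", "true") == "true":
        return {"prompt": prompt, "max_output_len": output_len}
    return {"prompt": prompt, "max_output_len": MAX_OUTPUT_LEN}
```

Gating on an environment variable (or an equivalent CLI flag) keeps the old list-of-prompts path working while the dataset path gains per-prompt output lengths.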