Not working in VLLM

#1
by pachePizza - opened

Is it possible that this model is not working on vllm currently? https://github.com/vllm-project/vllm/issues/38893

Red Hat AI org

That issue is outdated; I've requested that it be closed. The usage command on the model card works on the latest main.

fynnsu changed discussion status to closed

I am using the following image: gemma4-cu130 (it works with all gemma4 models):

```
agle3-ts-ccdc996d6-ckht5 -c vllm 2>/dev/null | tail -30
```

```
(APIServer pid=1)     return await main
(APIServer pid=1)            ^^^^^^^^^^
(APIServer pid=1)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 672, in run_server
(APIServer pid=1)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 686, in run_server_worker
(APIServer pid=1)     async with build_async_engine_client(
(APIServer pid=1)     ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1)     return await anext(self.gen)
(APIServer pid=1)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 100, in build_async_engine_client
(APIServer pid=1)     async with build_async_engine_client_from_engine_args(
(APIServer pid=1)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1)     return await anext(self.gen)
(APIServer pid=1)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 124, in build_async_engine_client_from_engine_args
(APIServer pid=1)     vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=1)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1804, in create_engine_config
(APIServer pid=1)     speculative_config = self.create_speculative_config(
(APIServer pid=1)                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1514, in create_speculative_config
(APIServer pid=1)     return SpeculativeConfig(**self.speculative_config)
(APIServer pid=1)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1)   File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
(APIServer pid=1)     s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=1) pydantic_core._pydantic_core.ValidationError: 1 validation error for SpeculativeConfig
(APIServer pid=1)   Value error, eagle3 is only supported for ['llama', 'qwen', 'minicpm', 'gpt_oss', 'hunyuan_vl', 'hunyuan_v1_dense', 'afmoe', 'nemotron_h', 'deepseek_v2', 'deepseek_v3', 'kimi_k2', 'kimi_k25'] models. Got self.target_model_config.hf_text_config.model_type='gemma4_text' [type=value_error, input_value=ArgsKwargs((), {'method':..., _api_process_rank=0)}), input_type=ArgsKwargs]
(APIServer pid=1) For further information visit https://errors.pydantic.dev/2.12/v/value_error
```
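For what it's worth, the failure above is a config-validation error, not a crash at inference time: vLLM's `SpeculativeConfig` checks the target model's `model_type` against an allow-list before enabling EAGLE3, and `gemma4_text` is not yet in that list on this build. A minimal sketch of that kind of check (plain stdlib, not vLLM's actual `SpeculativeConfig` code; the class and field names here are illustrative):

```python
from dataclasses import dataclass

# model_type values copied from the error message above
EAGLE3_SUPPORTED = [
    "llama", "qwen", "minicpm", "gpt_oss", "hunyuan_vl", "hunyuan_v1_dense",
    "afmoe", "nemotron_h", "deepseek_v2", "deepseek_v3", "kimi_k2", "kimi_k25",
]

@dataclass
class SpeculativeConfigSketch:
    """Illustrative stand-in for vLLM's SpeculativeConfig allow-list check."""
    method: str
    target_model_type: str

    def __post_init__(self) -> None:
        # Reject eagle3 for any target architecture outside the allow-list,
        # which is what produces the ValidationError in the log above.
        if self.method == "eagle3" and self.target_model_type not in EAGLE3_SUPPORTED:
            raise ValueError(
                f"eagle3 is only supported for {EAGLE3_SUPPORTED} models. "
                f"Got {self.target_model_type!r}"
            )
```

So upgrading to a vLLM build whose allow-list includes the gemma4 architecture is the fix, rather than changing anything in the serving command.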

Red Hat AI org

Unfortunately, I don't think support for this has landed in the gemma4 tagged releases yet. Could you try with the latest nightly release?
