gemma-4-26B-A4B-it-speculator.eagle3

Not working in VLLM

by pachePizza - opened 3 days ago

Discussion

pachePizza

3 days ago

Is it possible that this model does not currently work with vLLM? See https://github.com/vllm-project/vllm/issues/38893

fynnsu

Red Hat AI org 3 days ago

That issue is outdated; I've requested that it be closed. The usage command on the model card works on the latest main.

fynnsu changed discussion status to closed 3 days ago

pachePizza

3 days ago

I am using the following image: gemma4-cu130 (it works with all gemma4 models):
agle3-ts-ccdc996d6-ckht5 -c vllm 2>/dev/null | tail -30

(APIServer pid=1) return await main
(APIServer pid=1) ^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 672, in run_server
(APIServer pid=1) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 686, in run_server_worker
(APIServer pid=1) async with build_async_engine_client(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1) return await anext(self.gen)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 100, in build_async_engine_client
(APIServer pid=1) async with build_async_engine_client_from_engine_args(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1) return await anext(self.gen)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 124, in build_async_engine_client_from_engine_args
(APIServer pid=1) vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1804, in create_engine_config
(APIServer pid=1) speculative_config = self.create_speculative_config(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1514, in create_speculative_config
(APIServer pid=1) return SpeculativeConfig(**self.speculative_config)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
(APIServer pid=1) s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=1) pydantic_core._pydantic_core.ValidationError: 1 validation error for SpeculativeConfig
(APIServer pid=1) Value error, eagle3 is only supported for ['llama', 'qwen', 'minicpm', 'gpt_oss', 'hunyuan_vl', 'hunyuan_v1_dense', 'afmoe', 'nemotron_h', 'deepseek_v2', 'deepseek_v3', 'kimi_k2', 'kimi_k25'] models. Got self.target_model_config.hf_text_config.model_type='gemma4_text' [type=value_error, input_value=ArgsKwargs((), {'method':..., _api_process_rank=0)}), input_type=ArgsKwargs]
(APIServer pid=1) For further information visit https://errors.pydantic.dev/2.12/v/value_error

fynnsu

Red Hat AI org about 4 hours ago

Unfortunately I don't think support for this has landed in the gemma4 tagged releases. Could you try with the latest nightly release?
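
For anyone hitting the same ValidationError: the message suggests the build compares the target model's `model_type` against a fixed allow-list for eagle3. Below is a rough sketch of that kind of check, not vLLM's actual code. The list is copied from the error message above and may differ in other builds; the helper name `eagle3_target_allowed` and the substring-matching rule are assumptions made for illustration.

```python
# Model types copied verbatim from the error message in this thread.
# The allow-list in any other vLLM build (e.g. a nightly) may be different.
SUPPORTED_IN_THIS_BUILD = [
    "llama", "qwen", "minicpm", "gpt_oss", "hunyuan_vl", "hunyuan_v1_dense",
    "afmoe", "nemotron_h", "deepseek_v2", "deepseek_v3", "kimi_k2", "kimi_k25",
]


def eagle3_target_allowed(model_type, supported=SUPPORTED_IN_THIS_BUILD):
    """Hypothetical helper: True if any allow-listed name occurs in model_type.

    Substring matching is an assumption; the real check may compare differently.
    """
    return any(name in model_type for name in supported)


# The target model type reported in the traceback is rejected by this list.
print(eagle3_target_allowed("gemma4_text"))  # False
# A type that appears in the list is accepted.
print(eagle3_target_allowed("llama"))  # True
```

If the installed build's list does not include a gemma4 entry, the server fails during config validation before any model weights are loaded, which matches the traceback above.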