## Root cause

`get_structured_output_key` raises an exception while processing the `structured_outputs` parameter. EngineCore does not catch the exception, so the engine crashes. The APIServer then detects that the engine is dead and exits its own process as well.

## Source implementation

### The `get_structured_output_key` function

#### Branch: releases/v0.11.0

https://github.com/vllm-project/vllm/blob/releases/v0.11.0/vllm/v1/structured_output/request.py

```python
def get_structured_output_key(sampling_params: SamplingParams) -> StructuredOutputKey:
    params = sampling_params.structured_outputs
    assert params is not None, "params can't be None."
    if params.json is not None:
        if not isinstance(params.json, str):
            json_str = json.dumps(params.json)
        else:
            json_str = params.json
        return (StructuredOutputOptions.JSON, json_str)
    elif params.json_object:
        return (StructuredOutputOptions.JSON_OBJECT, "")
    elif params.regex is not None:
        return (StructuredOutputOptions.REGEX, params.regex)
    elif params.choice is not None:
        if not isinstance(params.choice, str):
            json_str = json.dumps(params.choice)
        else:
            json_str = params.choice
        return (StructuredOutputOptions.CHOICE, json_str)
    elif params.grammar is not None:
        return (StructuredOutputOptions.GRAMMAR, params.grammar)
    elif params.structural_tag is not None:
        return (StructuredOutputOptions.STRUCTURAL_TAG, params.structural_tag)
    else:
        raise ValueError("No valid structured output parameter found")
```

#### Branch: releases/v0.10.2

https://github.com/vllm-project/vllm/blob/releases/v0.10.2/vllm/v1/structured_output/request.py

```python
def get_structured_output_key(
        sampling_params: SamplingParams) -> StructuredOutputKey:
    params = sampling_params.guided_decoding
    assert params is not None, "params can't be None."
    if params.json is not None:
        if not isinstance(params.json, str):
            json_str = json.dumps(params.json)
        else:
            json_str = params.json
        return (StructuredOutputOptions.JSON, json_str)
    elif params.json_object:
        return (StructuredOutputOptions.JSON_OBJECT, "")
    elif params.regex is not None:
        return (StructuredOutputOptions.REGEX, params.regex)
    elif params.choice is not None:
        if not isinstance(params.choice, str):
            json_str = json.dumps(params.choice)
        else:
            json_str = params.choice
        return (StructuredOutputOptions.CHOICE, json_str)
    elif params.grammar is not None:
        return (StructuredOutputOptions.GRAMMAR, params.grammar)
    elif params.structural_tag is not None:
        return (StructuredOutputOptions.STRUCTURAL_TAG, params.structural_tag)
    else:
        raise ValueError("No valid structured output parameter found")
```

> The only difference between the two versions is which `SamplingParams` attribute is read (`structured_outputs` vs. `guided_decoding`); the rest of the branching logic is identical.

### SamplingParams

https://github.com/vllm-project/vllm/blob/releases/v0.11.0/vllm/sampling_params.py

```python
# Abridged excerpt: other parameters omitted
class SamplingParams(...):  # base classes omitted
    # 💡 The initialization logic is fine; the default is None
    structured_outputs: Optional[StructuredOutputsParams] = None
    guided_decoding: Optional[GuidedDecodingParams] = None

    @staticmethod
    def from_optional(
        # 💡 The default here is also None
        structured_outputs: Optional[StructuredOutputsParams] = None,
    ) -> "SamplingParams":
        if guided_decoding is not None:
            # 💡 guided_decoding is deprecated in favor of structured_outputs and,
            # per the warning below, slated for removal in v0.12.0 or v1.0.0;
            # v0.11.0 keeps this compatibility path.
            warnings.warn(
                "guided_decoding is deprecated. This will be removed in "
                "v0.12.0 or v1.0.0, which ever is soonest. Please use "
                "structured_outputs instead.",
                DeprecationWarning,
                stacklevel=2)
            structured_outputs = guided_decoding
            guided_decoding = None
        return SamplingParams(
            structured_outputs=structured_outputs,
        )
```

### StructuredOutputOptions

```python
class StructuredOutputOptions(enum.Enum):
    JSON = enum.auto()
    JSON_OBJECT = enum.auto()
    REGEX = enum.auto()
    GRAMMAR = enum.auto()
    CHOICE = enum.auto()
    STRUCTURAL_TAG = enum.auto()
```

> 💡 Only these six types are supported; each one corresponds to a different field under `structured_outputs`.

## Inference

The client sent a `structured_outputs` parameter, but it was either an empty object or contained none of these six fields. Because the object itself is not `None`, the `assert` passes, every `if`/`elif` branch falls through, and the final `else` raises the `ValueError` seen in the log (`request.py`, line 86).

## Reproduction

## Crash log excerpt

```text
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] EngineCore encountered a fatal error.
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] Traceback (most recent call last):
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 701, in run_engine_core
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     engine_core.run_busy_loop()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 728, in run_busy_loop
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     self._process_engine_step()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 754, in _process_engine_step
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 283, in step
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     scheduler_output = self.scheduler.schedule()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/core/sched/scheduler.py", line 359, in schedule
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     if structured_output_req and structured_output_req.grammar:
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 45, in grammar
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     completed = self._check_grammar_completion()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 33, in _check_grammar_completion
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     self._grammar = self._grammar.result(timeout=0.0001)
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/_base.py", line 458, in result
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     return self.__get_result()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     raise self._exception
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/thread.py", line 58, in run
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     result = self.fn(*self.args, **self.kwargs)
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/__init__.py", line 128, in _async_create_grammar
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     key = request.structured_output_request.structured_output_key # type: ignore[union-attr]
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/functools.py", line 981, in __get__
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     val = self.func(instance)
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 58, in structured_output_key
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     return get_structured_output_key(self.sampling_params)
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 86, in get_structured_output_key
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710]     raise ValueError("No valid structured output parameter found")
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] ValueError: No valid structured output parameter found
(EngineCore_DP0 pid=2738693) Process EngineCore_DP0:
(EngineCore_DP0 pid=2738693) Traceback (most recent call last):
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] AsyncLLM output_handler failed.
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] Traceback (most recent call last):
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/async_llm.py", line 439, in output_handler
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480]     outputs = await engine_core.get_output_async()
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core_client.py", line 846, in get_output_async
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480]     raise self._format_exception(outputs) from None
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=2738693)     self.run()
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=2738693)     self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 712, in run_engine_core
(EngineCore_DP0 pid=2738693)     raise e
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 701, in run_engine_core
(EngineCore_DP0 pid=2738693)     engine_core.run_busy_loop()
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 728, in run_busy_loop
(EngineCore_DP0 pid=2738693)     self._process_engine_step()
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 754, in _process_engine_step
(EngineCore_DP0 pid=2738693)     outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 283, in step
(EngineCore_DP0 pid=2738693)     scheduler_output = self.scheduler.schedule()
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/core/sched/scheduler.py", line 359, in schedule
(EngineCore_DP0 pid=2738693)     if structured_output_req and structured_output_req.grammar:
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 45, in grammar
(EngineCore_DP0 pid=2738693)     completed = self._check_grammar_completion()
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 33, in _check_grammar_completion
(EngineCore_DP0 pid=2738693)     self._grammar = self._grammar.result(timeout=0.0001)
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/_base.py", line 458, in result
(EngineCore_DP0 pid=2738693)     return self.__get_result()
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
(EngineCore_DP0 pid=2738693)     raise self._exception
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/thread.py", line 58, in run
(EngineCore_DP0 pid=2738693)     result = self.fn(*self.args, **self.kwargs)
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/__init__.py", line 128, in _async_create_grammar
(EngineCore_DP0 pid=2738693)     key = request.structured_output_request.structured_output_key # type: ignore[union-attr]
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/functools.py", line 981, in __get__
(EngineCore_DP0 pid=2738693)     val = self.func(instance)
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 58, in structured_output_key
(EngineCore_DP0 pid=2738693)     return get_structured_output_key(self.sampling_params)
(EngineCore_DP0 pid=2738693)   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 86, in get_structured_output_key
(EngineCore_DP0 pid=2738693)     raise ValueError("No valid structured output parameter found")
(EngineCore_DP0 pid=2738693) ValueError: No valid structured output parameter found
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] Error in chat completion stream generator.
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] Traceback (most recent call last):
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_chat.py", line 574, in chat_completion_stream_generator
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]     async for res in result_generator:
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/async_llm.py", line 387, in generate
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]     out = q.get_nowait() or await q.get()
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/output_processor.py", line 59, in get
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]     raise output
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/async_llm.py", line 439, in output_handler
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]     outputs = await engine_core.get_output_async()
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]   File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core_client.py", line 846, in get_output_async
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145]     raise self._format_exception(outputs) from None
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
[rank0]:[W1010 10:43:10.666166970 ProcessGroupNCCL.cpp:1538] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=2738423) INFO: Shutting down
(APIServer pid=2738423) INFO: Waiting for application shutdown.
(APIServer pid=2738423) INFO: Application shutdown complete.
(APIServer pid=2738423) INFO: Finished server process [2738423]
```
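
The inference (a non-`None` `structured_outputs` object with none of the six fields set falls through every branch) can be checked offline, without a running server. The following is a minimal sketch, not vLLM code: `FakeStructuredOutputsParams`, `get_key`, and `has_valid_option` are hypothetical names introduced here, the dataclass models only the six fields read by the quoted function, and `get_key` copies the branch order of the excerpt from `request.py`. `has_valid_option` illustrates the kind of pre-check that could reject such a request before it reaches the engine; where such a check would belong in vLLM is left open.

```python
import enum
import json
from dataclasses import dataclass
from typing import Any, Optional, Tuple


class StructuredOutputOptions(enum.Enum):
    JSON = enum.auto()
    JSON_OBJECT = enum.auto()
    REGEX = enum.auto()
    GRAMMAR = enum.auto()
    CHOICE = enum.auto()
    STRUCTURAL_TAG = enum.auto()


@dataclass
class FakeStructuredOutputsParams:
    """Hypothetical stand-in: only the six fields the quoted function reads."""
    json: Any = None
    json_object: Optional[bool] = None
    regex: Optional[str] = None
    choice: Any = None
    grammar: Optional[str] = None
    structural_tag: Optional[str] = None


def get_key(params: FakeStructuredOutputsParams) -> Tuple[StructuredOutputOptions, str]:
    """Same branch order as the get_structured_output_key excerpt."""
    if params.json is not None:
        json_str = params.json if isinstance(params.json, str) else json.dumps(params.json)
        return (StructuredOutputOptions.JSON, json_str)
    elif params.json_object:  # truthiness check: False falls through as well
        return (StructuredOutputOptions.JSON_OBJECT, "")
    elif params.regex is not None:
        return (StructuredOutputOptions.REGEX, params.regex)
    elif params.choice is not None:
        json_str = params.choice if isinstance(params.choice, str) else json.dumps(params.choice)
        return (StructuredOutputOptions.CHOICE, json_str)
    elif params.grammar is not None:
        return (StructuredOutputOptions.GRAMMAR, params.grammar)
    elif params.structural_tag is not None:
        return (StructuredOutputOptions.STRUCTURAL_TAG, params.structural_tag)
    else:
        raise ValueError("No valid structured output parameter found")


def has_valid_option(params: FakeStructuredOutputsParams) -> bool:
    """Hypothetical pre-check: True if one of the six options is usable."""
    try:
        get_key(params)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # An empty object passes an "is not None" check but matches no branch.
    print(has_valid_option(FakeStructuredOutputsParams()))
    # A populated object resolves to a key as expected.
    print(get_key(FakeStructuredOutputsParams(regex=r"\d+")))
```

Under these assumptions, an object with only `json_object=False` set also matches no branch, because that branch tests truthiness rather than `is not None`.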