1
0
md/20251011.md
liushuang 98687f0f84 add
2025-10-11 17:01:08 +08:00

215 lines
17 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

## 问题原因
vllm在处理请求时调用了 get_structured_output_key 函数,在处理 structured_outputs 参数抛异常,该异常未被 EngineCore 捕获导致引擎崩溃APIServer 发现引擎死了,自己也退出了进程。
## 源码的具体实现
### 函数 get_structured_output_key 实现
#### 分支release/v0.11.0
https://github.com/vllm-project/vllm/blob/releases/v0.11.0/vllm/v1/structured_output/request.py
```python
def get_structured_output_key(sampling_params: SamplingParams) -> StructuredOutputKey:
params = sampling_params.structured_outputs
assert params is not None, "params can't be None."
if params.json is not None:
if not isinstance(params.json, str):
json_str = json.dumps(params.json)
else:
json_str = params.json
return (StructuredOutputOptions.JSON, json_str)
elif params.json_object:
return (StructuredOutputOptions.JSON_OBJECT, "")
elif params.regex is not None:
return (StructuredOutputOptions.REGEX, params.regex)
elif params.choice is not None:
if not isinstance(params.choice, str):
json_str = json.dumps(params.choice)
else:
json_str = params.choice
return (StructuredOutputOptions.CHOICE, json_str)
elif params.grammar is not None:
return (StructuredOutputOptions.GRAMMAR, params.grammar)
elif params.structural_tag is not None:
return (StructuredOutputOptions.STRUCTURAL_TAG, params.structural_tag)
else:
raise ValueError("No valid structured output parameter found")
```
#### 分支release/v0.10.2
https://github.com/vllm-project/vllm/blob/releases/v0.10.2/vllm/v1/structured_output/request.py
```python
def get_structured_output_key(
sampling_params: SamplingParams) -> StructuredOutputKey:
params = sampling_params.guided_decoding
assert params is not None, "params can't be None."
if params.json is not None:
if not isinstance(params.json, str):
json_str = json.dumps(params.json)
else:
json_str = params.json
return (StructuredOutputOptions.JSON, json_str)
elif params.json_object:
return (StructuredOutputOptions.JSON_OBJECT, "")
elif params.regex is not None:
return (StructuredOutputOptions.REGEX, params.regex)
elif params.choice is not None:
if not isinstance(params.choice, str):
json_str = json.dumps(params.choice)
else:
json_str = params.choice
return (StructuredOutputOptions.CHOICE, json_str)
elif params.grammar is not None:
return (StructuredOutputOptions.GRAMMAR, params.grammar)
elif params.structural_tag is not None:
return (StructuredOutputOptions.STRUCTURAL_TAG, params.structural_tag)
else:
raise ValueError("No valid structured output parameter found")
```
> 2个版本的唯一区别仅仅是提取`SamplingParams`的属性不一样,其它判断逻辑都是一致的
### SamplingParams
https://github.com/vllm-project/vllm/blob/releases/v0.11.0/vllm/sampling_params.py
```python
# 其它参数已省略
class SamplingParams(
💡初始化逻辑是没问题的默认值是None
structured_outputs: Optional[StructuredOutputsParams] = None
guided_decoding: Optional[GuidedDecodingParams] = None
@staticmethod
def from_optional(
💡默认值也是None
structured_outputs: Optional[StructuredOutputsParams] = None,
) -> "SamplingParams":
if guided_decoding is not None:
warnings.warn(
"guided_decoding is deprecated. This will be removed in "
💡官方将在 v0.12.0 废弃 guided_decoding 参数使用 structured_outputs 参数替代 v0.11.0 版本做了兼容
"v0.12.0 or v1.0.0, which ever is soonest. Please use "
"structured_outputs instead.",
DeprecationWarning,
stacklevel=2)
structured_outputs = guided_decoding
guided_decoding = None
return SamplingParams(
structured_outputs=structured_outputs,
)
```
### StructuredOutputOptions
```python
class StructuredOutputOptions(enum.Enum):
JSON = enum.auto()
JSON_OBJECT = enum.auto()
REGEX = enum.auto()
GRAMMAR = enum.auto()
CHOICE = enum.auto()
STRUCTURAL_TAG = enum.auto()
```
> 💡只支持这6种类型每个类型都对应 structured_outputs 下面的一个不同的参数。
## 崩溃日志片段
```text
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] EngineCore encountered a fatal error.
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] Traceback (most recent call last):
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 701, in run_engine_core
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] engine_core.run_busy_loop()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 728, in run_busy_loop
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] self._process_engine_step()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 754, in _process_engine_step
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 283, in step
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] scheduler_output = self.scheduler.schedule()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/core/sched/scheduler.py", line 359, in schedule
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] if structured_output_req and structured_output_req.grammar:
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 45, in grammar
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] completed = self._check_grammar_completion()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 33, in _check_grammar_completion
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] self._grammar = self._grammar.result(timeout=0.0001)
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/_base.py", line 458, in result
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] return self.__get_result()
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] raise self._exception
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/thread.py", line 58, in run
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] result = self.fn(*self.args, **self.kwargs)
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/__init__.py", line 128, in _async_create_grammar
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] key = request.structured_output_request.structured_output_key # type: ignore[union-attr]
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/functools.py", line 981, in __get__
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] val = self.func(instance)
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 58, in structured_output_key
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] return get_structured_output_key(self.sampling_params)
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 86, in get_structured_output_key
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] raise ValueError("No valid structured output parameter found")
(EngineCore_DP0 pid=2738693) ERROR 10-10 10:43:10 [core.py:710] ValueError: No valid structured output parameter found
(EngineCore_DP0 pid=2738693) Process EngineCore_DP0:
(EngineCore_DP0 pid=2738693) Traceback (most recent call last):
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] AsyncLLM output_handler failed.
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] Traceback (most recent call last):
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/async_llm.py", line 439, in output_handler
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] outputs = await engine_core.get_output_async()
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core_client.py", line 846, in get_output_async
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] raise self._format_exception(outputs) from None
(APIServer pid=2738423) ERROR 10-10 10:43:10 [async_llm.py:480] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=2738693) self.run()
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=2738693) self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 712, in run_engine_core
(EngineCore_DP0 pid=2738693) raise e
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 701, in run_engine_core
(EngineCore_DP0 pid=2738693) engine_core.run_busy_loop()
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 728, in run_busy_loop
(EngineCore_DP0 pid=2738693) self._process_engine_step()
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 754, in _process_engine_step
(EngineCore_DP0 pid=2738693) outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 283, in step
(EngineCore_DP0 pid=2738693) scheduler_output = self.scheduler.schedule()
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/core/sched/scheduler.py", line 359, in schedule
(EngineCore_DP0 pid=2738693) if structured_output_req and structured_output_req.grammar:
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 45, in grammar
(EngineCore_DP0 pid=2738693) completed = self._check_grammar_completion()
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 33, in _check_grammar_completion
(EngineCore_DP0 pid=2738693) self._grammar = self._grammar.result(timeout=0.0001)
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/_base.py", line 458, in result
(EngineCore_DP0 pid=2738693) return self.__get_result()
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
(EngineCore_DP0 pid=2738693) raise self._exception
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/concurrent/futures/thread.py", line 58, in run
(EngineCore_DP0 pid=2738693) result = self.fn(*self.args, **self.kwargs)
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/__init__.py", line 128, in _async_create_grammar
(EngineCore_DP0 pid=2738693) key = request.structured_output_request.structured_output_key # type: ignore[union-attr]
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/functools.py", line 981, in __get__
(EngineCore_DP0 pid=2738693) val = self.func(instance)
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 58, in structured_output_key
(EngineCore_DP0 pid=2738693) return get_structured_output_key(self.sampling_params)
(EngineCore_DP0 pid=2738693) File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/structured_output/request.py", line 86, in get_structured_output_key
(EngineCore_DP0 pid=2738693) raise ValueError("No valid structured output parameter found")
(EngineCore_DP0 pid=2738693) ValueError: No valid structured output parameter found
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] Error in chat completion stream generator.
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] Traceback (most recent call last):
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_chat.py", line 574, in chat_completion_stream_generator
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] async for res in result_generator:
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/async_llm.py", line 387, in generate
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] out = q.get_nowait() or await q.get()
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/output_processor.py", line 59, in get
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] raise output
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/async_llm.py", line 439, in output_handler
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] outputs = await engine_core.get_output_async()
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] File "/aisoft/conda/env/vllm2/lib/python3.10/site-packages/vllm/v1/engine/core_client.py", line 846, in get_output_async
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] raise self._format_exception(outputs) from None
(APIServer pid=2738423) ERROR 10-10 10:43:10 [serving_chat.py:1145] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
[rank0]:[W1010 10:43:10.666166970 ProcessGroupNCCL.cpp:1538] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=2738423) INFO: Shutting down
(APIServer pid=2738423) INFO: Waiting for application shutdown.
(APIServer pid=2738423) INFO: Application shutdown complete.
(APIServer pid=2738423) INFO: Finished server process [2738423]
```