
MiniCPM starts fine, but inference requests fail #292

Open
2 tasks done
760485464 opened this issue Jun 28, 2024 · 2 comments

Comments

@760485464

The following items must be checked before submission

  • Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • I have read the FAQ section of the project documentation and searched the existing issues and discussions without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

Startup produces no errors, but the API request fails. Request body:

{
  "model": "minicpm-v",
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "这张图片是什么地方?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "http://djclub.cdn.bcebos.com/uploads/images/pageimg/20230325/64-2303252115313.jpg"
          }
        }
      ]
    }
  ]
}
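For reference, a request like the one above can be sent from the standard library alone; the endpoint URL is an assumption inferred from the host, port, and api_prefix values in the logs below:

```python
import json

# Same payload as the failing request; the Chinese text and image URL
# are taken verbatim from the issue.
payload = {
    "model": "minicpm-v",
    "stream": False,
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "这张图片是什么地方?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "http://djclub.cdn.bcebos.com/uploads/images/pageimg/20230325/64-2303252115313.jpg"
                    },
                },
            ],
        }
    ],
}

body = json.dumps(payload, ensure_ascii=False)

# To actually send it (requires the server from this issue to be running;
# the URL below is an assumption based on host=0.0.0.0, port=8000,
# api_prefix=/v1 in the SETTINGS log):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```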

Dependencies

No response

Runtime logs or screenshots

(minicpm) root@autodl-container-acc74095be-7fd6b47a:~/autodl-tmp/api-for-open-llm# python server.py
2024-06-28 19:01:43.514 | DEBUG | api.config::281 - SETTINGS: {
    "model_name": "minicpm-v",
    "model_path": "/root/autodl-tmp/models/MiniCPM-Llama3-V-2_5",
    "dtype": "bfloat16",
    "load_in_8bit": false,
    "load_in_4bit": false,
    "context_length": 2048,
    "chat_template": "minicpm-v",
    "rope_scaling": null,
    "flash_attn": false,
    "interrupt_requests": true,
    "host": "0.0.0.0",
    "port": 8000,
    "api_prefix": "/v1",
    "engine": "default",
    "tasks": [
        "llm"
    ],
    "device_map": "auto",
    "gpus": null,
    "num_gpus": 1,
    "activate_inference": true,
    "model_names": [
        "minicpm-v"
    ],
    "api_keys": null
}
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:10<00:00, 1.43s/it]
2024-06-28 19:02:01.770 | INFO | api.models:create_hf_llm:81 - Using HuggingFace Engine
2024-06-28 19:02:01.770 | INFO | api.engine.hf:init:82 - Using minicpm-v Model for Chat!
2024-06-28 19:02:01.770 | INFO | api.engine.hf:init:83 - Using <api.templates.base.ChatTemplate object at 0x7f08429a4460> for Chat!
INFO: Started server process [1092]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2024-06-28 19:02:24.431 | DEBUG | api.routes.chat:create_chat_completion:56 - ==== request ====
{'model': 'glm-4v', 'frequency_penalty': 0.0, 'function_call': None, 'functions': None, 'logit_bias': None, 'logprobs': False, 'max_tokens': 1024, 'n': 1, 'presence_penalty': 0.0, 'response_format': None, 'seed': None, 'stop': [], 'temperature': 0.9, 'tool_choice': None, 'tools': None, 'top_logprobs': None, 'top_p': 1.0, 'user': None, 'stream': False, 'repetition_penalty': 1.03, 'typical_p': None, 'watermark': False, 'best_of': 1, 'ignore_eos': False, 'use_beam_search': False, 'stop_token_ids': [], 'skip_special_tokens': True, 'spaces_between_special_tokens': True, 'min_p': 0.0, 'include_stop_str_in_output': False, 'length_penalty': 1.0, 'guided_json': None, 'guided_regex': None, 'guided_choice': None, 'guided_grammar': None, 'guided_decoding_backend': None, 'prompt_or_messages': [{'role': 'user', 'content': '你好'}], 'echo': False}
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/root/miniconda3/envs/minicpm/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/root/miniconda3/envs/minicpm/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/envs/minicpm/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/minicpm/lib/python3.8/site-packages/transformers/generation/utils.py", line 1914, in generate
    result = self._sample(
  File "/root/miniconda3/envs/minicpm/lib/python3.8/site-packages/transformers/generation/utils.py", line 2693, in _sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either inf, nan or element < 0
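This `torch.multinomial` error generally means the sampling distribution contained NaN or Inf values (often from a numerical issue in the logits during generation). A minimal sketch, not taken from this repo, of how the error arises and of one common mitigation, greedy decoding (`do_sample=False` in HF `generate`), which bypasses the sampler:

```python
import torch

# NaN logits poison the softmax: the whole probability row becomes NaN,
# which is exactly what torch.multinomial rejects.
logits = torch.tensor([[1.0, float("nan"), 0.5]])
probs = torch.softmax(logits, dim=-1)

failed = False
try:
    torch.multinomial(probs, num_samples=1)
except RuntimeError:
    failed = True  # same class of error as in the traceback above

# Greedy decoding avoids multinomial sampling entirely; masking NaNs
# to -inf makes the argmax well-defined in this toy example.
next_token = torch.argmax(torch.nan_to_num(logits, nan=float("-inf")), dim=-1)
```

This only illustrates the failure mode; it does not explain why the logits went bad in the first place (dtype, prompt template, or concurrency issues are all candidates).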

@liHai001

liHai001 commented Aug 6, 2024

Has this been resolved? I'm using stream access: a single request works fine, but with 2 concurrent requests I hit this same error. I wonder whether the multithreaded stream handling wrapped around MiniCPMV's model.generate has a problem.
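If the root cause really is unsynchronized concurrent calls into a non-thread-safe `model.generate` (a guess, not confirmed anywhere in this thread), one common workaround is to serialize generation behind a lock; a minimal sketch with a stand-in for the model:

```python
import threading

# Hypothetical wrapper: a global lock so two concurrent streaming
# requests can never run generate() at the same time.
_generate_lock = threading.Lock()

def safe_generate(generate_fn, *args, **kwargs):
    """Run generate_fn under the lock, one request at a time."""
    with _generate_lock:
        return generate_fn(*args, **kwargs)

# Demo with a stand-in for model.generate:
results = []

def fake_generate(x):
    results.append(x)  # safe: appends never interleave under the lock
    return x * 2

threads = [
    threading.Thread(target=safe_generate, args=(fake_generate, i))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This trades throughput for safety: requests queue up instead of failing, which matches the maintainer's note below that concurrency simply isn't supported yet.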

@xusenlinzy
Owner

Concurrency is not supported at the moment.
