
MiniCPM starts fine, but inference requests fail #292

Open
2 tasks done
760485464 opened this issue Jun 28, 2024 · 2 comments

Comments

@760485464

The following items must be checked before submission

  • Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • I have read the FAQ section of the project documentation and searched the existing issues and discussions without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

Startup produces no errors, but the API request fails. Request body:

{
  "model": "minicpm-v",
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "这张图片是什么地方?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "http://djclub.cdn.bcebos.com/uploads/images/pageimg/20230325/64-2303252115313.jpg"
          }
        }
      ]
    }
  ]
}
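For reference, a request like the one above can be sent from the standard library alone; the endpoint URL is an assumption inferred from the host, port, and api_prefix values in the logs below:

```python
import json

# Same payload as the failing request; the Chinese text and image URL
# are taken verbatim from the issue.
payload = {
    "model": "minicpm-v",
    "stream": False,
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "这张图片是什么地方?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "http://djclub.cdn.bcebos.com/uploads/images/pageimg/20230325/64-2303252115313.jpg"
                    },
                },
            ],
        }
    ],
}

body = json.dumps(payload, ensure_ascii=False)

# To actually send it (requires the server from this issue to be running;
# the URL below is an assumption based on host=0.0.0.0, port=8000,
# api_prefix=/v1 in the SETTINGS log):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```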

Dependencies

No response

Runtime logs or screenshots

(minicpm) root@autodl-container-acc74095be-7fd6b47a:~/autodl-tmp/api-for-open-llm# python server.py
2024-06-28 19:01:43.514 | DEBUG | api.config::281 - SETTINGS: {
    "model_name": "minicpm-v",
    "model_path": "/root/autodl-tmp/models/MiniCPM-Llama3-V-2_5",
    "dtype": "bfloat16",
    "load_in_8bit": false,
    "load_in_4bit": false,
    "context_length": 2048,
    "chat_template": "minicpm-v",
    "rope_scaling": null,
    "flash_attn": false,
    "interrupt_requests": true,
    "host": "0.0.0.0",
    "port": 8000,
    "api_prefix": "/v1",
    "engine": "default",
    "tasks": [
        "llm"
    ],
    "device_map": "auto",
    "gpus": null,
    "num_gpus": 1,
    "activate_inference": true,
    "model_names": [
        "minicpm-v"
    ],
    "api_keys": null
}
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:10<00:00, 1.43s/it]
2024-06-28 19:02:01.770 | INFO | api.models:create_hf_llm:81 - Using HuggingFace Engine
2024-06-28 19:02:01.770 | INFO | api.engine.hf:init:82 - Using minicpm-v Model for Chat!
2024-06-28 19:02:01.770 | INFO | api.engine.hf:init:83 - Using <api.templates.base.ChatTemplate object at 0x7f08429a4460> for Chat!
INFO: Started server process [1092]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2024-06-28 19:02:24.431 | DEBUG | api.routes.chat:create_chat_completion:56 - ==== request ====
{'model': 'glm-4v', 'frequency_penalty': 0.0, 'function_call': None, 'functions': None, 'logit_bias': None, 'logprobs': False, 'max_tokens': 1024, 'n': 1, 'presence_penalty': 0.0, 'response_format': None, 'seed': None, 'stop': [], 'temperature': 0.9, 'tool_choice': None, 'tools': None, 'top_logprobs': None, 'top_p': 1.0, 'user': None, 'stream': False, 'repetition_penalty': 1.03, 'typical_p': None, 'watermark': False, 'best_of': 1, 'ignore_eos': False, 'use_beam_search': False, 'stop_token_ids': [], 'skip_special_tokens': True, 'spaces_between_special_tokens': True, 'min_p': 0.0, 'include_stop_str_in_output': False, 'length_penalty': 1.0, 'guided_json': None, 'guided_regex': None, 'guided_choice': None, 'guided_grammar': None, 'guided_decoding_backend': None, 'prompt_or_messages': [{'role': 'user', 'content': '你好'}], 'echo': False}
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/root/miniconda3/envs/minicpm/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/root/miniconda3/envs/minicpm/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/envs/minicpm/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/minicpm/lib/python3.8/site-packages/transformers/generation/utils.py", line 1914, in generate
    result = self._sample(
  File "/root/miniconda3/envs/minicpm/lib/python3.8/site-packages/transformers/generation/utils.py", line 2693, in _sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either inf, nan or element < 0
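This `torch.multinomial` error generally means the sampling distribution contained NaN or Inf values (often from a numerical issue in the logits during generation). A minimal sketch, not taken from this repo, of how the error arises and of one common mitigation, greedy decoding (`do_sample=False` in HF `generate`), which bypasses the sampler:

```python
import torch

# NaN logits poison the softmax: the whole probability row becomes NaN,
# which is exactly what torch.multinomial rejects.
logits = torch.tensor([[1.0, float("nan"), 0.5]])
probs = torch.softmax(logits, dim=-1)

failed = False
try:
    torch.multinomial(probs, num_samples=1)
except RuntimeError:
    failed = True  # same class of error as in the traceback above

# Greedy decoding avoids multinomial sampling entirely; masking NaNs
# to -inf makes the argmax well-defined in this toy example.
next_token = torch.argmax(torch.nan_to_num(logits, nan=float("-inf")), dim=-1)
```

This only illustrates the failure mode; it does not explain why the logits went bad in the first place (dtype, prompt template, or concurrency issues are all candidates).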

@liHai001

liHai001 commented Aug 6, 2024

Has this been resolved? I'm using stream access: a single request works fine, but with 2 concurrent requests I hit this same error. I wonder whether the multithreaded stream handling wrapped around MiniCPMV's model.generate has a problem.
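If the root cause really is unsynchronized concurrent calls into a non-thread-safe `model.generate` (a guess, not confirmed anywhere in this thread), one common workaround is to serialize generation behind a lock; a minimal sketch with a stand-in for the model:

```python
import threading

# Hypothetical wrapper: a global lock so two concurrent streaming
# requests can never run generate() at the same time.
_generate_lock = threading.Lock()

def safe_generate(generate_fn, *args, **kwargs):
    """Run generate_fn under the lock, one request at a time."""
    with _generate_lock:
        return generate_fn(*args, **kwargs)

# Demo with a stand-in for model.generate:
results = []

def fake_generate(x):
    results.append(x)  # safe: appends never interleave under the lock
    return x * 2

threads = [
    threading.Thread(target=safe_generate, args=(fake_generate, i))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This trades throughput for safety: requests queue up instead of failing, which matches the maintainer's note below that concurrency simply isn't supported yet.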

@xusenlinzy
Owner

Concurrency is not supported at the moment.
