Taking qwen2.5 as an example, I noticed that the model weights are always pulled directly from Hugging Face. Is it possible to point the model at a given path for the weights and run it from there? Current config (see the sketch after this block for what I have in mind):
```yaml
qwen2.5-7b-instruct-l4:
  enabled: false
  url: "hf://Qwen/Qwen2.5-7B-Instruct"
  features: [TextGeneration]
  env:
    VLLM_ATTENTION_BACKEND: "FLASHINFER"
    # VLLM_USE_V1: "1"
  args:
    - --max-model-len=8192
    - --max-num-batched-tokens=8192
    - --max-num-seqs=256
    - --gpu-memory-utilization=0.95
    - --kv-cache-dtype=fp8
    - --enable-prefix-caching
    # - --enforce-eager
  engine: VLLM
  resourceProfile: 'nvidia-gpu-l4:1'
```
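
A minimal sketch of the idea, assuming the deployed version supports a `pvc://` URL scheme for loading weights from a PersistentVolumeClaim instead of `hf://` (please check the project docs for which URL schemes your release actually accepts); the PVC name `qwen-weights` and the sub-path `Qwen2.5-7B-Instruct` below are hypothetical placeholders:

```yaml
# Hypothetical: serve weights already present on a PVC instead of pulling from Hugging Face.
# Assumes pvc:// is a supported URL scheme in the installed version; names are placeholders.
qwen2.5-7b-instruct-l4-local:
  enabled: true
  url: "pvc://qwen-weights/Qwen2.5-7B-Instruct"
  features: [TextGeneration]
  env:
    VLLM_ATTENTION_BACKEND: "FLASHINFER"
  args:
    - --max-model-len=8192
    - --gpu-memory-utilization=0.95
  engine: VLLM
  resourceProfile: 'nvidia-gpu-l4:1'
```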