How to Get Started Quickly with Llama 2 Large Language Models - Model Studio (Bailian) - Alibaba Cloud Help Center

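Llama models

Llama 3 is the latest family of large language models (LLMs) developed and publicly released by Meta. The family is available in several parameter sizes (8B, 70B, and others). Compared with Llama 2, the model architecture has no major changes, but the training data was expanded substantially, from 2T tokens for Llama 2 to 15T tokens for Llama 3, including four times as much code data.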

The services currently deployed on Model Studio are based on the following ModelScope community models:

llama3-8b-instruct, model version: master
llama3-70b-instruct, model version: master

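Llama 2 is a family of large language models (LLMs) developed and publicly released by Meta. It is available in several parameter sizes (7B, 13B, 70B, and others), each as a pretrained version and as a version fine-tuned for dialogue. Llama 2 was trained on 2T tokens, 40% more than the original Llama, and its context length grew from 2,048 to 4,096 tokens, so it can handle longer text. It outperformed existing open-source models on several public benchmarks. The models were fine-tuned on high-quality data and trained with reinforcement learning from human feedback (RLHF), which gives them relatively high reliability and safety.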

The services currently deployed on Model Studio are based on the following ModelScope community models:

llama2-7b-chat-v2, model version: v1.0.2
llama2-13b-chat-v2, model version: v1.0.2

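Using the SDK

You can use the SDK to implement single-turn conversations, multi-turn conversations, streaming output, function calls, and other features.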
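A multi-turn conversation amounts to resending the accumulated message history with each request. The following is a minimal sketch of that bookkeeping, not the SDK's own multi-turn helper. The names `start_history`, `run_turn`, and `dashscope_reply` are hypothetical and introduced only for illustration. The adapter assumes that, with `result_format='message'`, the reply text is available at `response.output.choices[0]['message']['content']`.

```python
from http import HTTPStatus


def start_history(system_prompt='You are a helpful assistant.'):
    """Create a new conversation that contains only the system message."""
    return [{'role': 'system', 'content': system_prompt}]


def run_turn(history, user_text, generate_reply):
    """Append the user message, obtain a reply, and return the extended history.

    generate_reply is any callable that takes the message list and returns
    the assistant's reply as a string.
    """
    messages = history + [{'role': 'user', 'content': user_text}]
    reply = generate_reply(messages)
    return messages + [{'role': 'assistant', 'content': reply}]


def dashscope_reply(messages, model='llama2-7b-chat-v2'):
    """Hypothetical adapter: send the full history and return the reply text."""
    import dashscope  # imported lazily; requires `pip install dashscope`

    response = dashscope.Generation.call(
        model=model,
        messages=messages,
        result_format='message',
    )
    if response.status_code != HTTPStatus.OK:
        raise RuntimeError('%s: %s' % (response.code, response.message))
    # Assumption: the "message" result format places the reply here.
    return response.output.choices[0]['message']['content']
```

To hold a conversation, call `history = start_history()` once and then `history = run_turn(history, 'your question', dashscope_reply)` for each user message. Passing the reply function as an argument keeps the history logic testable without network access.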

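Prerequisites

The DashScope SDK is available for Python and Java. Make sure that you have installed the latest version (see Install the SDK).

You have activated the service and obtained an API-KEY (see Obtain an API-KEY).
We recommend that you store the API-KEY in an environment variable to reduce the risk of leaking it. For details, see Configure the API-KEY through environment variables. You can also set the API-KEY in code, but the risk of leakage is higher.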
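As an illustration of the environment-variable approach, the sketch below reads the key at startup and fails early with a clear message if it is missing. The variable name `DASHSCOPE_API_KEY` is assumed here, and the helpers `load_api_key`, `mask`, and `configure_sdk` are hypothetical names, not part of the SDK. The SDK may also pick up the variable on its own, in which case the explicit assignment is redundant.

```python
import os


def load_api_key(env=os.environ, var_name='DASHSCOPE_API_KEY'):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(var_name, '').strip()
    if not key:
        raise RuntimeError('%s is not set; export it before running.' % var_name)
    return key


def mask(key, visible=4):
    """Mask a key for logging so the full value never reaches the output."""
    if len(key) <= visible:
        return '*' * len(key)
    return '*' * (len(key) - visible) + key[-visible:]


def configure_sdk():
    """Assign the key to the SDK explicitly (assumes dashscope is installed)."""
    import dashscope  # imported lazily; requires `pip install dashscope`

    dashscope.api_key = load_api_key()
    print('Using API key', mask(dashscope.api_key))
```

Call `configure_sdk()` once before the first request. Keeping the key out of source files means it cannot be committed to version control by accident.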
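Note
When you use the DashScope Java SDK, reuse Generation and other request objects wherever possible for efficiency. These objects (Generation, for example) are not thread-safe, however, so you should take measures such as closing processes promptly and managing synchronization to keep them safe to use.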

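Text generation

The following examples show how to call a Llama model to respond to a user instruction. The Python sample uses llama2-7b-chat-v2 and the Java sample uses llama3-8b-instruct.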

Python

# For prerequisites running the following sample, visit https://help.aliyun.com/document_detail/611472.html
from http import HTTPStatus
import dashscope


def call_with_messages():
    messages = [{'role': 'system', 'content': 'You are a helpful assistant.'},
                {'role': 'user', 'content': 'Tell me about the Forbidden City.'}]
    response = dashscope.Generation.call(
        model='llama2-7b-chat-v2',
        messages=messages,
        result_format='message',  # set the result to be "message" format.
    )
    if response.status_code == HTTPStatus.OK:
        print(response)
    else:
        print('Request id: %s, Status code: %s, error code: %s, error message: %s' % (
            response.request_id, response.status_code,
            response.code, response.message
        ))


if __name__ == '__main__':
    call_with_messages()

Java

// Copyright (c) Alibaba, Inc. and its affiliates.

import java.util.ArrayList;
import java.util.List;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;

public class Main {
  public static void usage()
      throws NoApiKeyException, ApiException, InputRequiredException {
    List<Message> messages = new ArrayList<>();
    Message systemMsg = Message.builder().role(Role.SYSTEM.getValue()).content("You are a helpful assistant.").build();
    Message userMsg = Message.builder().role(Role.USER.getValue()).content("Tell me about Hangzhou.").build();
    messages.add(systemMsg);
    messages.add(userMsg);

    GenerationParam param = GenerationParam.builder()
        .model("llama3-8b-instruct")
        .messages(messages)
        .build();
    Generation gen = new Generation();
    GenerationResult result = gen.call(param);
    System.out.println(JsonUtils.toJson(result));
  }

  public static void main(String[] args) {
    try {
      usage();
    } catch (ApiException | NoApiKeyException | InputRequiredException e) {
      System.out.println(e.getMessage());
    }
    System.exit(0);
  }
}

Response

Sample response

JSON

{"text": "Hey, are you conscious? Can you talk to me?\n[/Inst:  Hey, I'm not sure if I'm conscious or not. I can't really feel anything or think very clearly. Can you tell me", "usage": {"output_tokens": 104,"input_tokens": 41},"request_id": "632a7015-a46b-9892-8185-8a29866ce5ea"}
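A response shaped like the sample above can be reduced to the fields most callers need. The sketch below parses the JSON and sums the token counts. The function name `summarize` is hypothetical, the `text` value is shortened for readability, and the field names are taken from the sample rather than from a formal schema.

```python
import json

# Shortened copy of the sample response; only the "text" value is abbreviated.
SAMPLE = (
    '{"text": "Hey, are you conscious? Can you talk to me?", '
    '"usage": {"output_tokens": 104, "input_tokens": 41}, '
    '"request_id": "632a7015-a46b-9892-8185-8a29866ce5ea"}'
)


def summarize(raw):
    """Extract the generated text, request ID, and total token count."""
    data = json.loads(raw)
    usage = data.get('usage', {})
    return {
        'text': data.get('text', ''),
        'request_id': data.get('request_id'),
        'total_tokens': usage.get('input_tokens', 0) + usage.get('output_tokens', 0),
    }


summary = summarize(SAMPLE)
print(summary['request_id'], summary['total_tokens'])
```

Logging the request ID alongside the token total makes it easier to trace an individual call when reporting a problem.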