Third-party agent integration

If the built-in skills or agents in the multimodal interaction development suite do not meet your needs, you can call a third-party agent. This topic describes how to integrate the capabilities of a third-party agent.

Protocol standard

The integration must be based on the Google A2A protocol. For more information, see the A2A 0.2.5 specification.

Integration process

1. Configure AgentCard

image

  • For the response format and an example for step 3, see AgentCard.

2. Call the agent

image

Field descriptions

AgentCard

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | String | Yes | The agent name. |
| description | String | Yes | The agent description. |
| url | String | Yes | The base HTTPS URL of the agent service. When streaming is supported, the /stream path suffix is automatically appended. |
| version | String | Yes | The agent version. |
| protocolVersion | String | Yes | The A2A protocol version that the agent supports. |
| capabilities | AgentCapabilities | Yes | The optional A2A protocol features that the agent supports, such as streaming. |
| security | String[] | No | The security requirements for communicating with the agent. The apiKey security policy is supported. When it is configured, requests to the agent include the header `X-API-KEY: <value>`, where the value is configured in the console. |
| defaultInputModes | String[] | Yes | The input media types that the agent accepts. Currently, "text/plain" is supported. |
| defaultOutputModes | String[] | Yes | The output media types that the agent generates. Currently, "text/plain" is supported. |
| skills | AgentSkill[] | Yes | A list of agent skills. At least one item is required. |

AgentCapabilities

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| streaming | Boolean | No | Specifies whether Server-Sent Events (SSE) streaming is supported. |
| extensions | AgentExtension[] | No | A list of supported extensions. |

AgentExtension

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| uri | String | Yes | The URI of the supported extension. The URI includes a description of the extension's capabilities. |
| params | Map<String, Object> | No | Configuration parameters. |

AgentSkill

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | String | Yes | The skill identifier. It must be unique within the agent. |
| name | String | Yes | The skill name. |
| description | String | Yes | The skill description. |
| tags | String[] | Yes | Keywords for the skill. |
| examples | String[] | No | Usage examples for the skill. |
| inputModes | String[] | No | The accepted input media types. If set, this overrides defaultInputModes for this skill. |
| outputModes | String[] | No | The output media types. If set, this overrides defaultOutputModes for this skill. |

Example

{
  "name": "Super AI Assistant",
  "description": "Repeats user input, calculates the sum of two numbers, counts user sentences, triggers a flash, and provides coaching for basketball and football. A versatile assistant.",
  "protocolVersion": "0.2.5",
  "url": "https://example/a2a/demo/v1",
  "version": "1.0.0",
  "capabilities": {
    "streaming": true,
    "extensions": []
  },
  "security": [],
  "defaultInputModes": [
    "text/plain"
  ],
  "defaultOutputModes": [
    "text/plain"
  ],
  "skills": [
    {
      "id": "ai-repeat",
      "name": "AI Repeater",
      "description": "Repeats what the user says.",
      "tags": [
        "demo",
        "repeat"
      ],
      "examples": [
        "Example: Repeat what I said."
      ]
    },
    {
      "id": "ai-calculate",
      "name": "AI Calculator",
      "description": "Calculates the 'sum' of two numbers.",
      "tags": [
        "demo",
        "calculate"
      ],
      "examples": [
        "Example: What is 1 plus 2?"
      ]
    },
    {
      "id": "ai-count",
      "name": "AI Counter",
      "description": "Records and counts the number of sentences the user has said.",
      "tags": [
        "demo",
        "count"
      ],
      "examples": [
        "Example: Count how many sentences I have said."
      ]
    },
    {
      "id": "ai-flash",
      "name": "AI Flash",
      "description": "Can perform a flash.",
      "tags": [
        "demo",
        "flash"
      ],
      "examples": [
        "Example: Perform a flash."
      ]
    },
    {
      "id": "ai-coach",
      "name": "AI Coach",
      "description": "Can teach you how to play basketball and football.",
      "tags": [
        "demo",
        "coach"
      ],
      "examples": [
        "Example: How to play basketball well."
      ]
    }
  ]
}
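
Before registering a card in the console, it can help to check it against the required fields listed in the tables above. The following is a minimal sketch of such a check, not an official validator: the function name `validate_agent_card` and the wording of the messages are assumptions made for illustration, and only the rules stated in this topic are checked.

```python
# Required top-level AgentCard fields, as marked "Yes" in the AgentCard table.
REQUIRED_CARD_FIELDS = (
    "name", "description", "url", "version", "protocolVersion",
    "capabilities", "defaultInputModes", "defaultOutputModes", "skills",
)
# Required fields of each AgentSkill entry.
REQUIRED_SKILL_FIELDS = ("id", "name", "description", "tags")


def validate_agent_card(card: dict) -> list[str]:
    """Return a list of problems found; an empty list means the card passed."""
    problems = [f"missing field: {f}" for f in REQUIRED_CARD_FIELDS if f not in card]

    skills = card.get("skills") or []
    if "skills" in card and not skills:
        problems.append("skills must contain at least one item")

    seen_ids = set()
    for i, skill in enumerate(skills):
        for f in REQUIRED_SKILL_FIELDS:
            if f not in skill:
                problems.append(f"skills[{i}] missing field: {f}")
        skill_id = skill.get("id")
        if skill_id is not None:
            # Skill IDs must be unique within the agent.
            if skill_id in seen_ids:
                problems.append(f"duplicate skill id: {skill_id}")
            seen_ids.add(skill_id)

    # "text/plain" is the media type this topic lists as currently supported.
    for key in ("defaultInputModes", "defaultOutputModes"):
        if key in card and "text/plain" not in card[key]:
            problems.append(f"{key} should include text/plain")
    return problems
```

Calling `validate_agent_card` on the example card above (loaded with `json.loads`) should return an empty list, because every required field is present and the skill IDs are distinct.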

Agent call request fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| jsonrpc | String | Yes | Fixed value: "2.0". |
| method | String | Yes | "message/stream": used for SSE streaming requests. "message/send": used for non-streaming requests. |
| params | MessageSendParams | Yes | Request parameters. |
| id | String | Yes | The request ID. |

MessageSendParams

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| message | Message | Yes | The content of the message to send. |

Message

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| kind | String | Yes | Fixed value: "message". |
| role | String | Yes | "user": The message is sent by the user. "agent": The message is returned by the agent. |
| parts | Part[] | Yes | An array of content parts. It must contain at least one part. |
| messageId | String | Yes | The message identifier generated by the message sender. |
| contextId | String | No | The context identifier associated with the message. |
| metadata | Map<String, Object> | No | Metadata associated with this message. |

Part

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| kind | String | Yes | The content type of this part. For example, "text" indicates text content. |
| text | String | No | The text content of the part. Set this field when kind is "text". |

Examples

HTTP request
{
  "jsonrpc": "2.0",
  "id": "request-1",
  "method": "message/send",
  "params": {
    "message": {
      "messageId": "msg-1",
      "kind": "message",
      "role": "user",
      "parts": [
        {
          "kind": "text",
          "text": "Will it rain today?"
        }
      ]
    }
  }
}
HTTP SSE request
{
  "jsonrpc": "2.0",
  "id": "request-1",
  "method": "message/stream",
  "params": {
    "message": {
      "messageId": "msg-1",
      "kind": "message",
      "role": "user",
      "parts": [
        {
          "kind": "text",
          "text": "Will it rain today?"
        }
      ]
    }
  }
}
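
To exercise an agent locally with requests shaped like the two examples above, the request can be assembled with the Python standard library. This is a sketch under stated assumptions, not the service's actual client: the helper name `build_agent_request` is hypothetical, the `Accept: text/event-stream` header is an assumption for SSE calls, and the "/stream" suffix is applied on the caller side here based on the url field description in the AgentCard table.

```python
import json
import urllib.request
import uuid


def build_agent_request(base_url, text, streaming=False, api_key=None):
    """Build (but do not send) a JSON-RPC 2.0 call to an A2A agent."""
    payload = {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        # message/stream for SSE streaming, message/send otherwise.
        "method": "message/stream" if streaming else "message/send",
        "params": {
            "message": {
                "messageId": str(uuid.uuid4()),
                "kind": "message",
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
            }
        },
    }
    headers = {"Content-Type": "application/json"}
    if streaming:
        # Assumption: the agent serves SSE, so ask for an event stream.
        headers["Accept"] = "text/event-stream"
    if api_key is not None:
        # Sent only when the apiKey security policy is configured.
        headers["X-API-KEY"] = api_key
    # Assumption: "/stream" is appended to the AgentCard url for streaming calls.
    url = (base_url.rstrip("/") + "/stream") if streaming else base_url
    return urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), headers=headers, method="POST"
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it. Note that `urllib` normalizes the capitalization of header names it stores, which does not matter on the wire because HTTP header names are case-insensitive.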

Agent call response fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| jsonrpc | String | Yes | Fixed value: "2.0". |
| id | String | Yes | The request ID. It is the same as the value of JSONRPCRequest.id. |
| result | Task, TaskStatusUpdateEvent, or TaskArtifactUpdateEvent | No | Returned when the request is processed successfully. |
| error | JSONRpcError | No | Returned when the request fails to be processed. |

Task

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | String | Yes | A unique task identifier generated by the server, such as a UUID. |
| contextId | String | Yes | A server-generated ID for context alignment across multiple interaction turns. |
| status | TaskStatus | Yes | The current status of the task. |
| artifacts | Artifact[] | No | The output generated by the agent in this task. |

TaskStatus

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| state | String | Yes | The current lifecycle state of the task. The valid values are listed after this table. |
| timestamp | String | No | The timestamp when this status was recorded. UTC time is recommended. |

Valid values of state:

  • submitted: The agent has received and confirmed the task, but processing has not started.

  • working: The agent is processing the task. The client can expect further updates or a final state.

  • completed: The task is successfully completed.

  • failed: An error occurred during processing.

  • input-required: The agent requires additional input from the user to continue. The multimodal interaction service will then pass the user's next input to this agent.

  • rejected: The task is terminated. The multimodal interaction service will then take over the processing of the current request.
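
The task states described above can be summarized as a small lookup from state to follow-up action. The sketch below is illustrative only: the state names come from this topic, but the action labels and the function name `next_action` are hypothetical and do not correspond to any API of the multimodal interaction service.

```python
# Hypothetical action labels; the state names are the ones documented for TaskStatus.
ACTIONS = {
    "submitted": "wait",                                 # received, not yet started
    "working": "wait",                                   # more updates are expected
    "completed": "deliver-result",                       # task finished successfully
    "failed": "report-error",                            # processing error
    "input-required": "route-next-user-input-to-agent",  # agent needs more input
    "rejected": "hand-back-to-platform",                 # platform takes over the request
}


def next_action(state: str) -> str:
    """Map a TaskStatus.state value to an illustrative follow-up action."""
    try:
        return ACTIONS[state]
    except KeyError:
        raise ValueError(f"unknown task state: {state!r}") from None
```

Raising on an unknown state, rather than silently waiting, makes a typo in the agent's state value visible during testing.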

Artifact

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| artifactId | String | Yes | The identifier for the result generated by the agent. |
| parts | Part[] | Yes | The content of the result. It must contain at least one part. |
| metadata | Map<String, Object> | No | Metadata associated with this artifact. |

TaskStatusUpdateEvent

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| taskId | String | Yes | The ID of the task being updated. |
| contextId | String | Yes | The context ID of the associated task. |
| kind | String | Yes | Fixed value: "status-update". |
| status | TaskStatus | Yes | The new status of the task. |
| final | Boolean | No | If true, this is the final status update for the current stream. The server usually closes the SSE connection afterward. |

TaskArtifactUpdateEvent

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| taskId | String | Yes | The task ID associated with the generated result part. |
| contextId | String | Yes | The context ID of the associated task. |
| kind | String | Yes | Fixed value: "artifact-update". |
| artifact | Artifact | Yes | The result data. It can be a complete result or an incremental result. |
| append | Boolean | No | If true, this part is appended to the returned result. If false (default), it replaces the returned result. |
| lastChunk | Boolean | No | If true, this is the final update for this result. |

JSONRpcError

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | Yes | The error code. |
| message | String | Yes | A description of the error. |

Examples

HTTP response
{
  "id": "request-1",
  "jsonrpc": "2.0",
  "result": {
    "id": "task-1",
    "contextId": "context-1",
    "kind": "task",
    "status": {
      "state": "completed",
      "timestamp": "2025-07-15T14:50:28.575338Z"
    },
    "artifacts": [
      {
        "artifactId": "c3fee4d5-7234-48a1-8d2c-cfb715c5ce9e",
        "parts": [
          {
            "kind": "text",
            "text": "The weather is sunny today, "
          }
        ]
      },
      {
        "artifactId": "c3fee4d5-7234-48a1-8d2c-cfb715c5ce9e",
        "parts": [
          {
            "kind": "text",
            "text": "no rain."
          }
        ]
      }
    ]
  }
}
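
A non-streaming response like the one above carries either a result or an error. As a rough sketch of how a caller might read it, the hypothetical helper `task_text` below joins the text parts of all artifacts in order and raises if the response contains an error object. It assumes the result is a Task, as in the "message/send" example.

```python
def task_text(response: dict) -> str:
    """Return the concatenated text of a Task result, or raise on a JSON-RPC error."""
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"agent error {err['code']}: {err['message']}")

    task = response["result"]
    chunks = []
    # artifacts is optional on a Task, so default to an empty list.
    for artifact in task.get("artifacts", []):
        for part in artifact["parts"]:
            if part.get("kind") == "text":
                chunks.append(part.get("text", ""))
    return "".join(chunks)
```

Applied to the HTTP response example above, this returns "The weather is sunny today, no rain.".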
HTTP SSE response
{
  "id": "request-1",
  "jsonrpc": "2.0",
  "result": {
    "id": "task-1",
    "contextId": "context-1",
    "kind": "task",
    "status": {
      "state": "submitted",
      "timestamp": "2025-07-15T14:52:28.277547Z"
    }
  }
}

{
  "id": "request-1",
  "jsonrpc": "2.0",
  "result": {
    "taskId": "task-1",
    "contextId": "context-1",
    "kind": "artifact-update",
    "artifact": {
      "artifactId": "82eb84b9-0d73-4072-95f8-03655adfbf25",
      "parts": [
        {
          "kind": "text",
          "text": "The weather is sunny today, "
        }
      ]
    },
    "append": true,
    "lastChunk": false
  }
}

{
  "id": "request-1",
  "jsonrpc": "2.0",
  "result": {
    "taskId": "task-1",
    "contextId": "context-1",
    "kind": "artifact-update",
    "artifact": {
      "artifactId": "82eb84b9-0d73-4072-95f8-03655adfbf25",
      "parts": [
        {
          "kind": "text",
          "text": "no rain."
        }
      ]
    },
    "append": true,
    "lastChunk": true
  }
}

{
  "id": "request-1",
  "jsonrpc": "2.0",
  "result": {
    "taskId": "task-1",
    "contextId": "context-1",
    "kind": "status-update",
    "status": {
      "state": "completed",
      "timestamp": "2025-07-15T14:52:28.277643Z"
    },
    "final": true
  }
}
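
The streamed events above can be reassembled on the receiving side by applying the append and final flags from the field tables. The following sketch assumes each JSON object arrives as a standard SSE `data:` field terminated by a blank line, which this topic does not spell out. The function names `iter_sse_events` and `collect_stream` are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield one parsed JSON object per SSE event from an iterable of text lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # SSE strips a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data:
            yield json.loads("\n".join(data))
            data = []
    if data:  # stream ended without a trailing blank line
        yield json.loads("\n".join(data))


def collect_stream(lines):
    """Return (last_state, text) assembled from streamed JSON-RPC responses."""
    texts = {}    # artifactId -> accumulated text, in first-seen order
    state = None
    for event in iter_sse_events(lines):
        if "error" in event:
            err = event["error"]
            raise RuntimeError(f"agent error {err['code']}: {err['message']}")
        result = event["result"]
        kind = result.get("kind")
        if kind == "artifact-update":
            artifact = result["artifact"]
            chunk = "".join(
                p.get("text", "") for p in artifact["parts"] if p.get("kind") == "text"
            )
            aid = artifact["artifactId"]
            if result.get("append", False):
                texts[aid] = texts.get(aid, "") + chunk   # append to the result so far
            else:
                texts[aid] = chunk                        # replace the result
        elif kind in ("task", "status-update"):
            state = result["status"]["state"]
            if result.get("final"):
                break  # final status update: stop reading the stream
    return state, "".join(texts.values())
```

Fed the four events from the example above, `collect_stream` returns the state "completed" and the text "The weather is sunny today, no rain.".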

Integration example

To test the integration process for a self-developed agent, enter https://example/.well-known/agent.json in the console:

image
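
For local experiments, an AgentCard can be served at the /.well-known/agent.json path with the Python standard library. This is only a sketch under stated assumptions: the card content is a trimmed, hypothetical variant of the example in this topic, the names `AgentCardHandler` and `make_server` are invented for illustration, and the server speaks plain HTTP, so an HTTPS endpoint (for example, behind a TLS-terminating proxy) would still be needed before entering the URL in the console.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical card for illustration; fields follow the AgentCard table in this topic.
AGENT_CARD = {
    "name": "Demo Agent",
    "description": "Repeats user input.",
    "protocolVersion": "0.2.5",
    "url": "https://example/a2a/demo/v1",
    "version": "1.0.0",
    "capabilities": {"streaming": True, "extensions": []},
    "security": [],
    "defaultInputModes": ["text/plain"],
    "defaultOutputModes": ["text/plain"],
    "skills": [
        {
            "id": "ai-repeat",
            "name": "AI Repeater",
            "description": "Repeats what the user says.",
            "tags": ["demo", "repeat"],
        }
    ],
}


class AgentCardHandler(BaseHTTPRequestHandler):
    """Serve the AgentCard as JSON at the well-known path; return 404 elsewhere."""

    def do_GET(self):
        if self.path == "/.well-known/agent.json":
            body = json.dumps(AGENT_CARD).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def make_server(port=8080):
    """Create (but do not start) a local HTTP server bound to 127.0.0.1."""
    return HTTPServer(("127.0.0.1", port), AgentCardHandler)
```

Calling `make_server().serve_forever()` starts the server; the card is then available at http://127.0.0.1:8080/.well-known/agent.json for inspection before it is published behind HTTPS.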

Further integration

You have now completed the basic integration of your self-developed agent with the multimodal interaction suite. For further integration, see the following topics: