Image-to-image search - Alibaba Cloud Model Studio

The search by image tool enables the model to search the Internet for visually similar images based on an input image. The model can then analyze the search results and make inferences. This feature is useful for scenarios such as finding similar products or tracing the origin of visual content.

How to use

You can call the search by image feature through the Responses API. Add the image_search tool to the tools parameter and pass the image in the input parameter in multimodal format.

The input parameter must contain image content. Pass the image URL using the input_image type. You can also pass text using the input_text type to provide additional information for the search.

# Import dependencies and create a client...
input_content = [
    {"type": "input_text", "text": "Find landscape images with a style similar to this one"},
    {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
]
response = client.responses.create(
    model="qwen3.7-plus",
    input=[{"role": "user", "content": input_content}],
    tools=[{"type": "image_search"}]
)

print(response.output_text)
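The request shape above can be wrapped in a small helper so that the mandatory image part is never forgotten. This is a minimal sketch: `build_image_search_input` is a hypothetical name, not part of any SDK, and the example URL is a placeholder.

```python
from typing import Optional


def build_image_search_input(image_url: str, text: Optional[str] = None) -> list:
    """Build the `input` list for a search-by-image request.

    Hypothetical helper: the image part is mandatory, the text part is optional.
    """
    if not image_url:
        raise ValueError("image_url is required: the input must contain image content")

    content = []
    if text:
        # Optional extra instructions for the search
        content.append({"type": "input_text", "text": text})
    # The image itself, passed by URL with the input_image type
    content.append({"type": "input_image", "image_url": image_url})
    return [{"role": "user", "content": content}]


messages = build_image_search_input(
    "https://example.com/landscape.png",  # placeholder, replace with a public image URL
    text="Find landscape images with a style similar to this one",
)
print(messages)
```

The returned list can be passed as the input argument of client.responses.create, alongside tools=[{"type": "image_search"}].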

Supported models

Recommended models

For optimal tool calling performance, use the following models:

Qwen-Plus: Qwen3.7-Plus series, Qwen3.6-Plus series, Qwen3.5-Plus series

Qwen-Max: qwen3.7-max-2026-06-08

Other models

The following models also support this tool, but they do not perform as well as the recommended models.

Qwen-Flash: Qwen3.6-Flash series, Qwen3.5-Flash series
Qwen3.6 open-source series (except qwen3.6-27b)
Qwen3.5 open-source series

This feature can only be called through the Responses API.

Getting started

You can run the following code to call the search by image tool through the Responses API and search for similar or related images based on the input image.

Before you start, obtain an API key and configure the API key as an environment variable.

Replace the image_url in the sample code with a publicly accessible image URL.
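Both prerequisites can be checked before sending a request. The sketch below uses hypothetical helpers, `load_api_key` and `looks_like_public_url`. The URL check only inspects the shape of the URL (an http or https scheme and a host); it cannot confirm that the image is actually reachable from the Internet.

```python
import os
from urllib.parse import urlparse


def load_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key


def looks_like_public_url(image_url: str) -> bool:
    """Return True if the value has an http(s) scheme and a host.

    This is a shape check only; it does not test reachability.
    """
    parsed = urlparse(image_url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(looks_like_public_url("https://example.com/landscape.png"))  # placeholder URL
print(looks_like_public_url("/home/user/landscape.png"))  # a local path is rejected
```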

Python

import os
import json
from openai import OpenAI

client = OpenAI(
    # If you have not configured the environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
)

input_content = [
    {"type": "input_text", "text": "Find landscape images with a style similar to this one"},
    # Replace image_url with the actual public URL of the image
    {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
]

response = client.responses.create(
    model="qwen3.7-plus",
    input=[{"role": "user", "content": input_content}],
    tools=[
        {
            "type": "image_search"
        }
    ]
)

# Traverse the output to display each step
for item in response.output:
    if item.type == "image_search_call":
        print(f"[Tool Call] Search by image (status: {item.status})")
        # Parse and display the list of searched images
        if item.output:
            images = json.loads(item.output)
            print(f"  Found {len(images)} images:")
            for img in images[:5]:  # Display the first 5 images
                print(f"  [{img['index']}] {img['title']}")
                print(f"      {img['url']}")
            if len(images) > 5:
                print(f"  ... {len(images)} images in total")
    elif item.type == "message":
        print("\n[Model Response]")
        print(response.output_text)

# Display token usage and tool call statistics
print(f"\n[Token Usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")
if hasattr(response.usage, 'x_tools') and response.usage.x_tools:
    for tool_name, info in response.usage.x_tools.items():
        print(f"[Tool Statistics] {tool_name} call count: {info.get('count', 0)}")

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    // If you have not configured the environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx",
    apiKey: process.env.DASHSCOPE_API_KEY,
    // The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.7-plus",
        input: [
            {
                role: "user",
                content: [
                    { type: "input_text", text: "Find landscape images with a style similar to this one" },
                    // Replace image_url with the actual public URL of the image
                    { type: "input_image", image_url: "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png" }
                ]
            }
        ],
        tools: [
            { type: "image_search" }
        ]
    });

    // Traverse the output to display each step
    for (const item of response.output) {
        if (item.type === "image_search_call") {
            console.log(`[Tool Call] Search by image (status: ${item.status})`);
            // Parse and display the list of searched images
            if (item.output) {
                const images = JSON.parse(item.output);
                console.log(`  Found ${images.length} images:`);
                images.slice(0, 5).forEach(img => {
                    console.log(`  [${img.index}] ${img.title}`);
                    console.log(`      ${img.url}`);
                });
                if (images.length > 5) {
                    console.log(`  ... ${images.length} images in total`);
                }
            }
        } else if (item.type === "message") {
            console.log(`\n[Model Response]`);
            console.log(response.output_text);
        }
    }

    // Display token usage and tool call statistics
    console.log(`\n[Token Usage] Input: ${response.usage.input_tokens}, Output: ${response.usage.output_tokens}, Total: ${response.usage.total_tokens}`);
    if (response.usage && response.usage.x_tools) {
        for (const [toolName, info] of Object.entries(response.usage.x_tools)) {
            console.log(`[Tool Statistics] ${toolName} call count: ${info.count || 0}`);
        }
    }
}

main();

curl

# The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": [
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Find landscape images with a style similar to this one"},
                {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
            ]
        }
    ],
    "tools": [
        {"type": "image_search"}
    ]
}'

After you run the code, you receive a response similar to the following one:

[Tool Call] Search by image (status: completed)
  Found 2 images:
  [1] QingMing Festival Holiday Notice 2024
      https://www.healthcabin.net/blog/wp-content/uploads/2024/04/QingMing-Festival-Holiday-Notice-2024.jpg
  [2] Serene Asian Landscape Stone Bridge Reflecting in Misty Water
      https://thumbs.dreamstime.com/b/serene-asian-landscape-stone-bridge-reflecting-misty-water-tranquil-illustration-traditional-arch-spanning-lake-style-376972039.jpg

[Model Response]
OK. I have found several landscape images with a similar style.

These images all display the artistic conception of typical Chinese ink wash paintings or traditional landscape paintings, and they share the following common points:
*   **Traditional architecture**: such as pavilions, towers, and arch bridges.
*   **Natural elements**: such as distant mountains, lakes, weeping willows, and lotus flowers.
*   **Artistic style**: uses elegant colors and soft lines to create a quiet and serene atmosphere.

...

[Token Usage] Input: 2753, Output: 181, Total: 2934
[Tool Statistics] image_search call count: 1
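The samples above treat the output field of an image_search_call item as a JSON array whose entries carry index, title, and url. If you want typed records instead of raw dictionaries, a sketch along the following lines may help. `FoundImage` and `parse_image_search_output` are hypothetical names, and the assumption that no other fields are needed is based only on the fields the sample code reads.

```python
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass
class FoundImage:
    index: Any  # kept as returned; the sample code only prints this value
    title: str
    url: str


def parse_image_search_output(raw_output: str) -> List[FoundImage]:
    """Convert the JSON string from an image_search_call item into records."""
    if not raw_output:
        return []
    entries = json.loads(raw_output)
    return [
        FoundImage(
            index=entry.get("index"),
            title=entry.get("title", ""),
            url=entry.get("url", ""),
        )
        for entry in entries
    ]


# Illustrative payload shaped like the fields used in the samples above
sample = json.dumps([
    {"index": 1, "title": "Example image", "url": "https://example.com/1.jpg"},
    {"index": 2, "title": "Another image", "url": "https://example.com/2.jpg"},
])
for image in parse_image_search_output(sample):
    print(image.index, image.title, image.url)
```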

Streaming output

The search by image tool can take a long time to process a request. You can enable streaming output to retrieve intermediate results in real time.

Python

import os
import json
from openai import OpenAI

client = OpenAI(
    # If you have not configured the environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
)

input_content = [
    {"type": "input_text", "text": "Find landscape images with a style similar to this one"},
    # Replace image_url with the actual public URL of the image
    {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
]

stream = client.responses.create(
    model="qwen3.7-plus",
    input=[{"role": "user", "content": input_content}],
    tools=[{"type": "image_search"}],
    stream=True
)

for event in stream:
    # Tool call starts
    if event.type == "response.output_item.added":
        if event.item.type == "image_search_call":
            print("[Tool Call] Searching by image...")
    # Tool call is complete. Parse and display the list of searched images.
    elif event.type == "response.output_item.done":
        if event.item.type == "image_search_call":
            print(f"[Tool Call] Search by image complete (status: {event.item.status})")
            if event.item.output:
                images = json.loads(event.item.output)
                print(f"  Found {len(images)} images:")
                for img in images[:5]:  # Display the first 5 images
                    print(f"  [{img['index']}] {img['title']}")
                    print(f"      {img['url']}")
                if len(images) > 5:
                    print(f"  ... {len(images)} images in total")
    # Model response starts
    elif event.type == "response.content_part.added":
        print("\n[Model Response]")
    # Streamed text output
    elif event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
    # Response is complete. Output the usage.
    elif event.type == "response.completed":
        usage = event.response.usage
        print(f"\n\n[Token Usage] Input: {usage.input_tokens}, Output: {usage.output_tokens}, Total: {usage.total_tokens}")
        if hasattr(usage, 'x_tools') and usage.x_tools:
            for tool_name, info in usage.x_tools.items():
                print(f"[Tool Statistics] {tool_name} call count: {info.get('count', 0)}")

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    // If you have not configured the environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx",
    apiKey: process.env.DASHSCOPE_API_KEY,
    // The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.7-plus",
        input: [
            {
                role: "user",
                content: [
                    { type: "input_text", text: "Find landscape images with a style similar to this one" },
                    // Replace image_url with the actual public URL of the image
                    { type: "input_image", image_url: "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png" }
                ]
            }
        ],
        tools: [{ type: "image_search" }],
        stream: true
    });

    for await (const event of stream) {
        // Tool call starts
        if (event.type === "response.output_item.added") {
            if (event.item.type === "image_search_call") {
                console.log("[Tool Call] Searching by image...");
            }
        }
        // Tool call is complete. Parse and display the list of searched images.
        else if (event.type === "response.output_item.done") {
            if (event.item && event.item.type === "image_search_call") {
                console.log(`[Tool Call] Search by image complete (status: ${event.item.status})`);
                if (event.item.output) {
                    const images = JSON.parse(event.item.output);
                    console.log(`  Found ${images.length} images:`);
                    images.slice(0, 5).forEach(img => {
                        console.log(`  [${img.index}] ${img.title}`);
                        console.log(`      ${img.url}`);
                    });
                    if (images.length > 5) {
                        console.log(`  ... ${images.length} images in total`);
                    }
                }
            }
        }
        // Model response starts
        else if (event.type === "response.content_part.added") {
            console.log(`\n[Model Response]`);
        }
        // Streamed text output
        else if (event.type === "response.output_text.delta") {
            process.stdout.write(event.delta);
        }
        // Response is complete. Output the usage.
        else if (event.type === "response.completed") {
            const usage = event.response.usage;
            console.log(`\n\n[Token Usage] Input: ${usage.input_tokens}, Output: ${usage.output_tokens}, Total: ${usage.total_tokens}`);
            if (usage && usage.x_tools) {
                for (const [toolName, info] of Object.entries(usage.x_tools)) {
                    console.log(`[Tool Statistics] ${toolName} call count: ${info.count || 0}`);
                }
            }
        }
    }
}

main();

curl

# The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": [
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Find landscape images with a style similar to this one"},
                {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
            ]
        }
    ],
    "tools": [
        {"type": "image_search"}
    ],
    "stream": true
}'

After you run the code, you receive a response similar to the following one:

[Tool Call] Searching by image...
[Tool Call] Search by image complete (status: completed)
  Found 3 images:
  [1] QingMing Festival Holiday Notice 2024
      https://www.healthcabin.net/blog/wp-content/uploads/2024/04/QingMing-Festival-Holiday-Notice-2024.jpg
  [2] Serene Asian Landscape Stone Bridge Reflecting in Misty Water
      https://thumbs.dreamstime.com/b/serene-asian-landscape-stone-bridge-reflecting-misty-water-...
  [3] ...

[Model Response]
OK. I have found several landscape images with a similar style. These images all display the style of typical Chinese ink wash or fine-brush paintings...

[Token Usage] Input: 5339, Output: 164, Total: 5503
[Tool Statistics] image_search call count: 1
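When you need the final results rather than live printing, the same event types can be folded into a single return value. The sketch below is one possible approach, not the only one. `collect_stream` is a hypothetical helper, and the fake events are illustrative stand-ins built with SimpleNamespace so that the logic can be run without a network call.

```python
import json
from types import SimpleNamespace


def collect_stream(events):
    """Fold streaming events into (images, text)."""
    images = []
    text_parts = []
    for event in events:
        if event.type == "response.output_item.done":
            item = event.item
            # Only completed image_search_call items carry the image list
            if item.type == "image_search_call" and item.output:
                images.extend(json.loads(item.output))
        elif event.type == "response.output_text.delta":
            # Accumulate the streamed text fragments
            text_parts.append(event.delta)
    return images, "".join(text_parts)


# Fake events standing in for a real stream (illustrative data only)
fake_events = [
    SimpleNamespace(
        type="response.output_item.added",
        item=SimpleNamespace(type="image_search_call"),
    ),
    SimpleNamespace(
        type="response.output_item.done",
        item=SimpleNamespace(
            type="image_search_call",
            status="completed",
            output=json.dumps(
                [{"index": 1, "title": "Example", "url": "https://example.com/1.jpg"}]
            ),
        ),
    ),
    SimpleNamespace(type="response.output_text.delta", delta="Hello, "),
    SimpleNamespace(type="response.output_text.delta", delta="world"),
]

images, text = collect_stream(fake_events)
print(len(images), text)  # prints: 1 Hello, world
```

With a real request, pass the stream object returned by client.responses.create(..., stream=True) in place of fake_events.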

Billing

Billing involves the following aspects:

Model call fees: The results from the image search are appended to the prompt. This increases the number of input tokens for the model. You are charged based on the model's standard rate. For pricing details, see the Model Studio console.
Tool call fees: You are charged per 1,000 tool calls. The fee is CNY 48 per 1,000 calls for Chinese mainland and global deployments, and CNY 58.713905 per 1,000 calls for international deployments.
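As a rough worked example, the tool call fee can be estimated from the call count reported under x_tools. The sketch below assumes that the fee scales linearly with the number of calls; the page does not state whether partial blocks of 1,000 calls are prorated or rounded, so treat the result as an estimate. It covers only the tool call fee, not the model's token fees, and `estimate_tool_fee` is a hypothetical helper.

```python
from decimal import Decimal


def estimate_tool_fee(call_count: int, fee_per_1000_calls: Decimal) -> Decimal:
    """Estimate the tool call fee in CNY, assuming linear proration."""
    if call_count < 0:
        raise ValueError("call_count must not be negative")
    return Decimal(call_count) * fee_per_1000_calls / Decimal(1000)


# 250 calls at CNY 48 per 1,000 calls
print(estimate_tool_fee(250, Decimal("48")))
# 250 calls at the international rate of CNY 58.713905 per 1,000 calls
print(estimate_tool_fee(250, Decimal("58.713905")))
```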

FAQ

Q: What image formats and input methods are supported?

A: For more information, see Image limits and File input methods.

The OpenAI SDK does not support passing local file paths.

Q: How many images can be passed?

A: The number of images that you can pass is limited by the model's maximum input length. The total number of tokens for both the images and text must not exceed the maximum value that the model supports. The model searches only one image per call, but you can make multiple calls to process multiple images.

The model decides the number of images to search.
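One way to process several images is to send one request per image. The sketch below only builds the request bodies, so it runs without network access. `build_requests` is a hypothetical helper, the URLs are placeholders, and sending separate requests is an assumption about how "multiple calls" can be applied in practice.

```python
def build_requests(image_urls, prompt, model="qwen3.7-plus"):
    """Build one Responses API request body per image URL."""
    requests = []
    for url in image_urls:
        requests.append({
            "model": model,
            "input": [{
                "role": "user",
                "content": [
                    {"type": "input_text", "text": prompt},
                    {"type": "input_image", "image_url": url},
                ],
            }],
            "tools": [{"type": "image_search"}],
        })
    return requests


batch = build_requests(
    ["https://example.com/a.png", "https://example.com/b.png"],  # placeholder URLs
    "Find landscape images with a style similar to this one",
)
print(len(batch))
```

Each entry can then be sent with client.responses.create(**request), using the client from the Getting started section.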

Q: How many searched images are returned?

A: The model determines the number of images to return. The quantity is not fixed, but the maximum is 100 images.