In autonomous driving systems, vehicle image analysis is a core use case of the perception stack. It parses visual signals from in-vehicle and external cameras in real time, enabling the system to accurately understand its surroundings and make decisions. By analyzing and searching 100,000 images across diverse driving areas, geographies, environments, and weather conditions, this solution validates a full-stack technique for trajectory analysis, perception optimization, and more accurate pedestrian and vehicle recognition. This enhances the system's adaptability to complex traffic scenes, increases safety, and improves the user experience.
Core capabilities
This best practice for image processing and multimodal search and analysis involves the following core capabilities:
-
Model Studio: Alibaba Cloud Model Studio is a one-stop platform for large model development and application. It integrates Tongyi Qianwen and mainstream third-party models, providing developers with OpenAI-compatible APIs and a full-stack model service suite. In this solution, Hologres calls its models directly.
-
Object Table: Read unstructured data, such as PDFs, images, and PPTs, from Object Storage Service (OSS) in a table format.
-
AI Function: In Hologres, you can use standard SQL to call functions that automatically invoke built-in large models to build AI services.
-
Data processing: Provides Embed and Chunk operators to process unstructured data into structured data for storage. You can automatically generate embeddings without external algorithms.
-
Data search and analysis: Provides operators such as
ai_genandai_summarizeto perform inference, summarization, and translation on data using only SQL.
-
-
Dynamic Table: Supports an incremental refresh mode to automatically process unstructured data. This mode processes only incremental data to reduce redundant computation and lower resource usage.
-
HGraph index (Recommended): Enables vector search using standard SQL for similarity searches and scene recognition on unstructured data. You can freely combine vector and scalar searches in the same query.
-
Full-text inverted index: Enables efficient search of unstructured data through mechanisms such as inverted indexing and tokenization. It supports rich search methods, such as keyword matching and phrase searches, for more flexible retrieval.
Benefits
With these core capabilities, Hologres provides the following key advantages for image search:
-
End-to-end AI data processing: Covers the entire workflow from data embedding and chunking to incremental processing and search and analysis, making building AI applications as easy as working with big data systems.
-
Process image data with standard SQL: You do not need a specialized programming language. Use pure SQL to extract and process image data.
-
A single platform for cross-modal search: Supports both text-to-image search and image-to-image search. Semantic understanding goes beyond keyword limits, enabling cross-modal search within Hologres.
-
More accurate, flexible, and intelligent search: Easily build hybrid search pipelines that combine keyword, semantic, and multimodal search to cover the full range of needs from precise lookups to intent-based discovery. You can also use AI Function for deep user intent understanding, semantic correlation, and contextual reasoning to enable more intelligent searches.
-
Higher security with in-database processing: You do not need to export data to external systems. This solution seamlessly integrates with the various security features of Hologres to efficiently protect your data.
How it works
The workflow for this solution is as follows:
-
Prepare the dataset.
Upload the image data to Object Storage Service (OSS).
-
Process the images.
Use an Object Table to read the image metadata, and then create a Dynamic Table with incremental refresh to generate embeddings. Build a vector index for the Dynamic Table to accelerate subsequent searches.
-
Use the
ai_embedoperator to embed natural language queries, and then use vector search to retrieve the top N results.
Prerequisites
-
Data preparation
This topic uses the
val.zipfile from the publicly available BDD100K autonomous driving image dataset on ModelScope to simulate real driving data from multiple vehicles. -
Environment preparation
-
Obtain an API key from Model Studio. For more information, see the Model Studio platform documentation.
-
You must have a Hologres instance of V4.1 or later.
-
Ensure a Model Studio model is deployed. In the Hologres console, deploy the qwen3-vl-embedding model from Model Studio. For more information, see Model Studio Models.
-
Procedure
-
Download the image data and upload it to OSS.
-
Download the
val.zipfile from the BDD100K autonomous driving image dataset. -
Log on to the OSS console, create a bucket, and upload the downloaded
val.zipfile to the bucket. For more information on uploading files, see Simple upload.NoteUse lowercase for folder names.
-
-
Grant permissions.
-
Log on to the Resource Access Management (RAM) console and create an Alibaba Cloud RAM role with OSS read permissions.
We recommend granting the AliyunOSSReadOnlyAccess permission.
-
Add logon and Hologres access permissions to the RAM role.
-
Alibaba Cloud account (primary account)
Update the RAM role trust policy. Update the following parameters:
-
Action: Set to
sts:AssumeRole. -
Service: Set to
hologres.aliyuncs.com.
{ "Statement": [ { "Action": "sts:AssumeRole", "Effect": "Allow", "Principal": { "RAM": [ "acs:ram::1866xxxx:root" ], "Service": [ "hologres.aliyuncs.com" ] } } ], "Version": "1" } -
-
RAM user (sub-account)
-
Grant permissions to the RAM user.
-
On the page, click Create permission policy and select Script editor mode to create a policy. For details, see Create a custom policy.
Hologres uses this policy to determine whether the RAM user has permission to create the corresponding RAM role. Use the following policy document:
{ "Version": "1", "Statement": [ { "Effect": "Allow", "Action": "hologram:GrantAssumeRole", "Resource": "<RoleARN>" } ] } -
On the page, click Add permissions in the Actions column for the target RAM user to assign the policy created above. For details, see Manage RAM user permissions.
-
-
Grant permissions to the created RAM role.
Update the RAM role trust policy. Update the following parameters:
-
Action: Set to
sts:AssumeRole. -
Service: Set to
hologres.aliyuncs.com.
{ "Statement": [ { "Action": "sts:AssumeRole", "Effect": "Allow", "Principal": { "RAM": [ "acs:ram::1866xxxx:root" ], "Service": [ "hologres.aliyuncs.com" ] } } ], "Version": "1" } -
-
-
-
-
Generate embeddings for the images.
Create an Object Table and a Dynamic Table to read image metadata and generate embeddings for the images. Hologres provides this lengthy process as a stored procedure. This stored procedure includes the following capabilities:
-
Creates an Object Table to read image metadata.
-
Creates an incrementally refreshed Dynamic Table as a result table to store the processed data and creates a vector index. Auto-refresh is disabled for this Dynamic Table, so you must refresh it manually.
-
During the refresh process, this Dynamic Table uses the
ai_embedfunction to generate image embeddings.
To use this stored procedure, run the following statement:
-- Stored procedure to create an Object Table and a Dynamic Table, and use the Dynamic Table to embed images. CALL create_image_table_from_oss( oss_path => 'oss://xxxx/bdd100k/val/images', oss_endpoint => 'oss-cn-hangzhou-internal.aliyuncs.com', oss_role_arn => 'acs:ram::1xxxx:role/xxxx', image_table => 'public.dt_image_bdd100k', embedding_model =>'qwen3_vl_embedding' ); -
-
Refresh the result table.
You must manually refresh the Object Table and Dynamic Table from the preceding step to complete data processing. This step is also encapsulated into a stored procedure. This stored procedure provides the following capabilities:
-
Refreshes the Object Table once to obtain image metadata.
-
Refreshes the Dynamic Table once to generate embeddings for the images.
To use this stored procedure, run the following statement:
-- Refresh the Dynamic Table to generate embeddings for the images. CALL refresh_image_table( image_table => 'public.dt_image_bdd100k' ); -
-
Search the images.
After the image data is processed, you can search using vector search and AI Function.
Text-to-image search
The following SQL statement is an example of a text-to-image search:
-- Text-to-image search SELECT object_uri, approx_cosine_distance (embedding_vector, ai_embed ('qwen3_vl_embedding', 'a red car in the rain')) AS score FROM public.dt_image_bdd100k ORDER BY score DESC LIMIT 1; object_uri | score ---------------------------------------------------------------+------- oss://****/bdd100k/val/images/c11a0015-fc7c****.jpg| 0.559552908 (5 rows)On the OSS object details page, the search result shows a rainy street scene shot through a car window. You can see raindrops, traffic lights, moving vehicles, and a crosswalk. The scene is a strong semantic match for the query text
a red car in the rain. The object metadata shows an expiration time of 300 seconds.Image-to-image search
The following SQL statement is an example of an image-to-image search:
-- Image-to-image search SELECT object_uri, approx_cosine_distance (embedding_vector, ai_embed ('qwen3_vl_embedding', to_file ('oss://xxxx/val/images/b9b53753-91a5d5f8.jpg', 'oss-cn-hangzhou-internal.aliyuncs.com', 'acs:ram::18xxx:role/xxx'))) AS score FROM public.dt_image_bdd100k WHERE object_uri <> 'oss://****-hangzhou/bdd100k/val/images/b9b53753-91a5****.jpg' -- Exclude the image itself. ORDER BY score DESC LIMIT 1; object_uri | score ---------------------------------------------------------------+------ oss://****/bdd100k/val/images/c0e9b7c4-cd8b****.jpg | 0.811107099A comparison shows that the query image (a rainy urban driving scene) and the retrieved image (an urban street scene with a white sedan, a bus, and a taxi) have high visual similarity. This validates the accuracy of the image-to-image search result.
Appendix: Stored procedures
-
Create an Object Table and a Dynamic Table
-- By default, a query returns a maximum of 200 rows. To return more data, modify the LIMIT clause. The maximum is 10,000 rows or 20 MB. CREATE OR REPLACE PROCEDURE create_image_table_from_oss( oss_path TEXT, oss_endpoint TEXT, oss_role_arn TEXT, image_table TEXT, embedding_model TEXT DEFAULT NULL, overwrite BOOLEAN DEFAULT FALSE ) AS $$ DECLARE image_schema_name TEXT; image_table_name TEXT; obj_table_name TEXT; full_image_table_ident TEXT; full_obj_ident TEXT; embed_expr TEXT; create_sql TEXT; embedding_dims INT; BEGIN -- 1. Split the schema name and table name. IF position('.' in image_table) > 0 THEN image_schema_name := split_part(image_table, '.', 1); image_table_name := split_part(image_table, '.', 2); ELSE image_schema_name := 'public'; image_table_name := image_table; END IF; obj_table_name := image_table_name || '_obj_table'; full_image_table_ident := format('%I.%I', image_schema_name, image_table_name); full_obj_ident := format('%I.%I', image_schema_name, obj_table_name); -- 2. If overwrite is true, drop the tables and indexes first. IF overwrite THEN DECLARE dyn_table_exists BOOLEAN; rec RECORD; BEGIN -- Check whether the dynamic table exists. SELECT EXISTS ( SELECT 1 FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace WHERE c.relname = image_table_name AND n.nspname = image_schema_name ) INTO dyn_table_exists; IF dyn_table_exists THEN -- 2.1 Disable auto-refresh for the dynamic table. -- RAISE NOTICE 'Disabling auto refresh for %', full_image_table_ident; -- EXECUTE format('ALTER TABLE IF EXISTS %s SET (auto_refresh_enable=false)', full_image_table_ident); -- 2.2 Find and cancel the RUNNING refresh job. FOR rec IN EXECUTE format( $f$ SELECT query_job_id FROM hologres.hg_dynamic_table_refresh_log(%L) WHERE status = 'RUNNING'; $f$, image_table ) LOOP RAISE NOTICE 'Found running refresh job: %', rec.query_job_id; IF hologres.hg_internal_cancel_query_job(rec.query_job_id::bigint) THEN RAISE NOTICE 'Cancel job % succeeded.', rec.query_job_id; ELSE RAISE WARNING 'Cancel job % failed.', rec.query_job_id; END IF; END LOOP; -- 2.3 Drop the Dynamic Table. EXECUTE format('DROP TABLE IF EXISTS %s;', full_image_table_ident); ELSE RAISE NOTICE 'Dynamic table % does not exist, skip cancel job and drop.', full_image_table_ident; END IF; -- 2.4 Drop the Object Table. EXECUTE format('DROP OBJECT TABLE IF EXISTS %s;', full_obj_ident); END; END IF; -- 3. Create the Object Table. RAISE NOTICE 'Create object table: %', obj_table_name; EXECUTE format( $f$ CREATE OBJECT TABLE %s WITH ( path = %L, oss_endpoint = %L, role_arn = %L ); $f$, full_obj_ident, oss_path, oss_endpoint, oss_role_arn ); COMMIT; -- 4. Refresh the Object Table. RAISE NOTICE 'Refresh object table: %', obj_table_name; EXECUTE format('REFRESH OBJECT TABLE %s;', full_obj_ident); COMMIT; -- 5. Select an embedding model. IF embedding_model IS NULL OR length(trim(embedding_model)) = 0 THEN embed_expr := 'ai_embed(file)'; EXECUTE 'SELECT array_length(ai_embed(''dummy''), 1)' INTO embedding_dims; ELSE embed_expr := format('ai_embed(%L, file)', embedding_model); EXECUTE format( 'SELECT array_length(ai_embed(%L, ''dummy''), 1)', embedding_model ) INTO embedding_dims; END IF; RAISE NOTICE 'embedding dimension is: %', embedding_dims; -- 6. Create the Dynamic Table for RAG output. RAISE NOTICE 'create dynamic table: %', image_table_name; EXECUTE format( $f$ CREATE DYNAMIC TABLE %s( CHECK(array_ndims(embedding_vector) = 1 AND array_length(embedding_vector, 1) = %s) ) WITH ( vectors = '{ "embedding_vector": { "algorithm": "HGraph", "distance_method": "Cosine", "builder_params": { "base_quantization_type": "sq8_uniform", "max_degree": 64, "ef_construction": 400, "precise_quantization_type": "fp32", "use_reorder": true } } }', auto_refresh_mode = 'incremental', freshness = '5 minutes', auto_refresh_enable = 'false' ) AS SELECT object_uri, etag, %s AS embedding_vector FROM %s; $f$, full_image_table_ident, embedding_dims, embed_expr, obj_table_name ); COMMIT; RAISE NOTICE ''; RAISE NOTICE 'Create image table success: %', image_table; RAISE NOTICE ' Vector index is: %.embedding_vector', image_table; END; $$ LANGUAGE plpgsql; -
Refresh the Object Table and Dynamic Table
CREATE OR REPLACE PROCEDURE refresh_image_table( image_table TEXT ) AS $$ DECLARE image_schema_name TEXT; image_table_name TEXT; obj_table_name TEXT; full_image_table_ident TEXT; full_obj_ident TEXT; BEGIN -- 1. Parse the schema and table name. IF position('.' in image_table) > 0 THEN image_schema_name := split_part(image_table, '.', 1); image_table_name := split_part(image_table, '.', 2); ELSE image_schema_name := 'public'; image_table_name := image_table; END IF; obj_table_name := image_table_name || '_obj_table'; full_image_table_ident := format('%I.%I', image_schema_name, image_table_name); full_obj_ident := format('%I.%I', image_schema_name, obj_table_name); -- 2. Refresh the Object Table. RAISE NOTICE 'Refreshing Object Table: %', obj_table_name; EXECUTE format('REFRESH OBJECT TABLE %s;', full_obj_ident); -- 3. Refresh the Dynamic Table. RAISE NOTICE 'Refreshing Dynamic Table: %', image_table_name; EXECUTE format('REFRESH TABLE %s;', full_image_table_ident); RAISE NOTICE 'Refresh image table complete: %', image_table; END; $$ LANGUAGE plpgsql; -
Procedure for deleting tables
CREATE OR REPLACE PROCEDURE drop_image_table( image_table TEXT ) AS $$ DECLARE image_schema_name TEXT; image_table_name TEXT; obj_table_name TEXT; full_image_table_ident TEXT; full_obj_ident TEXT; rec RECORD; BEGIN -- 1. Parse the schema and table name. IF position('.' in image_table) > 0 THEN image_schema_name := split_part(image_table, '.', 1); image_table_name := split_part(image_table, '.', 2); ELSE image_schema_name := 'public'; image_table_name := image_table; END IF; obj_table_name := image_table_name || '_obj_table'; full_image_table_ident := format('%I.%I', image_schema_name, image_table_name); full_obj_ident := format('%I.%I', image_schema_name, obj_table_name); -- 2. Delete the table. -- 2.1 Disable auto-refresh for the dynamic table. -- RAISE NOTICE 'Disabling auto refresh for %', full_image_table_ident; -- EXECUTE format('ALTER TABLE IF EXISTS %s SET (auto_refresh_enable=false)', full_image_table_ident); -- 2.2 Find and cancel the RUNNING refresh job. FOR rec IN EXECUTE format( $f$ SELECT query_job_id FROM hologres.hg_dynamic_table_refresh_log(%L) WHERE status = 'RUNNING'; $f$, image_table ) LOOP RAISE NOTICE 'Found running refresh job: %', rec.query_job_id; IF hologres.hg_internal_cancel_query_job(rec.query_job_id::bigint) THEN RAISE NOTICE 'Cancel job % succeeded.', rec.query_job_id; ELSE RAISE WARNING 'Cancel job % failed.', rec.query_job_id; END IF; END LOOP; -- 2.3 Drop the Dynamic Table. RAISE NOTICE 'Dropping Dynamic Table: %', image_table_name; EXECUTE format('DROP TABLE IF EXISTS %s;', full_image_table_ident); -- 2.4 Drop the Object Table. RAISE NOTICE 'Dropping Object Table: %', obj_table_name; EXECUTE format('DROP OBJECT TABLE IF EXISTS %s;', full_obj_ident); RAISE NOTICE 'Drop image table complete: %', image_table; END; $$ LANGUAGE plpgsql;