Named Entity Recognition (NER) identifies and categorizes meaningful spans in a user's search query—such as brands, locations, products, and people—so OpenSearch Industry Algorithm Edition can interpret semantic intent and return more relevant results. Instead of treating a query as a flat list of keywords, NER adds structured labels that improve both recall and ranking.
How it works
When a search query arrives, the query analysis pipeline runs NER before token matching. The NER model scans the raw query text, assigns each recognized span an entity category and a confidence score, and passes the structured output to downstream stages. Synonym configuration and relevance scoring then apply category-specific logic based on those labels.
For example, given the query "Nike running shoes Shanghai", NER identifies Nike as a Brand entity and Shanghai as a Location entity. The search engine can boost brand-matched results or filter by delivery region accordingly.
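As a rough illustration of how a downstream stage might act on these labels, the sketch below maps recognized entities to boost and filter hints. The `Entity` class, the `build_search_hints` function, the hint keys, and the confidence value for Shanghai are all hypothetical and are not part of the OpenSearch API; the product applies its own category-specific logic internally.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for one recognized entity."""
    text: str
    category: str
    confidence: float


def build_search_hints(entities):
    """Group entity texts into illustrative boost/filter hints by category."""
    hints = {"boost_brands": [], "filter_regions": []}
    for entity in entities:
        if entity.category == "Brand":
            # A brand match could be used to boost brand-matched results.
            hints["boost_brands"].append(entity.text)
        elif entity.category == "Location":
            # A location match could be used to filter by delivery region.
            hints["filter_regions"].append(entity.text)
    return hints


# Entities for the query "Nike running shoes Shanghai".
# The Shanghai confidence value is made up for illustration.
entities = [
    Entity("Nike", "Brand", 0.97),
    Entity("Shanghai", "Location", 0.93),
]
hints = build_search_hints(entities)
print(hints)
```

Entities in categories other than Brand and Location are ignored by this sketch; a real pipeline would decide per category how to use them.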
Supported entity categories
NER in OpenSearch Industry Algorithm Edition recognizes entities relevant to e-commerce and content search scenarios.
| Entity category | Description | Example |
|---|---|---|
| Brand | Manufacturer or brand name | Nike, Apple |
| Product | Specific product name or model | iPhone 15, Air Max 90 |
| Person | Name of an individual | Tom Hanks |
| Location | Geographic place or region | Shanghai, California |
| Organization | Company, institution, or group | Alibaba Group |
| Time | Date, time, or time range | this weekend, 2024 |
| Price | Monetary value or range | under 500, ¥200 |
The entity categories available to your application depend on your OpenSearch Industry Algorithm Edition configuration and the semantic model deployed in your instance. Contact your account team to confirm which categories are enabled.
Enable Named Entity Recognition
NER is part of the query analysis pipeline. Enable and configure it through the OpenSearch console when setting up your application's query analysis settings.
Prerequisites
Before you begin, ensure that you have:
An OpenSearch Industry Algorithm Edition application
Query analysis enabled for your application
A semantic model deployed that supports NER
Enable NER in query analysis
Log in to the OpenSearch console.
Navigate to your application and open the Query Analysis settings.
Enable Named Entity Recognition.
Select the entity categories relevant to your use case.
Set the minimum confidence threshold to filter low-confidence entity matches.
Save and publish your configuration.
After publishing, NER applies to all new queries. Existing queries are not retroactively reprocessed.
Entity output structure
Each entity recognized in a query produces a structured result with the following fields.
| Field | Type | Description |
|---|---|---|
| text | string | The matched span from the original query |
| category | string | The entity category (for example, Brand) |
| confidence | float | Confidence score between 0 and 1 |
| offset | integer | Start position of the matched span in the query string |
| length | integer | Character length of the matched span |
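One way to consume this structure on the client side is sketched below. The `Entity` dataclass and the `parse_entities` and `span_matches` helpers are hypothetical names, not part of any OpenSearch SDK. The sketch also assumes that `offset` is zero-based and that both `offset` and `length` count characters the same way Python string indexing does; confirm this against real responses, especially for non-ASCII queries.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Mirrors the five documented fields of one entity result."""
    text: str
    category: str
    confidence: float
    offset: int
    length: int


def parse_entities(payload: dict) -> list[Entity]:
    """Build Entity objects from a response shaped like {"entities": [...]}."""
    return [Entity(**item) for item in payload.get("entities", [])]


def span_matches(query: str, entity: Entity) -> bool:
    """Check that offset and length select the entity text in the query.

    Assumes a zero-based, character-counted offset.
    """
    return query[entity.offset:entity.offset + entity.length] == entity.text


# Payload values mirror the Nike example in this section.
query = "Nike running shoes"
payload = {
    "entities": [
        {"text": "Nike", "category": "Brand", "confidence": 0.97,
         "offset": 0, "length": 4}
    ]
}
entities = parse_entities(payload)
print(entities[0].category, span_matches(query, entities[0]))
```

A span check like this is a cheap way to catch offset mismatches before using the positions for highlighting or query rewriting.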
Example: For the query "Nike running shoes", NER returns:
{
  "entities": [
    {
      "text": "Nike",
      "category": "Brand",
      "confidence": 0.97,
      "offset": 0,
      "length": 4
    }
  ]
}
Confidence scores and thresholds
The confidence score reflects how certain the model is about an entity match. A score close to 1.0 indicates high confidence; a score below 0.5 suggests the match may be ambiguous.
Set the minimum confidence threshold based on the tradeoff between precision and recall:
| Threshold | Effect |
|---|---|
| High (for example, 0.8) | Fewer entities recognized; higher precision, lower recall |
| Low (for example, 0.4) | More entities recognized; higher recall, more noise |
Start with a threshold of 0.6 and adjust based on search quality metrics from your application.
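To get a feel for the tradeoff before changing the console setting, the effect of different thresholds can be simulated offline on logged entity results. The sketch below is a minimal illustration: `filter_by_confidence` is a hypothetical helper, the sample scores are invented, and treating the threshold as inclusive (`>=`) is an assumption, since the comparison used by the service is not specified here.

```python
def filter_by_confidence(entities, threshold=0.6):
    """Keep entities whose confidence is at or above the threshold.

    The inclusive comparison is an assumption made for this sketch.
    """
    return [e for e in entities if e["confidence"] >= threshold]


# Invented sample results for illustration only.
sample = [
    {"text": "Nike", "category": "Brand", "confidence": 0.97},
    {"text": "Shanghai", "category": "Location", "confidence": 0.72},
    {"text": "running", "category": "Product", "confidence": 0.45},
]

# Compare the low, suggested starting, and high thresholds.
for threshold in (0.4, 0.6, 0.8):
    kept = [e["text"] for e in filter_by_confidence(sample, threshold)]
    print(threshold, kept)
```

With these sample values, raising the threshold drops the ambiguous "running" match first and then "Shanghai", which is the precision-for-recall trade described in the table.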
Limitations
NER performance depends on the quality and coverage of the deployed semantic model. Queries that contain rare terms, typos, or mixed-language input may produce lower confidence scores or no entity matches.
NER processes the query text as submitted. It does not correct spelling errors before entity detection.
Entity categories not included in your semantic model configuration are not recognized, even if the query contains matching text.