Visual search & ranking

The following parameters are supported by the /select, /image, /duplicate and /color endpoints.

Ranking mode

The rank.mode parameter controls which similarity algorithm is used for ranking. Supported values are content, color and duplicate.

rank.mode=[content | color | duplicate]

Setting rank.mode requires rank.by.* to be set as well.
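
As an illustration, a ranked /select query could be assembled as follows. This is a minimal sketch: the base URL and the helper name build_select_url are assumptions made for the example and are not part of the API.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace it with the address of your own search service.
BASE_URL = "https://search.example.com/api"

SUPPORTED_MODES = {"content", "color", "duplicate"}


def build_select_url(mode: str, **rank_by: str) -> str:
    """Build a /select URL that sets rank.mode together with rank.by.* inputs."""
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported rank.mode: {mode}")
    if not rank_by:
        # rank.mode requires a rank.by.* parameter to be set as well.
        raise ValueError("rank.mode requires at least one rank.by.* parameter")
    params = {"rank.mode": mode}
    params.update({f"rank.by.{name}": value for name, value in rank_by.items()})
    return f"{BASE_URL}/select?{urlencode(params)}"


print(build_select_url("content", id="doc-123"))
# https://search.example.com/api/select?rank.mode=content&rank.by.id=doc-123
```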

Rank by image content

Set rank.mode=content to find images with the same objects as in the input image or, if this is not possible, with similar high-level concepts. It uses our general-purpose AI model, which is designed to work with arbitrary image collections out of the box.

Example results using content search mode.

Using the /image endpoint is the same as using /select with rank.mode=content.

Rank by colors

rank.mode=color uses the dominant colors of the input image to find images with a similar color palette. Accent colors are emphasized, making it possible to find images that contain little of an accent color. This leads to greater variability of search results.

Using the /color endpoint is the same as using /select with rank.mode=color.

Rank by duplicate

rank.mode=duplicate uses an AI model specialized in identifying duplicate images, even when they have undergone various modifications. These modifications include changes to saturation and color, alterations in size, and conversion to a different image format.

The model can also identify rotated images, blurred images, cropped images, and images with added text. Furthermore, the model is able to recognize images with JPEG compression artifacts and partial copies, where a part of an image is included in a composite image.

The AI model is trained to clearly separate duplicate results from non-duplicate results, so that a single rank.threshold value separates the two groups as consistently and reliably as possible.

Example results searching for duplicates.

Note: the duplicate ranking mode is built for finding various versions of the same image. If you intend to find identical objects, use rank.mode=content instead.

Using the /duplicate endpoint is the same as using /select with rank.mode=duplicate and rank.threshold=0.7.
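
The endpoint shorthands described in this section can be summarized as a lookup table. The sketch below is illustrative only: the helper name to_select_params is hypothetical, and letting explicitly passed parameters take precedence over the implied ones is an assumption, not documented behavior.

```python
# Shorthand endpoints and the /select parameters they imply, as described above.
ENDPOINT_DEFAULTS = {
    "/image": {"rank.mode": "content"},
    "/color": {"rank.mode": "color"},
    "/duplicate": {"rank.mode": "duplicate", "rank.threshold": "0.7"},
}


def to_select_params(endpoint: str, params: dict) -> dict:
    """Return the parameters of the equivalent /select request."""
    if endpoint not in ENDPOINT_DEFAULTS:
        raise ValueError(f"no /select equivalent known for {endpoint}")
    # Assumption: parameters passed by the caller override the implied ones.
    return {**ENDPOINT_DEFAULTS[endpoint], **params}


print(to_select_params("/duplicate", {"rank.by.id": "doc-123"}))
# {'rank.mode': 'duplicate', 'rank.threshold': '0.7', 'rank.by.id': 'doc-123'}
```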

Ranking input

The rank.by.* parameters define the search input used for ranking the indexed images. The various input types are explained below.

Search by image ID

Use an indexed image as search input to find similar images in your collection.

rank.by.id=[doc-id]

If the given doc-id does not exist in the index, an empty result is returned.

Search by image URL

Search by an image URL for similar images in your collection.

rank.by.url=[image-url]

Supported protocols are HTTP, HTTPS and FILE. Supported image types are jpg, png and gif.

Use percent-encoded URLs

When sending URLs as part of GET parameters it is necessary to properly percent-encode them to ensure correct parsing.
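
A short sketch of percent-encoding with the Python standard library. The image URL is a made-up example; only rank.mode and rank.by.url come from this documentation.

```python
from urllib.parse import parse_qs, quote, urlencode

# Made-up image URL that carries its own query string.
image_url = "https://images.example.com/photos/car.jpg?size=large&v=2"

# quote() with safe="" encodes every reserved character, including ":", "/", "?" and "&",
# so the image URL's own query string cannot be confused with the search parameters.
encoded = quote(image_url, safe="")
print(f"rank.by.url={encoded}")

# urlencode() percent-encodes each value of a parameter dict in one step.
query = urlencode({"rank.mode": "content", "rank.by.url": image_url})
print(query)

# Decoding the query string yields the original URL again.
print(parse_qs(query)["rank.by.url"][0] == image_url)
```

Without encoding, the `&v=2` part of the image URL would be parsed as a separate GET parameter of the search request.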

Search by image upload

Upload a local image as search input. The image file is sent as a Base64-encoded string (also known as a Data URI).

rank.by.data=[base64-encoded-file]

Use a POST request to be able to upload larger files without hitting URI length limits. Send the rank.by.data parameter in the POST body using Content-Type: application/x-www-form-urlencoded. Supported image types are jpg, png and gif.
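
The following sketch prepares such a POST request with the Python standard library without sending it. The base URL and the helper name build_upload_request are assumptions. The example sends the plain Base64 string; whether a `data:` prefix is expected is not stated here, so check this against your service.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; replace it with the address of your own search service.
BASE_URL = "https://search.example.com/api"


def build_upload_request(image_bytes: bytes, mode: str = "content") -> Request:
    """Prepare a form-encoded POST request that carries the image in rank.by.data."""
    encoded_image = base64.b64encode(image_bytes).decode("ascii")
    # urlencode() percent-encodes the "+", "/" and "=" characters of the Base64 string.
    body = urlencode({"rank.mode": mode, "rank.by.data": encoded_image}).encode("ascii")
    return Request(
        f"{BASE_URL}/select",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder bytes stand in for the contents of a real image file.
request = build_upload_request(b"\x89PNG placeholder bytes")
print(request.get_method(), request.full_url)
# POST https://search.example.com/api/select
```

To send the request, pass it to urllib.request.urlopen() once BASE_URL points at a real service.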

Avoid log pollution

By default, all query parameters are logged (including the Base64-encoded file). To avoid log pollution, set logParamsList=q,fq to log only the listed query parameters.

Search by color

Search by one or more colors and retrieve images with similar dominant colors.
This search input type requires rank.mode=color.

rank.by.hex=[0xRRGGBB, ...]

Colors are hexadecimal-encoded (0xRRGGBB, e.g. 0xFF0000 for red). You may query with multiple comma-separated colors. Each color is weighted equally.
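
A minimal sketch of how a rank.by.hex value could be built from RGB triples. The helper names are hypothetical, and joining the colors with a plain comma (no space) is an assumption.

```python
def to_hex_color(red: int, green: int, blue: int) -> str:
    """Format one RGB triple in the 0xRRGGBB notation."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel value out of range: {channel}")
    return f"0x{red:02X}{green:02X}{blue:02X}"


def build_hex_param(colors: list) -> str:
    """Join several colors into one comma-separated rank.by.hex value."""
    if not colors:
        raise ValueError("at least one color is required")
    return ",".join(to_hex_color(*color) for color in colors)


print(build_hex_param([(255, 0, 0), (0, 128, 255)]))
# 0xFF0000,0x0080FF
```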

Relevance threshold

Only return results whose visual score is equal to or higher than a threshold. The default is rank.threshold=0 (no threshold).

rank.threshold=[0-1]

A visual search returns the best-matching images even if the index contains no relevant images. With rank.threshold, all images with a visual score lower than the threshold are discarded and do not count as a match.

Approximate search

Smartfilter

Speed up visual search by filtering out images that are likely to be irrelevant to the visual search query.

rank.smartfilter=[off | low | medium | high | ultra | extreme]

Searches are done in a two-step process:

1. Collect relevant documents (number of found docs).
2. Rank these documents.

Ranking in visual searches is a compute intensive task and query time strongly depends on the number of documents that need to be scored.

The Smartfilter ignores images that are likely to be irrelevant to the visual search query, thus reducing the number of documents that have to be scored. The filter level lets you trade performance gain against quality loss.

Read more about exact vs. approximate search.

Approximate ranking

The rank.approximate parameter enables faster searches in large datasets by applying approximate filtering and approximate scoring techniques.

rank.approximate=[true | false]

If rank.approximate=true, then rank.smartfilter must be activated as well, since it is part of the approximate search process. Read more about exact vs. approximate search.

Our search recommendation

Use exact search for up to 200,000 docs (100% accuracy)
Use rank.approximate=true with rank.smartfilter=high for 200,000 docs or more (93% accuracy)
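
The recommendation can be expressed as a small helper that picks the parameters from the collection size. This is a sketch under two assumptions: "exact search" is taken to mean rank.approximate=false with rank.smartfilter=off, and a collection of exactly 200,000 docs is treated as large. The function name is hypothetical.

```python
# Collection size from which approximate search is recommended.
APPROXIMATE_SEARCH_FROM = 200_000


def recommended_rank_params(num_docs: int) -> dict:
    """Pick search parameters for a collection of the given size."""
    if num_docs < 0:
        raise ValueError("num_docs must not be negative")
    if num_docs < APPROXIMATE_SEARCH_FROM:
        # Assumption: exact search means both approximation features are disabled.
        return {"rank.approximate": "false", "rank.smartfilter": "off"}
    # rank.approximate=true requires the Smartfilter to be active as well.
    return {"rank.approximate": "true", "rank.smartfilter": "high"}


print(recommended_rank_params(50_000))
print(recommended_rank_params(1_500_000))
```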

rank.threshold is applied on approximate score

The rank.threshold is applied to the approximate score, before the exact score is calculated. Because the exact score can differ from the approximate score, some returned results may have an exact score below the threshold, and some images may be filtered out even though their exact score would be above the threshold.
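
The effect can be reproduced with a toy example. All scores below are made up for illustration, and the code only mimics the order of the two steps; it is not the service's implementation.

```python
THRESHOLD = 0.7

# (doc id, approximate score, exact score); all values are invented.
candidates = [
    ("a", 0.90, 0.88),
    ("b", 0.72, 0.66),  # passes the filter, but its exact score is below the threshold
    ("c", 0.68, 0.74),  # filtered out, although its exact score is above the threshold
    ("d", 0.30, 0.25),
]

# Step 1: the threshold is applied to the approximate score.
kept = [(doc_id, exact) for doc_id, approx, exact in candidates if approx >= THRESHOLD]

# Step 2: the remaining documents are ordered by their exact score.
kept.sort(key=lambda item: item[1], reverse=True)

print(kept)
# [('a', 0.88), ('b', 0.66)]
```

Document "b" is returned with a final score below 0.7, while document "c" is missing from the result.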

Weight textual and visual relevance

Control the weight of textual and visual relevance in the resulting score when doing visual searches. The default is 1.0 (visual relevance only).

rank.weighting=[0-1]

When doing visual searches, two scores are calculated internally and merged into one resulting score: a textual score based on the textual search parameters (such as keywords) and a visual score based on the rank.by.* parameter.

With rank.weighting you control how the two scores are merged into the final score, that is, whether to focus on textual or visual relevance. The following table displays the effect of different values:

Parameter value	Description
`0.0`	Textual scoring only
`0.1` - `0.9`	Textual and visual scores are combined according to the specific weighting
`1.0` (default)	Visual scoring only
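
The exact merge formula is not specified here. One simple model that is consistent with the table is a linear interpolation between the two scores. The sketch below assumes that model; it is not the documented implementation, and the function name is hypothetical.

```python
def merge_scores(textual: float, visual: float, weighting: float = 1.0) -> float:
    """Blend a textual and a visual score (assumed linear interpolation)."""
    if not 0.0 <= weighting <= 1.0:
        raise ValueError("rank.weighting must be between 0 and 1")
    # weighting=0.0 keeps only the textual score, weighting=1.0 only the visual score.
    return weighting * visual + (1.0 - weighting) * textual


for weighting in (0.0, 0.5, 1.0):
    print(weighting, round(merge_scores(textual=0.4, visual=0.8, weighting=weighting), 3))
```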