Visual search & ranking
The following parameters are supported by the /select, /image, /duplicate and /color endpoints.
Ranking mode
The rank.mode parameter controls which similarity algorithm is used for ranking.
Supported values are content, color and duplicate.
rank.mode=[content | color | duplicate]
Setting rank.mode requires rank.by.* to be set as well.
Rank by image content
Set rank.mode=content to find images with the same objects as in the input image or, if this is not possible, with similar high-level concepts. It uses our general-purpose AI model, which is designed to work with arbitrary image collections out of the box.
Rank by colors
rank.mode=color uses the dominant colors of the input image to find images with a similar color palette.
Accent colors are emphasized, making it possible to find images that contain only a small amount of an accent color.
This leads to greater variability of search results.
Rank by duplicate
rank.mode=duplicate uses an AI model specialized in identifying duplicate images, even when they have undergone various modifications.
These modifications include changes to saturation and color, alterations in size, and conversion to a different image format.
The model can also identify rotated images, blurred images, cropped images, and images with added text. Furthermore, the model is able to recognize images with JPEG compression artifacts and partial copies, where a part of an image is included in a composite image.
The AI model is trained to clearly separate duplicate results from non-duplicate results, so that a single rank.threshold value can be applied consistently and reliably.
Note that the duplicate ranking mode is built for finding various versions of the same image. If you intend to find identical objects, use rank.mode=content instead.
Using the /duplicate endpoint is the same as using /select with rank.mode=duplicate and rank.threshold=0.7.
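As a rough illustration of this equivalence, the sketch below builds both query URLs for the same input image. The host name in BASE_URL and the doc ID are hypothetical placeholders, and searching by rank.by.id is just one possible input type chosen for the example.

```python
from urllib.parse import urlencode

# Hypothetical server address; replace with your own deployment.
BASE_URL = "https://search.example.com"


def duplicate_endpoint_url(doc_id: str) -> str:
    """Query the /duplicate endpoint, relying on its built-in mode and threshold."""
    return f"{BASE_URL}/duplicate?" + urlencode({"rank.by.id": doc_id})


def select_equivalent_url(doc_id: str) -> str:
    """Query /select with the settings that /duplicate is documented to imply."""
    params = {
        "rank.by.id": doc_id,
        "rank.mode": "duplicate",
        "rank.threshold": "0.7",
    }
    return f"{BASE_URL}/select?" + urlencode(params)


print(duplicate_endpoint_url("img-123"))
print(select_equivalent_url("img-123"))
```

Both URLs should describe the same search; the /select form is useful when a different threshold is needed.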
Ranking input
The rank.by.* parameters define the search input used for ranking the indexed images. The various input types are explained below.
Search by image ID
Use an already indexed image as search input to find similar images in your collection.
rank.by.id=[doc-id]
If the given doc-id does not exist in the index, an empty result is returned.
Search by image URL
Search by an image URL for similar images in your collection.
rank.by.url=[image-url]
Supported protocols are HTTP, HTTPS and FILE. Supported image types are jpg, png and gif.
Use percent-encoded URLs
When sending URLs as part of GET parameters, it is necessary to properly percent-encode them to ensure correct parsing.
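A minimal sketch of percent-encoding with Python's standard library follows. The BASE_URL host and the example image URL are hypothetical; only the rank.mode and rank.by.url parameter names come from this page.

```python
from urllib.parse import urlencode

# Hypothetical server address; replace with your own deployment.
BASE_URL = "https://search.example.com/select"


def build_url_search(image_url: str, mode: str = "content") -> str:
    """Return a GET URL whose rank.by.url value is percent-encoded."""
    params = {"rank.mode": mode, "rank.by.url": image_url}
    # urlencode escapes reserved characters such as ':', '/', '?', '=' and '&',
    # so the image URL's own query string cannot be mistaken for extra parameters.
    return BASE_URL + "?" + urlencode(params)


query = build_url_search("https://example.com/images/cat.jpg?size=large&v=2")
print(query)
```

Without encoding, the `&v=2` part of the image URL would be parsed as a separate request parameter.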
Search by image upload
Upload a local image as search input. The image file is sent as a Base64-encoded string (also known as Data-URI).
rank.by.data=[base64-encoded-file]
Use a POST request to be able to upload larger files without hitting URI length limits. Send the rank.by.data parameter in the POST body using Content-Type: application/x-www-form-urlencoded.
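The following sketch shows one way to prepare such a POST request with Python's standard library. The server URL is a hypothetical placeholder, and the example assumes the parameter value is the plain Base64 string; if your server expects a full data URI prefix, the value would need to be adjusted. The request is only constructed here, not sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical server address; replace with your own deployment.
SELECT_URL = "https://search.example.com/select"


def build_upload_body(image_bytes: bytes, mode: str = "content") -> bytes:
    """Encode the image as Base64 and wrap it in a form-urlencoded body."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # urlencode escapes '+', '/' and '=' from the Base64 alphabet.
    return urlencode({"rank.mode": mode, "rank.by.data": encoded}).encode("ascii")


def build_upload_request(image_bytes: bytes) -> Request:
    """Create (but do not send) the POST request carrying the image."""
    return Request(
        SELECT_URL,
        data=build_upload_body(image_bytes),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder bytes standing in for the contents of a real image file.
request = build_upload_request(b"\x89PNG\r\n\x1a\n...image data...")
print(request.get_method(), len(request.data), "bytes in body")
```

Passing the prepared request to `urllib.request.urlopen` would send it.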
Supported image types are jpg, png and gif.
Avoid log pollution
By default, all query parameters are logged (including the Base64-encoded file).
To avoid log pollution, set logParamsList=q,fq so that only the listed query parameters are logged.
Search by color
Search by one or more colors and retrieve images with similar dominant colors.
This search input type requires rank.mode=color.
rank.by.hex=[0xRRGGBB, ...]
Colors are hexadecimal encoded (0xRRGGBB, e.g. 0xFF0000 for red). You may query with multiple comma-separated colors.
Each color is weighted equally.
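As a small illustration of the 0xRRGGBB format, the sketch below converts RGB tuples into a comma-separated rank.by.hex value. The helper name to_hex_param is hypothetical and not part of the API.

```python
def to_hex_param(colors):
    """Format (r, g, b) tuples as a comma-separated list of 0xRRGGBB values."""
    parts = []
    for r, g, b in colors:
        for channel in (r, g, b):
            if not 0 <= channel <= 255:
                raise ValueError(f"channel out of range 0-255: {channel}")
        # Two uppercase hex digits per channel, e.g. 255 -> FF, 0 -> 00.
        parts.append(f"0x{r:02X}{g:02X}{b:02X}")
    return ",".join(parts)


hex_value = to_hex_param([(255, 0, 0), (0, 128, 255)])
print("rank.by.hex=" + hex_value)  # rank.by.hex=0xFF0000,0x0080FF
```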
Relevance threshold
Only return results with a visual score equal to or higher than a threshold.
The default is rank.threshold=0 (no threshold).
rank.threshold=[0-1]
When doing a visual search, the best matching images are returned, even if there are no relevant images in the index. With rank.threshold, all images with a visual score lower than the threshold are discarded and do not count as a match.
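The effect of the threshold can be pictured with a small client-side sketch. This only illustrates the documented semantics on made-up scores; the actual filtering happens on the server, and the function name apply_threshold is hypothetical.

```python
def apply_threshold(scored_docs, threshold=0.0):
    """Keep only (doc_id, visual_score) pairs scoring at or above the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [(doc_id, score) for doc_id, score in scored_docs if score >= threshold]


# Made-up scores for illustration only.
toy_results = [("a", 0.9), ("b", 0.7), ("c", 0.4)]

print(apply_threshold(toy_results))        # default 0: nothing is discarded
print(apply_threshold(toy_results, 0.7))   # "c" is discarded, "b" is kept
```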
Approximate search
Smartfilter
Speed up visual search by filtering out images that are likely to be irrelevant to the visual search query.
rank.smartfilter=[off | low | medium | high | ultra | extreme]
Searches are done in a two-step process:
- collect relevant documents (number of found docs) and
- rank these documents.
Ranking in visual searches is a compute-intensive task, and query time strongly depends on the number of documents that need to be scored.
The Smartfilter ignores images that are likely to be irrelevant to the visual search query, thus reducing the number of documents that have to be scored. The filter level lets you trade performance gain against quality loss.
Read more about exact vs. approximate search.
Approximate ranking
The rank.approximate parameter enables faster searches in large datasets by applying approximate filtering and approximate scoring techniques.
rank.approximate=[true | false]
If rank.approximate=true, then rank.smartfilter must be activated as well, since it is part of the approximate search process. Read more about exact vs. approximate search.
Our search recommendation
- Use exact search for up to 200,000 docs (100% accuracy)
- Use rank.approximate=true with rank.smartfilter=high from 200,000 docs and more (93% accuracy)
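The recommendation above could be applied in client code roughly as follows. This is a sketch under two assumptions: exact search is expressed here as rank.approximate=false with rank.smartfilter=off, and an index of exactly 200,000 docs is treated as exact, since both bullets cover that boundary. The helper name recommended_params is hypothetical.

```python
# Boundary taken from the recommendation; which side 200,000 itself falls on
# is not specified, so treating it as "exact" is an assumption.
EXACT_LIMIT = 200_000


def recommended_params(num_docs: int) -> dict:
    """Pick ranking parameters based on the number of indexed documents."""
    if num_docs < 0:
        raise ValueError("num_docs must not be negative")
    if num_docs <= EXACT_LIMIT:
        # Assumed way of requesting an exact search.
        return {"rank.approximate": "false", "rank.smartfilter": "off"}
    return {"rank.approximate": "true", "rank.smartfilter": "high"}


print(recommended_params(50_000))
print(recommended_params(1_500_000))
```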
rank.threshold is applied on approximate score
The threshold set with rank.threshold is applied to the approximate score, before the exact score is calculated. Since the exact score can change the final result, some returned results may have an exact score below the threshold, and some images may be filtered out even though their exact score would be above the threshold.
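A toy example may make this ordering effect easier to see. The scores below are invented, and the function only mimics the documented order of operations (threshold on the approximate score, then exact scoring); it is not the server's implementation.

```python
def approximate_then_exact(candidates, threshold):
    """Filter on the approximate score, then order survivors by exact score.

    candidates: list of (doc_id, approximate_score, exact_score) tuples.
    """
    survivors = [c for c in candidates if c[1] >= threshold]
    ranked = sorted(survivors, key=lambda c: c[2], reverse=True)
    return [(doc_id, exact) for doc_id, _approx, exact in ranked]


# Invented scores for illustration only.
toy_candidates = [
    ("a", 0.90, 0.88),  # passes the threshold on both scores
    ("b", 0.72, 0.68),  # passes on approximate score, exact score ends up below 0.7
    ("c", 0.69, 0.74),  # filtered out, although its exact score would be above 0.7
]

results = approximate_then_exact(toy_candidates, threshold=0.7)
print(results)  # [('a', 0.88), ('b', 0.68)]
```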
Weight textual and visual relevance
Control the weight of textual and visual relevance in the resulting score when doing visual searches. The default is 1.0 (visual relevance only).
rank.weighting=[0-1]
When doing visual searches, two scores are calculated internally and merged into one resulting score: a textual score based on the textual search parameters (keywords and so on) and a visual score based on the rank.by.* parameter.
With rank.weighting you can control how both scores are merged into the final score, that is, whether the focus is on textual or visual relevance. The following table displays the effect of different values:
Parameter value | Description |
---|---|
0.0 | Textual scoring only |
0.1 - 0.9 | Textual and visual scores are combined according to the specific weighting |
1.0 (default) | Visual scoring only |
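One way to picture the table is a linear blend of the two scores. The exact merging formula is not stated here, so the linear interpolation below is purely an assumption chosen because it reproduces the two documented end points (0.0 is textual only, 1.0 is visual only); the function name combine_scores is hypothetical.

```python
def combine_scores(textual: float, visual: float, weighting: float = 1.0) -> float:
    """Blend textual and visual scores; ASSUMED linear interpolation."""
    if not 0.0 <= weighting <= 1.0:
        raise ValueError("weighting must be between 0 and 1")
    # weighting=0.0 returns the textual score, weighting=1.0 the visual score.
    return (1.0 - weighting) * textual + weighting * visual


# Invented scores for illustration only.
print(combine_scores(0.2, 0.8, 0.0))  # textual only
print(combine_scores(0.2, 0.8, 0.5))  # equal mix under the assumed formula
print(combine_scores(0.2, 0.8))       # default 1.0: visual only
```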