Tag images

The following parameters are only supported by the /tag endpoint.

Tagging input

There are several ways to provide the image to tag.

Tag by upload

Upload and tag a Base64-encoded image file (also known as a Data-URI).

tagging.input.data=[base64-encoded-image-file]

Use a POST request to upload larger files without hitting URI length limits, and send the tagging.input.data parameter in the POST body. Supported image types are jpg, png and gif.
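A minimal sketch of building such a POST body in Python with the standard library only. It assumes the endpoint accepts a form-encoded body and plain Base64 text; whether a data:<mime>;base64, prefix is required is not stated here, so treat that as an assumption. The function name build_upload_body is hypothetical.

```python
import base64
from urllib.parse import parse_qs, urlencode


def build_upload_body(image_bytes: bytes) -> bytes:
    """Return a form-encoded POST body carrying the image as Base64."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # urlencode percent-escapes '+', '/' and '=' so the Base64 text
    # is not altered by form decoding on the receiving side.
    return urlencode({"tagging.input.data": encoded}).encode("ascii")


# Stand-in bytes; in practice read them from a jpg, png or gif file.
image = b"\x89PNG\r\n\x1a\n"
body = build_upload_body(image)

# Round trip: decoding the body yields the original bytes again.
value = parse_qs(body.decode("ascii"))["tagging.input.data"][0]
assert base64.b64decode(value) == image
```

The resulting body would be sent with Content-Type application/x-www-form-urlencoded to the /tag endpoint of your installation; the full URL depends on the deployment and is not shown here.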

Tag by image URL

Tag an image referenced by URL.

tagging.input.url=[image-url]

Valid protocols are HTTP and HTTPS. The image-url must be RFC 2396 compliant and point to a valid, accessible image resource; otherwise an error is returned.
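As an illustration, the query string for this parameter could be assembled as follows. This is a sketch, not the service's own client code: the helper name build_url_query and the example.com address are placeholders, and the protocol check simply mirrors the rule stated above on the client side.

```python
from urllib.parse import urlencode, urlsplit


def build_url_query(image_url: str) -> str:
    """Return the query string for tagging an image referenced by URL."""
    scheme = urlsplit(image_url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported protocol: {scheme!r}")
    # Percent-encode the image URL so its own '?', '&' and '/' are not
    # mixed up with the parameters of the tag request.
    return urlencode({"tagging.input.url": image_url})


query = build_url_query("https://example.com/cat.jpg?size=large")
print(query)
```

Encoding the value matters most when the image URL carries its own query string, as in the example.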

Tag by preprocessed image

Tag an image using the preprocessed JSON returned by the /analyze handler.

tagging.input.preprocessed=[preprocessed JSON string]

The preprocessed JSON string must be URL-encoded.
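A short sketch of the encoding step in Python. The payload below is a made-up placeholder, since the structure returned by the /analyze handler is not described here, and encode_preprocessed is a hypothetical helper name. If you already hold the JSON as a string, only the quote call is needed.

```python
import json
from urllib.parse import quote, unquote


def encode_preprocessed(preprocessed: dict) -> str:
    """Serialize the preprocessed data compactly and percent-encode it."""
    raw = json.dumps(preprocessed, separators=(",", ":"))
    # safe="" also escapes '/', so characters such as '{', '"', ':'
    # and '/' are all percent-encoded.
    return quote(raw, safe="")


# Placeholder payload; the real content comes from the /analyze handler.
payload = {"example": [1, 2, 3]}
param = "tagging.input.preprocessed=" + encode_preprocessed(payload)

# Decoding restores the original JSON document.
assert json.loads(unquote(encode_preprocessed(payload))) == payload
```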

Tag by image ID

Predict tags for an image that is already indexed.

tagging.input.id=[doc-id]

Set data source

Set the field whose stored keywords or labels are used as the source for predicting tags.

tagging.field=[fieldname]

The parameter must refer to a field of type StrField or TextField with stored=true.

Fine-tune predictions

Term analyzer

Preprocess raw tags with an analyzer chain to normalize and clean them before prediction.

tagging.analyzer=[none | fieldname | fieldtype]

By default, the analyzer chain of the tagging.field is used.

In most cases the default behaviour is sufficient and there is no need to set tagging.analyzer explicitly. However, overriding it is a powerful way to influence the resulting predictions.

If a fieldname is provided, the index analyzer chain of its fieldtype is used. To disable the analyzer chain entirely, set this parameter to none.

Min confidence

Defines the minimum confidence a tag needs to be part of the response.

tagging.threshold=[0.0 - 1.0]

The number of tags is either limited by the tagging.threshold or the tagging.max parameter, whichever limit is reached first.

Max tags

Defines the maximum number of predicted tags.

tagging.max=[>=0]

The number of tags is either limited by the tagging.threshold or the tagging.max parameter, whichever limit is reached first.
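The interplay of tagging.threshold and tagging.max can be pictured with the following sketch. It is an illustration of the rule above, not the service's implementation: the tag names and confidences are invented, the comparison is assumed to be inclusive, and any special meaning of tagging.max=0 is not modelled.

```python
def limit_tags(scored_tags, threshold, max_tags):
    """Keep tags with confidence >= threshold, at most max_tags of them."""
    ranked = sorted(scored_tags, key=lambda item: item[1], reverse=True)
    kept = []
    for tag, confidence in ranked:
        if len(kept) >= max_tags or confidence < threshold:
            break  # whichever limit is reached first ends the list
        kept.append((tag, confidence))
    return kept


tags = [("beach", 0.91), ("sea", 0.74), ("sand", 0.52), ("dog", 0.18)]

capped_by_max = limit_tags(tags, threshold=0.5, max_tags=2)
capped_by_threshold = limit_tags(tags, threshold=0.6, max_tags=10)
print(capped_by_max)
print(capped_by_threshold)
```

In the first call three tags pass the threshold but only two are returned; in the second call the cap of ten is never reached because only two tags pass the threshold.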

Num candidate images

Defines how many candidate images are used to calculate the tags.

tagging.inspect=[>=0]

Usually, there is no need to set this parameter explicitly. The default value is suitable for most cases.

Min tags per candidate

Only include images in the tag prediction which have at least n terms (analyzed keywords).

tagging.inspect.minterms=[>=0]

Documents with fewer terms are ignored and removed from the result set. This is useful if the collection contains images with only one or two tags. Because tagging uses a statistical approach, images with few keywords lead to poor predictions: they offer no significant variety or frequency of tags.

Depending on the parameter value and your collection, it may happen that no tags can be predicted at all because all relevant images have too few keywords. See tagging.inspect.overhead for how to avoid such situations.

Candidate overhead

Request n more images for prediction than actually needed, to compensate for images that may be ignored.

tagging.inspect.overhead=[>=0]

When tagging.inspect.minterms is used, fewer candidate images than specified by tagging.inspect may remain, because images with too few terms are ignored. To compensate for the ignored images, set tagging.inspect.overhead to request more images than actually needed.

This acts as a buffer to take images from when other images are ignored. Those overhead images won’t be processed as long as no images are ignored.

For instance, 10 out of 50 candidate images (tagging.inspect=50) have only one keyword. They are ignored, which reduces the number of candidate images that take part in the statistical analysis. If tagging.inspect.overhead is at least 10, the 10 missing candidate images can be taken from the overhead buffer.

The number of internally retrieved documents is tagging.inspect + tagging.inspect.overhead.
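The buffer arithmetic can be traced with a small simulation. This is a sketch of the behaviour described above under simplified assumptions, not the actual implementation: select_candidates is a hypothetical name, documents are reduced to their term counts, and the ranking is invented.

```python
def select_candidates(term_counts, inspect, overhead, minterms):
    """Pick up to `inspect` documents that have at least `minterms` terms.

    term_counts lists the number of analyzed terms per document,
    in ranking order.
    """
    retrieved = term_counts[: inspect + overhead]  # internally fetched documents
    usable = [n for n in retrieved if n >= minterms]
    return usable[:inspect]  # overhead images are only used to fill gaps


# 70 ranked documents: the first 10 carry one keyword, the rest carry eight.
ranked = [1] * 10 + [8] * 60

without_buffer = select_candidates(ranked, inspect=50, overhead=0, minterms=2)
with_buffer = select_candidates(ranked, inspect=50, overhead=10, minterms=2)
print(len(without_buffer), len(with_buffer))  # prints: 40 50
```

Without overhead only 40 of the 50 requested candidates survive the minterms filter; with an overhead of 10, 60 documents are retrieved and the full 50 candidates are available. Images in the overhead range can themselves fall below minterms, so in practice a larger buffer may be needed.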