Tag images
The following parameters are only supported by the /tag endpoint.
Tagging input
There are several ways to provide the image to tag.
Tag by upload
Upload and tag a Base64-encoded image file (also known as a Data URI).
tagging.input.data=[base64-encoded-image-file]
Use a POST request to upload larger files without hitting URI length limits. Send the tagging.input.data parameter in the POST body.
Supported image types are jpg, png and gif.
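A minimal Python sketch of the upload variant follows. It assumes the service accepts a form-encoded POST body and that the parameter value is the plain Base64 string; if your installation expects a full Data URI prefix, prepend it before encoding. The base URL passed to post_tag_request is hypothetical and not part of the original documentation.

```python
import base64
import urllib.request
from urllib.parse import urlencode


def build_upload_body(image_bytes: bytes) -> bytes:
    """Return a form-encoded POST body carrying the image as Base64."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # urlencode percent-escapes '+', '/' and '=', which all occur in Base64.
    return urlencode({"tagging.input.data": encoded}).encode("ascii")


def post_tag_request(base_url: str, image_bytes: bytes) -> bytes:
    """Send the image to <base_url>/tag via POST (base_url is an assumption)."""
    request = urllib.request.Request(
        f"{base_url}/tag",
        data=build_upload_body(image_bytes),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Because the image travels in the request body, its size is not constrained by URI length limits.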
Tag by image URL
Tag an image referenced by URL.
tagging.input.url=[image-url]
Supported protocols are HTTPS and HTTP. The image-url must be RFC 2396 compliant and point to a valid, accessible image resource; otherwise an error is returned.
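As an illustration, the request URL can be assembled in Python as sketched below. The base URL and the example image URL are placeholders chosen for this sketch; only the /tag path and the tagging.input.url parameter name come from this page.

```python
from urllib.parse import urlencode


def build_tag_url(base_url: str, image_url: str) -> str:
    """Build a /tag request URL that references the image by URL."""
    # urlencode escapes ':' and '/' so the image URL survives as one value.
    query = urlencode({"tagging.input.url": image_url})
    return f"{base_url}/tag?{query}"


# Hypothetical endpoint and image location, for illustration only.
print(build_tag_url("https://example.org/api", "https://example.com/cat.jpg"))
```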
Tag by preprocessed image
Tag an image using the preprocessed JSON returned by the /analyze handler.
tagging.input.preprocessed=[preprocessed JSON string]
The preprocessed JSON string must be URL-encoded.
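One way to prepare the value in Python is shown in this sketch. It percent-encodes the JSON string exactly as received, without re-serializing it. The sample JSON is a made-up placeholder, since the actual structure returned by /analyze is not described here.

```python
from urllib.parse import quote


def encode_preprocessed(preprocessed_json: str) -> str:
    """Return the query parameter with the JSON string percent-encoded."""
    # safe="" makes quote() escape every reserved character, including '/'.
    return "tagging.input.preprocessed=" + quote(preprocessed_json, safe="")


# Placeholder payload; the real /analyze output will look different.
sample = '{"placeholder": [1, 2]}'
print(encode_preprocessed(sample))
```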
Tag by image ID
Predict tags for an image that is already indexed.
tagging.input.id=[doc-id]
Set data source
Set the field with stored keywords or labels that should be used as source for predicting tags.
tagging.field=[fieldname]
The fieldname must refer to a field of type StrField or TextField with stored=true.
Fine-tune predictions
Term analyzer
Preprocess raw tags with an analyzer chain to normalize and clean them before prediction.
tagging.analyzer=[none | fieldname | fieldtype]
By default, the analyzer of the field set via tagging.field is used.
In most cases the default behaviour is sufficient and there is no need to set tagging.analyzer explicitly.
However, it is a powerful way to influence the predictions.
If a fieldname is provided, the index analyzer chain of its fieldtype is used.
If no analyzer chain should be used, set this parameter to none.
Min confidence
Defines the minimum confidence a tag needs to be part of the response.
tagging.threshold=[0.0 - 1.0]
The number of tags is limited by either the tagging.threshold or the tagging.max parameter, whichever limit is reached first.
Max tags
Defines the maximum number of predicted tags.
tagging.max=[>=0]
The number of tags is limited by either the tagging.threshold or the tagging.max parameter, whichever limit is reached first.
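The interplay of the two limits can be pictured with a small client-side simulation. This is only a sketch of the documented behaviour, not the service's implementation. It assumes that tags are ranked by descending confidence and that a tag whose confidence equals the threshold is kept. It does not model any special meaning the service may give to a value of 0.

```python
def limit_tags(scored_tags, threshold, max_tags):
    """Keep top-ranked tags until the threshold or max_tags limit is hit."""
    ranked = sorted(scored_tags, key=lambda item: item[1], reverse=True)
    kept = []
    for tag, confidence in ranked:
        # Stop at whichever limit is reached first.
        if confidence < threshold or len(kept) >= max_tags:
            break
        kept.append((tag, confidence))
    return kept


tags = [("cat", 0.9), ("pet", 0.7), ("animal", 0.4), ("sofa", 0.1)]
print(limit_tags(tags, threshold=0.5, max_tags=10))  # threshold cuts first
print(limit_tags(tags, threshold=0.0, max_tags=3))   # max cuts first
```

In the first call only two tags pass the threshold, so the maximum is never reached. In the second call every tag passes the threshold and the list is cut at three entries.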
Num candidate images
Defines how many candidate images are used to calculate the tags.
tagging.inspect=[>=0]
Usually, there is no need to set this parameter explicitly. The default value is suitable for most cases.
Min tags per candidate
Only include images in the tag prediction which have at least n terms (analyzed keywords).
tagging.inspect.minterms=[>=0]
Documents with fewer terms will be ignored and removed from the result set. This is useful if there are images in the collection with only one or two tags. Tagging uses a statistical approach, so images with few keywords lead to poor predictions because they offer no significant variety or frequency of tags.
Depending on the parameter value and your collection, it may happen that no tags can be predicted at all because all relevant images have too few keywords. See tagging.inspect.overhead for how to avoid such situations.
Candidate overhead
Request n more images for prediction than actually needed, to compensate for images that may be ignored.
tagging.inspect.overhead=[>=0]
When using tagging.inspect.minterms, the specified number of inspected candidate images (controlled via tagging.inspect) may be reduced because images can be ignored.
To compensate for ignored images, set an overhead value via tagging.inspect.overhead to request more images than actually needed.
This acts as a buffer to take images from when other images are ignored. Those overhead images won’t be processed as long as no images are ignored.
For instance, suppose 10 out of 50 candidate images (tagging.inspect=50) have only one keyword. They will be ignored, reducing the number of candidate images that are part of the statistical analysis. In this case the 10 missing candidate images can be taken from the overhead buffer.
The number of internally retrieved documents is tagging.inspect + tagging.inspect.overhead.
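The arithmetic of this example can be replayed with a short simulation. This is a sketch of the described behaviour under stated assumptions, not the actual implementation: candidates are represented only by their term counts in ranked order, and a minimum of 2 terms (tagging.inspect.minterms=2) is assumed so that single-keyword images are ignored.

```python
def select_candidates(term_counts, inspect, minterms, overhead):
    """Return the term counts of the images that enter the analysis."""
    # The service retrieves inspect + overhead documents internally.
    retrieved = term_counts[: inspect + overhead]
    # Images with fewer than minterms terms are ignored.
    usable = [count for count in retrieved if count >= minterms]
    # Overhead images are only used to refill slots of ignored images.
    return usable[:inspect]


# 40 well-tagged images, 10 with one keyword, then 10 well-tagged spares.
ranked = [5] * 40 + [1] * 10 + [5] * 10

print(len(select_candidates(ranked, inspect=50, minterms=2, overhead=0)))   # 40
print(len(select_candidates(ranked, inspect=50, minterms=2, overhead=10)))  # 50
```

Without overhead the analysis runs on 40 images only. With an overhead of 10, 60 documents are retrieved and the 10 ignored images are replaced from the buffer.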