`/update` endpoint

The /update handler adds documents to the index and deletes documents from it.

Storage scheme

Data is stored in so-called documents. A document contains one or more fields, which are essentially key:value pairs of the form fieldname:fieldvalue. Fields are typed, and the field type determines which search operations can be performed on a field (e.g. range queries on numerical fields). The schema defines all fields, including their field types and indexing options (e.g. whether they are indexed, stored or multivalued).

The following JSON represents a simple document with just one field:

{
  "id":"123"
}

Preconfigured schema

To ease the first usage of Flow, we have preconfigured the schema. The following preconfigured fields are worth knowing:

id - mandatory unique document identifier as string
image - URL pointing to the image to be visually searchable
labels - one or more labels like keywords or categories as string array
location - geo coordinates for spatial search as string (lat,lon)

The id field is the only required field; all other fields are optional.

Automatic field adding

Add new fields as you need them. The field type is derived automatically from the first value Flow sees for that field. Boolean, Double, Long and String fields are supported. You can also add dates as strings; they are detected and converted to actual date fields, which allows time-based searches such as time spans.
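To illustrate why the first value matters, the following Python sketch guesses a field type from a single value. It is not Flow's actual detection logic: the function name guess_field_type is hypothetical, and treating only ISO dates such as "2023-04-12" as dates is an assumption, since the accepted date formats are not listed here.

```python
from datetime import date


def guess_field_type(value):
    """Guess a field type from the first value seen (illustrative only)."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Long"
    if isinstance(value, float):
        return "Double"
    if isinstance(value, str):
        try:
            # Assumption: only ISO dates like "2023-04-12" count as dates.
            date.fromisoformat(value)
            return "Date"
        except ValueError:
            return "String"
    raise TypeError(f"unsupported value type: {type(value).__name__}")
```

Under this sketch, a field whose first value is 123 would become a Long field, so sending 0.125 first is the safer choice if the field is meant to hold decimals.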

Supported fields and types

The JSON below defines one valid document and demonstrates different fields and field types.

{
  "id":"1",
  "image":"http://localhost/my-image.jpg",
  "labels":["category1", "multivalued", "another label"],
  "location":"52.50460958245561,13.418979410930593",
  "new_string_field":"the location above points to our office",
  "new_boolean_field":true,
  "new_date_field":"2023-04-12",
  "new_numeric_field":123,
  "new_double_field":0.125
}

To keep things simple, all new fields have to be single-valued.
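The rules above (id is mandatory, labels may hold several values, new fields must be single-valued) can be checked on the client before sending a document. The helper below is a minimal sketch; the name check_document and its error messages are hypothetical and not part of Flow.

```python
def check_document(doc, multivalued=("labels",)):
    """Return a list of problems found in a document dict (empty if none)."""
    problems = []
    doc_id = doc.get("id")
    # The id field is the only required field and must be a string.
    if not isinstance(doc_id, str) or not doc_id:
        problems.append("id must be a non-empty string")
    for name, value in doc.items():
        # Only the preconfigured multivalued fields may carry a list.
        if isinstance(value, (list, tuple)) and name not in multivalued:
            problems.append(f"{name} must be single-valued")
    return problems
```

For example, check_document({"id": "1", "tags": ["a", "b"]}) reports that the new field tags must be single-valued.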

Fulltext search

All values of string fields are copied to a default search field (text). This enables a full-text search across all string fields, without the need to specify a concrete field to search in.

Spatial search using geo coordinates

You can use the preconfigured location field to add a geo-location to your document. The field expects the GPS latitude and longitude as one comma-separated string "lat,lon".
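If your coordinates are stored as numbers, a small helper can build the "lat,lon" string and catch swapped or out-of-range values early. This is a sketch; format_location is a hypothetical name, and the range check is a general property of GPS coordinates rather than something Flow is documented to enforce.

```python
def format_location(lat, lon):
    """Build the "lat,lon" string expected by the location field."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    # Latitude comes first, then longitude, separated by a single comma.
    return f"{lat},{lon}"


doc = {"id": "1", "location": format_location(52.5, 13.4)}
```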

Add documents

Add a single JSON document.

curl --request POST \
  --url 'http://localhost:8983/api/cores/my-collection/update' \
  --header 'Content-Type: application/json' \
  --data '  {
    "id" : "id-1",
    "image" : "YOUR_IMAGE_URL",
    "labels" : ["ball","sports"]
  }'

Add a list of documents by wrapping the JSON documents in a JSON array.

curl --request POST \
  --url 'http://localhost:8983/api/cores/my-collection/update' \
  --header 'Content-Type: application/json' \
  --data '  [
  {
    "id" : "id-1",
    "image" : "YOUR_IMAGE_URL",
    "category" : "category1"
  },
  {
    "id" : "id-2",
    "image" : "YOUR_IMAGE_URL",
    "category" : "category2"
  }
  ]'

Add documents from a JSON file.

curl --request POST \
  --url 'http://localhost:8983/api/cores/my-collection/update' \
  --header 'Content-Type: application/json' \
  --data @docs.json
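The same request can be prepared from a script. The sketch below uses only the Python standard library and mirrors the curl calls above; the function name build_update_request is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

UPDATE_URL = "http://localhost:8983/api/cores/my-collection/update"


def build_update_request(docs, url=UPDATE_URL):
    """Build a POST request carrying one document (dict) or several (list)."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


docs = [
    {"id": "id-1", "image": "YOUR_IMAGE_URL", "category": "category1"},
    {"id": "id-2", "image": "YOUR_IMAGE_URL", "category": "category2"},
]
request = build_update_request(docs)
# To actually send it to a running instance:
# urllib.request.urlopen(request)
```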

Speed up indexing with modules.apply

Use modules.apply in your request to use only specific modules for image analysis, enhancing performance by skipping analysis steps unnecessary for your use case. For example, http://localhost:8983/api/cores/my-collection/update?modules.apply=content focuses on visual searches. See API reference for details.
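When building the request URL in code, the parameter can be appended with the standard library. In this sketch the helper name update_url is hypothetical, and content is the only module value taken from the text above.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/api/cores/my-collection/update"


def update_url(params=None, base=BASE_URL):
    """Return the update URL, optionally with query parameters appended."""
    if not params:
        return base
    return f"{base}?{urlencode(params)}"


url = update_url({"modules.apply": "content"})
```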

Add local images

Flow is designed to work in a distributed environment with images accessible via URL. The image field therefore expects a URL as its value, which local image files do not have.

Setup local file server

If you run Flow locally, we recommend serving your local images via a simple file server.

Serve images in the working directory using Python server

python -m http.server --directory . 8001

The above command starts a Python server with your current working directory as the file root. Imagine the file structure of your working directory as follows:

.
├── image1.jpg
└── folder1
    └── image2.jpg

You can then add your local images via URLs.

curl --request POST \
  --url 'http://localhost:8983/api/cores/my-collection/update' \
  --header 'Content-Type: application/json' \
  --data '  [
  {
    "id" : "id-1",
    "image" : "http://localhost:8001/image1.jpg"
  },
  {
    "id" : "id-2",
    "image" : "http://localhost:8001/folder1/image2.jpg"
  }
  ]'
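For larger folders, the document list can be generated instead of typed by hand. The sketch below walks a directory and maps each image to its file server URL. Several parts are assumptions: the function name docs_for_directory, the use of the relative path as document id, and the list of image suffixes (the formats Flow accepts are not stated here).

```python
from pathlib import Path
from urllib.parse import quote

# Assumption: which suffixes count as images; adjust to your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def docs_for_directory(root, base_url="http://localhost:8001"):
    """Create one document per image file found below root."""
    root = Path(root)
    docs = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            # Path relative to the file server root, with forward slashes.
            rel = path.relative_to(root).as_posix()
            # Hypothetical id scheme: the relative path doubles as the id.
            docs.append({"id": rel, "image": f"{base_url}/{quote(rel)}"})
    return docs
```

Calling docs_for_directory(".") in the working directory shown above would yield documents for image1.jpg and folder1/image2.jpg, ready to be posted as a JSON array.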

Use --network host when running the Docker image so that Flow can access the localhost image URLs.

Without file server

If you want to index your local images without a file server, first analyze the images using the /analyze handler, then import the pre-processed JSON data using the /update handler and the special import field.

Please follow the steps described here.

Local file paths cannot be displayed when using wt=html

Due to browser security restrictions, websites are not allowed to load local images via local file paths. The html response writer can only display images that are accessible via URL (e.g. localhost).

Delete documents

Please note that the context path for delete requests differs from that of all other update requests. Due to limitations in the underlying framework, you need to use /solr instead of /api/cores for delete requests.

Delete by ID

Delete the documents with ids 1, 2 and 3.

curl --request POST \
  --url 'http://localhost:8983/solr/my-collection/update' \
  --header 'Content-Type: application/json' \
  -d '{"delete":["1","2","3"]}'
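A scripted version of the same call is sketched below with the Python standard library. The function name build_delete_by_id_request is hypothetical. Note the /solr context path in the URL, and that the request is only built, not sent.

```python
import json
import urllib.request

# Delete requests use the /solr context path instead of /api/cores.
DELETE_URL = "http://localhost:8983/solr/my-collection/update"


def build_delete_by_id_request(ids, url=DELETE_URL):
    """Build a POST request that deletes the documents with the given ids."""
    # Ids are strings in the schema, so numeric ids are converted.
    payload = {"delete": [str(doc_id) for doc_id in ids]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


delete_request = build_delete_by_id_request([1, 2, 3])
# To actually send it to a running instance:
# urllib.request.urlopen(delete_request)
```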

Delete by query

To delete all documents from the index and empty the collection, send a wildcard query. The query *:* matches all documents and deletes them.

Delete all documents:

curl --request POST \
  --url 'http://localhost:8983/solr/my-collection/update' \
  --header 'Content-Type: application/json' \
  -d '{"delete":{"query":"*:*"}}'

Delete by field value:

curl --request POST \
  --url 'http://localhost:8983/solr/my-collection/update' \
  --header 'Content-Type: application/json' \
  -d '{"delete":{"query":"labels:dog"}}'

To delete documents matching certain criteria, replace the wildcard with a different query (field:value).
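The two request bodies differ only in the query string, which the following sketch makes explicit. The function name delete_query_payload is hypothetical, and no escaping is applied, so values containing spaces or query syntax characters would need extra handling.

```python
import json


def delete_query_payload(field=None, value=None):
    """Return the JSON body for a delete-by-query request."""
    # Without a field, fall back to the wildcard query that matches everything.
    query = "*:*" if field is None else f"{field}:{value}"
    return json.dumps({"delete": {"query": query}})


delete_all = delete_query_payload()
delete_dogs = delete_query_payload("labels", "dog")
```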

Commit index changes

Changes to the index are only visible after they have been committed. Flow auto-commits every 15 seconds. To make sure you see the latest index changes, you can trigger a commit manually.

curl --request POST \
'http://localhost:8983/api/cores/my-collection/update?commit=true&openSearcher=true'
