/update endpoint
The /update handler adds documents to the index and deletes documents from it.
Storage scheme
Data is stored in so-called documents.
A document contains one or more fields, which are essentially key:value pairs of the form fieldname:fieldvalue.
Fields are typed, and the field type determines which search operations can be performed on a field (e.g. range queries for numerical fields).
The schema defines all fields including their field types and indexing options (e.g. whether they are indexed, stored or multivalued).
The following JSON represents a simple document with just one field:
{
"id":"123"
}
Preconfigured schema
To make getting started with Flow easier, the schema is preconfigured. The following preconfigured fields are worth knowing:
id - mandatory unique document identifier as string
image - URL pointing to the image to be visually searchable
labels - one or more labels like keywords or categories as string array
location - geo coordinates for spatial search as string (lat,lon)
The id field is the only required field; all other fields are optional.
Automatic field adding
Add new fields as you need them. The field type is automatically derived from the first value Flow sees for that field.
Boolean, Double, Long and String fields are supported. You can also add dates as strings; they are detected and converted to actual date fields to allow time-based searches such as time spans.
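The mapping from a first value to a field type can be pictured with the sketch below. It is purely illustrative and is not Flow's detection logic: the function name guess_field_type is hypothetical, and the assumption that dates are recognized in ISO form (as in "2023-04-12") comes only from the example document, since the accepted date formats are not listed here.

```python
from datetime import date


def guess_field_type(value):
    """Return an illustrative type name for the first value of a new field."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Long"
    if isinstance(value, float):
        return "Double"
    if isinstance(value, str):
        try:
            # Assumption: ISO dates such as "2023-04-12" count as dates.
            date.fromisoformat(value)
            return "Date"
        except ValueError:
            return "String"
    raise TypeError(f"unsupported value for a new field: {value!r}")


for sample in (True, 123, 0.125, "2023-04-12", "some text"):
    print(repr(sample), "->", guess_field_type(sample))
```

Because the type is derived from the first value, it may be worth sending a value of the intended type the first time a field is used.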
Supported fields and types
The JSON below defines one valid document and demonstrates different fields and field types.
{
"id":"1",
"image":"http://localhost/my-image.jpg",
"labels":["category1", "multivalued", "another label"],
"location":"52.50460958245561,13.418979410930593",
"new_string_field":"the location above points to our office",
"new_boolean_field":true,
"new_date_field":"2023-04-12",
"new_numeric_field":123,
"new_double_field":0.125
}
To keep things simple, all new fields have to be single-valued.
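A client-side check for the two rules stated so far (id is mandatory, new fields are single-valued) might look like the following sketch. The helper check_document and the MULTIVALUED_FIELDS set are hypothetical names; only labels is treated as multivalued because it is the only field described as an array in this document.

```python
# Preconfigured multivalued fields named in this document (labels is a string array).
MULTIVALUED_FIELDS = {"labels"}


def check_document(doc):
    """Return a list of problems found in a document dict (empty if none)."""
    problems = []
    doc_id = doc.get("id")
    if not isinstance(doc_id, str) or not doc_id:
        problems.append("missing mandatory string field 'id'")
    for name, value in doc.items():
        # Any list value outside the known multivalued fields breaks the rule.
        if isinstance(value, list) and name not in MULTIVALUED_FIELDS:
            problems.append(f"field '{name}' must be single-valued")
    return problems


print(check_document({"id": "1", "labels": ["category1", "multivalued"]}))
print(check_document({"labels": ["a"], "new_field": ["x", "y"]}))
```

Running such a check before posting lets you catch malformed documents without a round trip to the server.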
Fulltext search
All values of string fields are copied to a default search field (text). This enables a full-text search across all string fields, without the need to specify a concrete field to search in.
Spatial search using geo coordinates
You can use the preconfigured location field to add a geo-location to your document.
The field expects the latitude and longitude GPS coordinates as one comma-separated string "lat,lon".
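If your coordinates are stored as numbers, a small helper can produce the "lat,lon" string. This is a minimal sketch; format_location is a hypothetical name, and the range checks are ordinary latitude/longitude bounds rather than something this document specifies.

```python
def format_location(lat, lon):
    """Format numeric coordinates as the "lat,lon" string used by the location field."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    # Latitude comes first, then longitude, separated by a single comma.
    return f"{lat},{lon}"


doc = {"id": "1", "location": format_location(52.5, 13.4)}
print(doc)
```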
Add documents
curl --request POST \
--url 'http://localhost:8983/api/cores/my-collection/update' \
--header 'Content-Type: application/json' \
--data ' {
"id" : "id-1",
"image" : "YOUR_IMAGE_URL",
"labels" : ["ball","sports"]
}'
To add a list of documents, wrap the JSON documents in a JSON array.
curl --request POST \
--url 'http://localhost:8983/api/cores/my-collection/update' \
--header 'Content-Type: application/json' \
--data ' [
{
"id" : "id-1",
"image" : "YOUR_IMAGE_URL",
"category" : "category1"
},
{
"id" : "id-2",
"image" : "YOUR_IMAGE_URL",
"category" : "category2"
}
]'
To add documents from a file, reference the file with --data @docs.json.
curl --request POST \
--url 'http://localhost:8983/api/cores/my-collection/update' \
--header 'Content-Type: application/json' \
--data @docs.json
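The same add request can be prepared from Python using only the standard library. The sketch below builds the request without sending it; the URL and collection name are copied from the curl examples, and build_add_request is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: host, port and collection name as used in the curl examples.
UPDATE_URL = "http://localhost:8983/api/cores/my-collection/update"


def build_add_request(docs, url=UPDATE_URL):
    """Build (but do not send) a POST request that adds one or more documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


docs = [
    {"id": "id-1", "image": "YOUR_IMAGE_URL", "category": "category1"},
    {"id": "id-2", "image": "YOUR_IMAGE_URL", "category": "category2"},
]
request = build_add_request(docs)
print(request.get_method(), request.full_url)

# To send it against a running Flow instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Passing a single dict instead of a list produces the single-document form of the request.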
Add local images
Flow is designed to work in a distributed environment with images accessible via URL.
Therefore, the image field expects a URL as its value, which local images do not have by default.
Set up a local file server
If you run Flow locally, we recommend serving your local images via a simple file server.
python -m http.server --directory . 8001
.
├── image1.jpg
└── folder1
    └── image2.jpg
You can then add your local images via URLs.
curl --request POST \
--url 'http://localhost:8983/api/cores/my-collection/update' \
--header 'Content-Type: application/json' \
--data ' [
{
"id" : "id-1",
"image" : "http://localhost:8001/image1.jpg"
},
{
"id" : "id-2",
"image" : "http://localhost:8001/folder1/image2.jpg"
}
]'
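When there are many images, the URLs can be derived from the file paths instead of being typed by hand. The following is a sketch under assumptions: image_url is a hypothetical helper, paths are written with forward slashes, and the file server from the example above is serving the given directory on port 8001.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def image_url(image_path, served_dir=".", base_url="http://localhost:8001"):
    """Map a file path inside the served directory to its file-server URL."""
    relative = PurePosixPath(image_path).relative_to(served_dir)
    # quote() escapes characters such as spaces but keeps "/" separators.
    return f"{base_url}/{quote(relative.as_posix())}"


paths = ["image1.jpg", "folder1/image2.jpg"]
docs = [
    {"id": f"id-{number}", "image": image_url(path)}
    for number, path in enumerate(paths, start=1)
]
print(docs)
```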
Use --network host when running the Docker image to access the localhost image URLs.
Without file server
If you want to index your local images without a file server, you first have to analyze the images using the /analyze handler and then import the pre-processed JSON data using the /update handler and the special import field.
Please follow the steps described here.
Local file paths cannot be displayed when using wt=html
Due to browser security restrictions, websites are not allowed to load local images via local file paths.
The html response writer can only display images that are accessible via URL (e.g. localhost).
Delete documents
Please note that the context path for delete requests differs from that of all other update requests.
Due to limitations in the underlying framework, you need to use /solr instead of /api/cores for delete requests.
Delete by ID
Delete the documents with ids 1, 2 and 3.
curl --request POST \
--url 'http://localhost:8983/solr/my-collection/update' \
--header 'Content-Type: application/json' \
-d '{"delete":["1","2","3"]}'
Delete by query
To delete all documents from the index and empty the collection, you can send a wildcard query.
The query *:* matches all documents, so all of them are deleted.
curl --request POST \
--url 'http://localhost:8983/solr/my-collection/update' \
--header 'Content-Type: application/json' \
-d '{"delete":{"query":"*:*"}}'
To delete documents matching certain criteria, replace the wildcard with a different query (field:value). For example, the following request deletes all documents matching labels:dog.
curl --request POST \
--url 'http://localhost:8983/solr/my-collection/update' \
--header 'Content-Type: application/json' \
-d '{"delete":{"query":"labels:dog"}}'
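The two delete payloads described above differ only in the value of the delete key: a list of ids, or an object holding a query. A minimal sketch that builds both bodies is shown here; the function names are hypothetical, and the URL (with the /solr context path) is copied from the curl examples.

```python
import json

# Delete requests use the /solr context path, as noted above.
DELETE_URL = "http://localhost:8983/solr/my-collection/update"


def delete_by_ids(ids):
    """Return the JSON body that deletes the documents with the given ids."""
    # Ids are strings in the schema, so numeric ids are converted.
    return json.dumps({"delete": [str(doc_id) for doc_id in ids]})


def delete_by_query(query):
    """Return the JSON body that deletes all documents matching the query."""
    return json.dumps({"delete": {"query": query}})


print(delete_by_ids([1, 2, 3]))
print(delete_by_query("labels:dog"))
```

Either body would be sent as a POST to DELETE_URL with the Content-Type header set to application/json.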
Commit index changes
Changes to the index become visible only after they are committed. Flow auto-commits every 15 seconds. To make sure you see the latest index changes, you can trigger a commit manually.
curl --request POST \
'http://localhost:8983/api/cores/my-collection/update?commit=true&openSearcher=true'
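In a script, the commit can be triggered right after a batch of updates. The sketch below only constructs the request; build_commit_request is a hypothetical helper, and the host, collection name and query parameters are copied from the curl example above.

```python
import urllib.request
from urllib.parse import urlencode


def build_commit_request(base_url="http://localhost:8983", collection="my-collection"):
    """Build (but do not send) the POST request that triggers a manual commit."""
    query = urlencode({"commit": "true", "openSearcher": "true"})
    url = f"{base_url}/api/cores/{collection}/update?{query}"
    return urllib.request.Request(url, method="POST")


commit_request = build_commit_request()
print(commit_request.get_method(), commit_request.full_url)

# To send it against a running Flow instance:
# with urllib.request.urlopen(commit_request) as response:
#     print(response.status)
```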