Elasticsearch provides the main search functionality within Gentics Mesh.
When enabled, it is possible to search for users, groups, roles, nodes, projects, tags, tag families, schemas and microschemas.
Search queries can be executed via the dedicated search REST endpoints or GraphQL.
You can use Elasticsearch queries to search for data. Please note that the format of the documents which can be searched differs from the format which Gentics Mesh returns via the REST API. This difference will affect your queries.
The JSON format of the documents stored in Elasticsearch differs from the JSON format that is returned via the regular Gentics Mesh endpoints. It is therefore important to know the Elasticsearch document format when building an Elasticsearch query.
Internally, Gentics Mesh checks which roles of the user match the roles required by the documents and thus only returns elements which are visible to the user. This is done by nesting the input query inside an outer boolean query which includes the needed filter terms.
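The wrapping described above can be pictured with a minimal sketch. It assumes that the permission filter is a terms filter on the _roleUuids field that appears in the stored documents; the function name and the exact shape of the outer query are illustrative assumptions, not the actual Gentics Mesh implementation.

```python
import json


def wrap_with_role_filter(user_query, role_uuids):
    """Nest the caller's query inside an outer bool query with a role filter.

    Assumption: visibility is checked against the _roleUuids field.
    """
    return {
        "bool": {
            # The original query is kept untouched and still decides what matches.
            "must": user_query,
            # Filter clauses do not influence scoring; they only restrict the result.
            "filter": {"terms": {"_roleUuids": list(role_uuids)}},
        }
    }


user_query = {"term": {"schema.name": "content"}}
wrapped = wrap_with_role_filter(user_query, ["ddcddceb33d648318ddceb33d618314f"])
print(json.dumps(wrapped, indent=2))
```

Placing the role check in the filter clause keeps the relevance score of the input query unchanged.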
It is not possible to search for specific individual versions. Instead, only the published and draft versions per project branch are stored in the search index.
The stored documents within the Elasticsearch indices do not contain all properties which are otherwise available via REST. Only directly accessible values which have minimal dependencies on other elements are stored in order to keep the update effort manageable.
Unlike Mesh, Elasticsearch does not allow an unbounded page size in the search result. The default value for the perPage parameter is 10.
The Elasticsearch connection can be configured within the mesh.yml configuration file.
search:
  url: "http://localhost:9200"
  timeout: 8000
Configuration | Type | Default | Description |
---|---|---|---|
search.url | String | | URL to the Elasticsearch server. |
search.username | String | - | Username for basic authentication. |
search.password | String | - | Password for basic authentication. |
search.certPath | String | - | Path to the trusted server certificate (PEM format). |
search.caPath | String | - | Path to the trusted CA certificate (PEM format). |
search.hostnameVerification | Boolean | | Flag to control SSL hostname verification. |
search.timeout | Number | | Timeout for interactions with the search server. |
 | Boolean | | Flag that is used to enable or disable the automatic startup and handling of the embedded Elasticsearch server. |
 | String | Default JVM Arguments | Set the JVM arguments for the embedded Elasticsearch server process. |
 | String | | Elasticsearch installation prefix. Multiple Gentics Mesh installations with different prefixes can utilize the same Elasticsearch server. |
 | Number | | Upper size limit for bulk requests. |
 | Number | | Upper limit for the total encoded string length of the bulk requests. |
 | Number | | Upper limit for Mesh events that are to be mapped to Elasticsearch requests. |
 | Number | | The maximum amount of time in milliseconds between two bulkable requests before they are sent. |
 | Number | | The maximum amount of time in milliseconds between two successful requests before the idle event is emitted. |
 | Number | | The time in milliseconds between retries of Elasticsearch requests in case of a failure. |
 | Number | | The number of retries on a single request before the request is discarded. |
 | Boolean | | If true, search endpoints wait for Elasticsearch to be idle before sending a response. |
 | Boolean | | If true, the content and metadata of binary fields will be included in the search index. |
search.mappingMode | String | DYNAMIC | This setting controls the mapping mode of fields for Elasticsearch. When set to STRICT only fields which have a custom mapping will be added to Elasticsearch. Mode DYNAMIC will automatically use the Gentics Mesh default mappings which can be supplemented with custom mappings. |
The search.url parameter must point to the Elasticsearch installation that you want to use.
You can also use environment variables to configure Gentics Mesh and change these settings.
It is also possible to completely turn off the search support by setting the search.url property to null.
You can add Elasticsearch support at any point and invoke an index sync via the REST API to populate the search indices.
We currently run and test against Elasticsearch versions 6.8.1 and 7.4.0. Other versions have not yet been tested.
The search.complianceMode setting or the MESH_ELASTICSEARCH_COMPLIANCE_MODE environment variable can be used to control the compliance levels that influence Elasticsearch compatibility.
Levels:
ES_7 - Support Elasticsearch 7.x
PRE_ES_7 - Support Elasticsearch 6.x (Default)
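As a small illustration of the two levels, the following sketch picks a compliance mode from an Elasticsearch version string. The helper name is hypothetical, and rejecting every major version other than 6 and 7 is an assumption made here because other versions are not covered by the text.

```python
def compliance_mode_for(es_version):
    """Return the compliance level matching an Elasticsearch version string."""
    major = int(es_version.split(".")[0])
    if major == 7:
        return "ES_7"
    if major == 6:
        return "PRE_ES_7"
    # Only 6.x and 7.x are named in the documentation above.
    raise ValueError(f"no documented compliance mode for Elasticsearch {es_version}")


print(compliance_mode_for("7.4.0"))
print(compliance_mode_for("6.8.1"))
```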
Gentics Mesh supports basic authentication and TLS in-transit encrypted connections for Elasticsearch communication.
The Elasticsearch server can be configured to use TLS/SSL encryption. Details on this topic can be found in the Elasticsearch documentation.
You can specify the certificate chain (server certificate and certificate authority (CA) certificate) of the Elasticsearch server in the Gentics Mesh configuration in order to only trust this connection.
search:
  url: "https://127.0.0.1:9200"
  username: "elastic"
  password: "iucee7dohjaedemiShoshie9eiz4af0Oiceish6a"
  certPath: "certs/elastic-certificates.crt.pem"
  caPath: "certs/elastic-stack-ca.crt.pem"
  hostnameVerification: true
The certPath and caPath settings only accept certificate files in PEM format.
For Elasticsearch servers which already use a trusted certificate, the certificate does not need to be specified in the configuration. It can, however, be configured in order to trust only the given certificate chain.
The settings can also be configured via environment variables:
MESH_ELASTICSEARCH_CERT_PATH - Override the cert path
MESH_ELASTICSEARCH_CA_PATH - Override the ca path
MESH_ELASTICSEARCH_HOSTNAME_VERIFICATION - Override the hostname verification
Connections to Elasticsearch can be authenticated using a user. Details on how to configure this in Elasticsearch can be found in the Elasticsearch documentation.
When xpack.security.enabled has been enabled in Elasticsearch, you can use the elasticsearch-setup-passwords tool to set the passwords of the built-in users.
The authentication details can be configured using the search.username and search.password settings in the mesh.yml.
The username and password will only be used when specified. By default no user will be used.
The settings can also be configured via environment variables:
MESH_ELASTICSEARCH_USERNAME - Override the configured username
MESH_ELASTICSEARCH_PASSWORD - Override the configured password
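The override behaviour of the environment variables listed in this section can be sketched as follows. The variable names and the setting names are taken from the text; the helper itself and the way the boolean value is parsed are assumptions for illustration and do not describe how Gentics Mesh reads its configuration internally.

```python
import os

# Environment variable -> key below "search" in mesh.yml.
ENV_OVERRIDES = {
    "MESH_ELASTICSEARCH_USERNAME": "username",
    "MESH_ELASTICSEARCH_PASSWORD": "password",
    "MESH_ELASTICSEARCH_CERT_PATH": "certPath",
    "MESH_ELASTICSEARCH_CA_PATH": "caPath",
    "MESH_ELASTICSEARCH_HOSTNAME_VERIFICATION": "hostnameVerification",
}


def effective_search_settings(file_settings, environ=None):
    """Return the file settings with environment overrides applied on top."""
    environ = os.environ if environ is None else environ
    settings = dict(file_settings)
    for env_name, key in ENV_OVERRIDES.items():
        if env_name in environ:
            value = environ[env_name]
            if key == "hostnameVerification":
                # Assumption: the flag is passed as the text "true" or "false".
                value = value.strip().lower() == "true"
            settings[key] = value
    return settings


from_file = {"url": "https://127.0.0.1:9200", "username": "elastic"}
fake_env = {"MESH_ELASTICSEARCH_USERNAME": "mesh", "MESH_ELASTICSEARCH_HOSTNAME_VERIFICATION": "false"}
print(effective_search_settings(from_file, fake_env))
```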
Search requests are handled by the /api/v2/search or /api/v2/:projectName/search endpoints.
If you want the search endpoints to wait until Elasticsearch has processed all pending changes to the index, you can set the query parameter ?wait=true.
The default value can be configured in the search options.
If neither the query parameter nor the configuration is provided, the wait parameter defaults to true in order not to break existing implementations. In the future this will change to false.
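A search request with an explicit wait parameter could be prepared as in the following sketch. It only builds the request object and does not send it. The base URL http://localhost:8080, the helper name and the placement of the element type after a project-scoped search path are assumptions; authentication is left out.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(base_url, element_type, query, project=None, wait=None):
    """Build (but do not send) a POST request for a search endpoint."""
    prefix = f"/api/v2/{urllib.parse.quote(project)}" if project else "/api/v2"
    url = f"{base_url.rstrip('/')}{prefix}/search/{element_type}"
    if wait is not None:
        # Only add the parameter when the caller wants to override the default.
        url += "?" + urllib.parse.urlencode({"wait": "true" if wait else "false"})
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": "application/json"}
    )


query = {"query": {"match_all": {}}}
request = build_search_request("http://localhost:8080", "users", query, wait=False)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would execute the search against a running instance.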
Endpoint: /api/v2/search/users
{
  "query": {
    "simple_query_string" : {
      "query": "myusername*",
      "fields": ["username.raw"],
      "default_operator": "and"
    }
  }
}
Endpoint: /api/v2/search/groups
{
  "query": {
    "simple_query_string" : {
      "query": "testgroup*",
      "fields": ["name.raw^5"],
      "default_operator": "and"
    }
  }
}
Endpoint: /api/v2/search/roles
Endpoint: /api/v2/search/nodes
Listed below is an example search query which can be posted to /api/v2/search/nodes in order to find all nodes across all projects which were created using the content schema.
The found nodes will be sorted in ascending order by creation date.
{
  "sort" : {
    "created" : { "order" : "asc" }
  },
  "query":{
    "bool" : {
      "must" : {
        "term" : { "schema.name" : "content" }
      }
    }
  }
}
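When such queries are generated by a client, the schema name and the sort order are usually the only variable parts. The following sketch builds the query shown above from those values; the function name and its parameters are illustrative assumptions.

```python
import json


def nodes_by_schema(schema_name, sort_field="created", order="asc"):
    """Build a node search body that filters by schema name and sorts the hits."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return {
        "sort": {sort_field: {"order": order}},
        "query": {"bool": {"must": {"term": {"schema.name": schema_name}}}},
    }


print(json.dumps(nodes_by_schema("content"), indent=2))
```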
Search nodes by micronode field values
When searching by a micronode field value, the parameters after fields.<fieldname>. need to be fields-<name of microschema>.<field of microschema>. The full key would thus be: "fields.<field name of schema>.fields-<microschema name>.<field name of microschema>"
Example: Find all nodes which have a micronode list field (vcardlist) that contains at least one micronode of the microschema 'vcard_details' whose two string fields (firstName, lastName) have the values ("Joe", "Doe"):
{
  "query": {
    "nested": {
      "path": "fields.vcardlist",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "fields.vcardlist.fields-vcard_details.firstName": "Joe"
              }
            },
            {
              "match": {
                "fields.vcardlist.fields-vcard_details.lastName": "Doe"
              }
            }
          ]
        }
      }
    }
  }
}
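The key format is easy to get wrong by hand, so a small helper can assemble it. The sketch below derives the full key from the three name parts and rebuilds the query above; both function names are hypothetical.

```python
def micronode_field_key(field_name, microschema_name, micro_field_name):
    """Return the document key of a field inside a micronode."""
    return f"fields.{field_name}.fields-{microschema_name}.{micro_field_name}"


def micronode_list_query(field_name, microschema_name, expected_values):
    """Build a nested query that matches all expected micronode field values."""
    clauses = [
        {"match": {micronode_field_key(field_name, microschema_name, name): value}}
        for name, value in expected_values.items()
    ]
    return {
        "query": {
            "nested": {
                "path": f"fields.{field_name}",
                "query": {"bool": {"must": clauses}},
            }
        }
    }


print(micronode_field_key("vcardlist", "vcard_details", "firstName"))
body = micronode_list_query("vcardlist", "vcard_details", {"firstName": "Joe", "lastName": "Doe"})
```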
Search nodes which are tagged 'Solar' and 'Blue'
The tags field is a nested field and thus a nested query must be used to match the two tags. Please note that you need to use match_phrase because you want to match the whole tag name. Using match would cause Elasticsearch to match any of the trigrams found within the tag name value.
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": [
            {
              "match_phrase": {
                "tags.name": "Solar"
              }
            },
            {
              "match_phrase": {
                "tags.name": "Blue"
              }
            }
          ]
        }
      }
    }
  }
}
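For an arbitrary number of tags, the same structure can be generated with one match_phrase clause per tag name. This is a minimal sketch with a hypothetical function name; it produces exactly the query shown above for the two example tags.

```python
def nodes_with_all_tags(tag_names):
    """Build a nested query that requires every given tag name to match."""
    clauses = [{"match_phrase": {"tags.name": name}} for name in tag_names]
    return {
        "query": {
            "nested": {
                "path": "tags",
                "query": {"bool": {"must": clauses}},
            }
        }
    }


tag_query = nodes_with_all_tags(["Solar", "Blue"])
print(len(tag_query["query"]["nested"]["query"]["bool"]["must"]))
```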
Search nodes which have been tagged with 'Diesel' for tagFamily 'Fuels':
{
  "query": {
    "nested": {
      "ignore_unmapped": "true",
      "path": "tagFamilies.Fuels.tags",
      "query": {
        "match": {
          "tagFamilies.Fuels.tags.name": "Diesel"
        }
      }
    }
  }
}
The "ignore_unmapped": "true" setting will suppress errors which may be returned due to missing paths. This is useful when invoking /rawSearch requests which will directly return the Elasticsearch response. This response would otherwise be cluttered with a lot of errors due to missing paths. This can be omitted for these kinds of searches.
Search images which were taken in specific areas
GPS information from images will automatically be extracted and added to the search index. It is possible to run a geo search to locate images within a specific area.
{
  "query": {
    "bool" : {
      "must" : {
        "match_all" : {}
      },
      "filter" : {
        "geo_bounding_box" : {
          "fields.binary.metadata.location" : {
            "top_left" : {
              "lat" : 50.0,
              "lon" : 10.0
            },
            "bottom_right" : {
              "lat" : -40.0,
              "lon" : 19.0
            }
          }
        }
      }
    }
  }
}
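A bounding box is easy to specify with swapped corners, so it can help to validate the coordinates before sending the query. The sketch below builds the query above from two (lat, lon) pairs; the function name is hypothetical and the default field path assumes a binary field named binary, as in the example.

```python
def images_within(top_left, bottom_right, field="fields.binary.metadata.location"):
    """Build a geo_bounding_box filter query from two (lat, lon) corner pairs."""
    top, left = top_left
    bottom, right = bottom_right
    # The top edge must not lie south of the bottom edge.
    if not -90.0 <= bottom <= top <= 90.0:
        raise ValueError("latitudes must satisfy -90 <= bottom <= top <= 90")
    for lon in (left, right):
        if not -180.0 <= lon <= 180.0:
            raise ValueError("longitudes must lie within -180 and 180")
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": {"lat": top, "lon": left},
                            "bottom_right": {"lat": bottom, "lon": right},
                        }
                    }
                },
            }
        }
    }


geo_query = images_within((50.0, 10.0), (-40.0, 19.0))
print(geo_query["query"]["bool"]["filter"])
```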
Endpoint: /api/v2/search/projects
Endpoint: /api/v2/search/tags
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": {
            "match_phrase": {
              "tags.name": "Twinjet"
            }
          }
        }
      }
    }
  }
}
Endpoint: /api/v2/search/tagFamilies
{
  "query": {
    "nested": {
      "path": "tagFamilies.colors.tags",
      "query": {
        "match": {
          "tagFamilies.colors.tags.name": "red"
        }
      }
    }
  }
}
Endpoint: /api/v2/search/schemas
Endpoint: /api/v2/search/microschemas
The paging query parameters are perPage and page. It is important to note that page is 1-based and perPage can be set to 0 in order to just retrieve a count of elements.
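The paging rules can be captured in a small helper that rejects invalid values before a request is made. This is a sketch with hypothetical function names; the default of 10 for perPage follows the default value mentioned earlier in this document.

```python
import urllib.parse


def paging_params(page=1, per_page=10):
    """Return the paging query parameters after checking their ranges."""
    if page < 1:
        raise ValueError("page is 1-based")
    if per_page < 0:
        raise ValueError("perPage must not be negative")
    return {"page": page, "perPage": per_page}


def count_only_params():
    """Parameters for a request that should only return the element count."""
    return paging_params(page=1, per_page=0)


print(urllib.parse.urlencode(paging_params(2, 25)))
print(urllib.parse.urlencode(count_only_params()))
```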
Additionally, it is also possible to use the /api/v2/rawSearch or /api/v2/:projectName/rawSearch endpoints.
These endpoints will accept the same query but return an Elasticsearch multi search response instead of the typical Gentics Mesh list response. This is useful if you want to use, for example, the Elasticsearch highlighting and aggregation features. The endpoint will automatically select the needed indices and modify the query in order to add the needed permission checks.
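As an illustration of a raw search body, the following sketch adds a highlight section and a terms aggregation to an existing query. The highlight and aggs sections use standard Elasticsearch request syntax; the helper name and the field names fields.content and schema.name.raw are assumptions and need to be replaced with fields that exist in your index.

```python
import json


def with_highlight_and_terms(query_body, highlight_field, terms_field, agg_name="by_term"):
    """Return a copy of the query body extended by highlighting and a terms aggregation."""
    body = dict(query_body)
    # Ask Elasticsearch to return highlighted fragments for one field.
    body["highlight"] = {"fields": {highlight_field: {}}}
    # Count the hits per distinct value of another field.
    body["aggs"] = {agg_name: {"terms": {"field": terms_field}}}
    return body


base = {"query": {"match": {"fields.content": "solar"}}}
raw_body = with_highlight_and_terms(base, "fields.content", "schema.name.raw")
print(json.dumps(raw_body, indent=2))
```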
The POST /api/v2/search/sync endpoint can be used to invoke a manual sync of the search index.
The index sync operation will automatically be invoked when Mesh is started and an unclean shutdown has been detected.
You can also recreate all indices if needed via the POST /api/v2/search/clear endpoint.
This operation will remove all indices which have been created by Mesh and rebuild them one at a time.
The index sync is a memory-intensive operation. The search.syncBatchSize setting can be used to control how much memory will be utilized during the synchronization. 250 MB of heap memory is sufficient to index 250,000 elements using the differential sync mechanism.
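Both maintenance endpoints are plain POST requests without a body. The sketch below only prepares the request objects; the base URL http://localhost:8080 and the helper name are assumptions, and the authentication that such calls would normally need is omitted.

```python
import urllib.request


def index_maintenance_request(base_url, action):
    """Build (but do not send) the POST request for an index sync or clear."""
    if action not in ("sync", "clear"):
        raise ValueError("action must be 'sync' or 'clear'")
    url = f"{base_url.rstrip('/')}/api/v2/search/{action}"
    # An empty body together with method="POST" yields a bodiless POST request.
    return urllib.request.Request(url, data=b"", method="POST")


sync_request = index_maintenance_request("http://localhost:8080", "sync")
print(sync_request.get_method(), sync_request.full_url)
```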
Document uploads to Gentics Mesh will automatically be parsed and the contained text will be extracted. This information will also be added to the search index and thus it is possible to search for text within uploaded documents.
You can disable this feature in the global options or per binary field. See the file upload documentation for more information.
Currently uploads which have one of these mimetypes will be processed:
application/pdf
application/msword
text/rtf
application/vnd.ms-powerpoint
application/vnd.oasis.opendocument.text
text/plain
application/rtf
Example binary field within document:
…
"binaryField" : {
"filename" : "mydoc.pdf",
"sha512sum" : "16d3aeae9869d2915dda30866c2d7b77f50dc668daa3a49d2bc6eb6349cf6e895099349b7f8240174a788db967c87947b6a2fd41a353eec99a20358dfd4c9211",
"mimeType" : "application/pdf",
"filesize" : 200,
"file" : {
"content" : "Lorem ipsum dolor sit amet"
}
}
…
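A query for the extracted text could then target the file.content property of the binary field. The sketch below assumes that the binary field is stored below fields.<field name> in the node document, in line with the fields.binary.metadata.location key used in the geo search example; the function name is hypothetical.

```python
def binary_content_query(field_name, text):
    """Build a match query on the extracted text of an uploaded document."""
    # Assumption: binary fields are stored below "fields.<field name>".
    return {"query": {"match": {f"fields.{field_name}.file.content": text}}}


content_query = binary_content_query("binaryField", "Lorem ipsum")
print(content_query)
```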
It is possible to nest Elasticsearch queries within the GraphQL query in order to filter elements.
See GraphQL examples.
The following section contains document examples which are useful when creating queries. Gentics Mesh transforms elements into these documents which then can be stored within Elasticsearch.
User document:
{
"uuid" : "589319933be24ec79319933be24ec7fe",
"creator" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"created" : null,
"editor" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"edited" : null,
"username" : "joe1",
"emailaddress" : "joe1@nowhere.tld",
"firstname" : "Joe",
"lastname" : "Doe",
"groups" : {
"name" : [ "editors", "superEditors" ],
"uuid" : [ "df81c23d9ff1450081c23d9ff195005e", "df81c23d9ff1450081c23d9ff195005e" ]
},
"_roleUuids" : [ ],
"version" : "ab7d5eb7"
}
Group document:
{
"name" : "adminGroup",
"uuid" : "df81c23d9ff1450081c23d9ff195005e",
"_roleUuids" : [ ],
"version" : "ffb6a2da"
}
Role document:
{
"name" : "adminRole",
"uuid" : "ddcddceb33d648318ddceb33d618314f",
"creator" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"created" : null,
"editor" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"edited" : null,
"_roleUuids" : [ ],
"version" : "27844082"
}
Node document:
{
"uuid" : "adaf48da8c124049af48da8c12a0493e",
"editor" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"edited" : "1970-01-01T00:00:00Z",
"creator" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"created" : "1970-01-01T00:00:00Z",
"project" : {
"name" : "dummyProject",
"uuid" : "d3b1840b01464b54b1840b0146cb54f5"
},
"tags" : {
"name" : [ "green", "red" ],
"uuid" : [ "6e6f1d9f055447d2af1d9f055417d289", "6e6f1d9f055447d2af1d9f055417d289" ]
},
"tagFamilies" : {
"colors" : {
"uuid" : "f36400b436d244c3a400b436d244c3f4",
"tags" : [ {
"name" : "green",
"uuid" : "6e6f1d9f055447d2af1d9f055417d289"
}, {
"name" : "red",
"uuid" : "6e6f1d9f055447d2af1d9f055417d289"
} ]
}
},
"_roleUuids" : [ ],
"parentNode" : {
"uuid" : "adaf48da8c124049af48da8c12a0493e"
},
"language" : "de",
"schema" : {
"name" : "content",
"uuid" : "2aa83a2b3cba40a1a83a2b3cba90a1de",
"version" : null
},
"fields" : {
"date" : 1542746513,
"string" : "The name value",
"htmlList" : [ "somehtml", "somehtml", "somehtml" ],
"nodeList" : [ "adaf48da8c124049af48da8c12a0493e", "adaf48da8c124049af48da8c12a0493e", "adaf48da8c124049af48da8c12a0493e" ],
"number" : 0.146,
"node" : "adaf48da8c124049af48da8c12a0493e",
"boolean" : true,
"stringList" : [ "The name value", "The name value", "The name value" ],
"micronode" : {
"microschema" : {
"name" : null,
"uuid" : null
},
"fields-null" : {
"latitude" : 16.373063840833,
"longitude" : 16.373063840833
}
},
"html" : "somehtml",
"numberList" : [ 0.146, 0.146, 0.146 ],
"booleanList" : [ "true", "true", "true" ],
"dateList" : [ 1542746513, 1542746513, 1542746513 ]
},
"displayField" : {
"key" : "string",
"value" : null
},
"branchUuid" : "c5ac82fa1a9c43b6ac82fa1a9ca3b61c",
"version" : "cbfd7867"
}
Project document:
{
"name" : "dummyProject",
"uuid" : "d3b1840b01464b54b1840b0146cb54f5",
"creator" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"created" : null,
"editor" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"edited" : null,
"_roleUuids" : [ ],
"version" : "6714dafb"
}
Tag document:
{
"name" : "red",
"uuid" : "6e6f1d9f055447d2af1d9f055417d289",
"creator" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"created" : null,
"editor" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"edited" : null,
"_roleUuids" : [ ],
"tagFamily" : {
"name" : "colors",
"uuid" : "f36400b436d244c3a400b436d244c3f4"
},
"project" : {
"name" : "dummyProject",
"uuid" : "d3b1840b01464b54b1840b0146cb54f5"
},
"version" : "7df128ce"
}
Tag family document:
{
"name" : "colors",
"uuid" : "f36400b436d244c3a400b436d244c3f4",
"creator" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"created" : null,
"editor" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"edited" : null,
"tags" : {
"name" : [ "red", "green" ],
"uuid" : [ "6e6f1d9f055447d2af1d9f055417d289", "6e6f1d9f055447d2af1d9f055417d289" ]
},
"project" : {
"name" : "dummyProject",
"uuid" : "d3b1840b01464b54b1840b0146cb54f5"
},
"_roleUuids" : [ ],
"version" : "7f598ceb"
}
Microschema document:
{
"uuid" : "2715307deafc4ecc95307deafccecc10",
"creator" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"created" : null,
"editor" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"edited" : null,
"name" : "geolocation",
"_roleUuids" : [ ],
"version" : "ffb6a2da"
}
Schema document:
{
"name" : "content",
"description" : "Content schema",
"uuid" : "2aa83a2b3cba40a1a83a2b3cba90a1de",
"creator" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"created" : null,
"editor" : {
"uuid" : "589319933be24ec79319933be24ec7fe"
},
"edited" : null,
"_roleUuids" : [ ],
"version" : "4bc088df"
}
The index settings for nodes can be configured within the schema JSON. Additionally, it is also possible to add extra mappings to fields.
This may be desired when a field needs to be analyzed in a special way or a keyword field must be added.
An example of such a schema can be seen below. This schema contains an additional tokenizer and additional analyzers which can be used to set up an index that is ready to be used for a full-text search which supports autocompletion and auto suggestion.
{
"container": false,
"name": "CustomSchema",
"elasticsearch": {
"analysis": {
"filter": {
"my_stop": {
"type": "stop",
"stopwords": "_english_"
},
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 20
}
},
"tokenizer": {
"basicsearch": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 10,
"token_chars": [
"letter"
]
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"char_filter": [
"html_strip"
],
"filter": [
"lowercase",
"my_stop",
"autocomplete_filter"
]
},
"basicsearch": {
"tokenizer": "basicsearch",
"char_filter": [
"html_strip"
],
"filter": [
"my_stop",
"lowercase"
]
},
"basicsearch_search": {
"char_filter": [
"html_strip"
],
"tokenizer": "lowercase"
}
}
}
},
"fields": [
{
"name": "content",
"required": false,
"elasticsearch": {
"basicsearch": {
"type": "text",
"analyzer": "basicsearch",
"search_analyzer": "basicsearch_search"
},
"suggest": {
"type": "text",
"analyzer": "simple"
},
"auto": {
"type": "text",
"analyzer": "autocomplete"
}
},
"type": "string"
}
]
}
Custom mappings can currently only be specified for the following types:
string fields
html fields
string list fields
html list fields
binary fields
Index settings for other elements (e.g. users, roles) can currently not be configured.
Gentics Mesh currently supports two ways of managing mappings for your content fields.
The mode can be controlled via the search.mappingMode option or the MESH_ELASTICSEARCH_MAPPING_MODE environment variable.
The mapping mode setting influences how Elasticsearch mappings for fields will be generated.
DYNAMIC (default)
In this mode Gentics Mesh will automatically create default mappings for any content field.
The custom mapping which can be added via the elasticsearch parameter will be used to supplement the default mapping.
STRICT
In the strict mode Gentics Mesh will not generate default mappings. Only field mappings which have been specified in the elasticsearch property will be added to the index mapping.
Please note that your custom mapping will directly replace the field mapping in this mode. If you switch modes you need to adapt your elasticsearch property value. The setting will no longer be added as a nested element.
This mode is useful if you want to have finer control of what contents should be added to Elasticsearch.
Fields which have no mapping will by default also not be added to the source document.
Example:
{
"displayField": "name",
"container": true,
"name": "category",
"fields": [
{
"name": "name",
"label": "Name",
"required": true,
"type": "string"
},
{
"name": "description",
"label": "Description",
"required": false,
"type": "string",
"elasticsearch": {
"type": "keyword",
"index": false,
"fields": {
"search": {
"type": "text",
"analyzer": "trigrams"
}
}
}
}
]
}
You can use the POST /api/v2/utilities/validateSchema endpoint to validate your schema and check what index mapping is actually being generated.
If you just want to add your field to the source document you can create a mapping with "index": false.
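The difference between the two mapping modes can be summarised by asking which fields of a schema receive a mapping at all. The following sketch expresses the rule described above for the category schema; the function is a simplified illustration and not the mapping generator used by Gentics Mesh.

```python
def fields_with_mapping(schema, mode):
    """Return the names of the schema fields that receive an index mapping."""
    if mode == "DYNAMIC":
        # Every content field gets a default mapping, custom ones supplement it.
        return [field["name"] for field in schema["fields"]]
    if mode == "STRICT":
        # Only fields that carry their own elasticsearch property are mapped.
        return [field["name"] for field in schema["fields"] if "elasticsearch" in field]
    raise ValueError("mode must be 'DYNAMIC' or 'STRICT'")


category = {
    "name": "category",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "description", "type": "string", "elasticsearch": {"type": "keyword", "index": False}},
    ],
}
print(fields_with_mapping(category, "DYNAMIC"))
print(fields_with_mapping(category, "STRICT"))
```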
It is also possible to define different settings / mappings for different languages. This is useful for language-specific stemmers and stop word lists. To do so, add the _meshLanguageOverride property to the elasticsearch object.
In this object you can override settings / mappings for each language. You can also provide the same setting / mapping for multiple languages by using a comma-separated list of languages as the key.
For each overridden language, Gentics Mesh will create additional indices in Elasticsearch. This might affect the resources used by Elasticsearch.
Node contents of that language will be saved in that index. When a node content is created with a language that is not overridden, the node content will be saved in the default index. The settings / mappings of the default index can be configured as shown in the example above.
Here is an example using this feature:
{
"name": "page",
"elasticsearch": {
"_meshLanguageOverride": {
"de": {
"analysis": {
"analyzer": {
"my_stop_analyzer": {
"type": "stop",
"stopwords": "_german_"
}
}
}
},
"ja,zh,ko": {
"analysis": {
"analyzer": {
"my_stop_analyzer": {
"type": "stop",
"stopwords": ["_english_", "_cjk_"]
}
}
}
}
},
"analysis": {
"analyzer": {
"my_stop_analyzer": {
"type": "stop",
"stopwords": "_english_"
}
}
}
},
"fields": [
{
"name": "title",
"type": "string",
"elasticsearch": {
"basicsearch": {
"type": "text",
"analyzer": "my_stop_analyzer"
}
}
},
{
"name": "content",
"type": "string",
"elasticsearch": {
"_meshLanguageOverride": {
"fr": {
"basicsearch": {
"type": "text",
"analyzer": "standard"
}
}
},
"basicsearch": {
"type": "text",
"analyzer": "my_stop_analyzer"
}
}
}
]
}
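How a language is matched against the override keys, including comma-separated keys, can be sketched as follows. The lookup returns the override block for a matching language and the default settings otherwise. Treating an override as a full replacement of the default settings is an assumption of this sketch, as is the function name.

```python
def settings_for_language(elasticsearch_config, language):
    """Pick the settings block that applies to node contents of one language."""
    overrides = elasticsearch_config.get("_meshLanguageOverride", {})
    for key, override in overrides.items():
        # A key such as "ja,zh,ko" applies to each of the listed languages.
        if language in [part.strip() for part in key.split(",")]:
            return override
    # No override: use everything except the override section itself.
    return {k: v for k, v in elasticsearch_config.items() if k != "_meshLanguageOverride"}


def stop_analyzer(stopwords):
    return {"analysis": {"analyzer": {"my_stop_analyzer": {"type": "stop", "stopwords": stopwords}}}}


page_settings = {
    "_meshLanguageOverride": {
        "de": stop_analyzer("_german_"),
        "ja,zh,ko": stop_analyzer(["_english_", "_cjk_"]),
    },
    **stop_analyzer("_english_"),
}
print(settings_for_language(page_settings, "ko"))
```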
For binary fields, you can add custom mappings to the mimeType and file.content fields. The elasticsearch property needs to contain a custom mapping for each of these fields.
Example:
{
"displayField": "name",
"segmentField": "binary",
"container": false,
"description": "Image schema",
"name": "image",
"fields": [
{
"name": "name",
"label": "Name",
"required": false,
"type": "string"
},
{
"name": "binary",
"label": "Image",
"required": false,
"type": "binary",
"elasticsearch": {
"mimeType": {
"raw": {
"type": "keyword",
"index": true
}
},
"file.content": {
"raw": {
"type": "keyword",
"index": true
}
}
}
}
]
}
By default, all (micro)schemas in Mesh are subject to indexing. This may be unwanted when storing sensitive data that should never leave Mesh. For this case, every schema, every microschema and each of their field definitions has a noIndex flag, which excludes the target from the indexing process and removes already existing indices. Use the Mesh REST API or the Java client to manage the noIndex flag on schemas, microschemas and fields.
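Before a schema update is sent via the REST API, the flag could be set on the schema JSON as in the sketch below. The position of noIndex as a top-level property of the schema and of each field definition is an assumption based on the description above, and the helper name is hypothetical.

```python
import copy


def exclude_from_index(schema, field_names=None):
    """Return a copy of the schema with noIndex set on it or on selected fields."""
    updated = copy.deepcopy(schema)
    if field_names is None:
        # Exclude the whole schema from indexing.
        updated["noIndex"] = True
    else:
        for field in updated["fields"]:
            if field["name"] in field_names:
                field["noIndex"] = True
    return updated


customer = {
    "name": "customer",
    "fields": [{"name": "name", "type": "string"}, {"name": "iban", "type": "string"}],
}
print(exclude_from_index(customer, ["iban"]))
```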