3 Mapping - Reference Documentation

Authors: Noam Y. Tenne, Manuarii Stein, Stephane Maldini, Serge P. Nekoval, Marcos Carceles

Version: 0.0.4.6

3 Mapping

From version 0.0.4.0, in addition to the indices the plugin generates from the domain objects' package names or from configuration, two aliases are created for every index: <indexName>_read and <indexName>_write. The plugin uses these two aliases to index into and query Elasticsearch. They centralise the choice of index during mapping migrations when the 'alias' strategy is used and there are multiple instances of the application.

3.1 QuickStart

Default mapping

To declare a domain class to be searchable, the simplest way is to define the following static property in the code:
static searchable = true
The plugin will generate a default mapping for each property of the domain class.

Custom mapping

You can customize how each property is mapped to the index using a closure. The syntax is similar to GORM's mapping DSL.
static searchable = {
    // mapping DSL…
}
See below for more details on the mapping DSL.

Limit properties with only/except

only and except limit the properties that are made searchable. You may not define both at the same time.

The following code will only map the 'message' property, any others will be ignored.

class Tweet {
    static searchable = {
        only = 'message'
    }
    String message
    String someUselessField
}

The following code will map all properties except the one specified.

class Tweet {
    static searchable = {
        except = 'someUselessField'
    }
    String message
    String someUselessField
}

You can use a Collection to specify several properties.

class Tweet {
    static searchable = {
        except = ['someUselessField', 'userName']
    }
    String message
    String userName
    String someUselessField
}

Ignored properties are not sent to ElasticSearch. This also means that when a domain object is retrieved from ElasticSearch, some fields that are not supposed to be null may still be null.

Including transients

How the plugin manages transient properties is controlled by the elasticSearch.includeTransients configuration property. If it is set to false, only transient properties explicitly included in only will be mapped and searchable. If it is set to true, all domain class properties will be mapped, including transients.

The following are valid examples:
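
As a minimal sketch, assuming the application keeps its settings in a Groovy configuration file (the file location named in the comment is an assumption about the project layout, not something this guide specifies), the property could be set as follows:

```groovy
// Assumed location: a Groovy config file such as grails-app/conf/Config.groovy
// false: only transients explicitly listed in 'only' are mapped
// true:  all domain class properties, including transients, are mapped
elasticSearch.includeTransients = false
```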

//assert grailsApplication.config.elasticSearch.includeTransients == false
class Person {
    String firstName
    String lastName
    String getFullName() {
        firstName + " " + lastName
    }
    static transients = ['fullName']
    static searchable = {
        only = ['fullName']
    }
}
// new Person(firstName: "Nikola", lastName: "Tesla")
// can be found using:
// def tesla = Person.search("Nikola Tesla").searchResults.first()

//assert grailsApplication.config.elasticSearch.includeTransients == true
class Multiplication {
    int opA
    int opB
    int getResult() {
        opA * opB
    }
    static transients = ['result']
    static searchable = true
}
// new Multiplication(opA: 2, opB: 3)
// can be found using:
// def multiplication = Multiplication.search("2").searchResults.first()
// def multiplication = Multiplication.search("3").searchResults.first()
// def multiplication = Multiplication.search("6").searchResults.first()

In the examples above, once the domain object is found, its transient values are calculated from the information stored on ElasticSearch: multiplication.result == 6, but tesla.fullName == "null null", as firstName and lastName were not indexed. This behaviour can be prevented by creating suitable setters for the transient properties.
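
A minimal sketch of such a setter, written as a plain Groovy class so it can be tried outside Grails. The splitting rule (the first space separates first name from last name) is an assumption made for illustration; a real domain class would need whatever rule matches how the transient value is built.

```groovy
class Person {
    String firstName
    String lastName

    String getFullName() {
        firstName + " " + lastName
    }

    // Hypothetical setter: restores firstName and lastName from the
    // transient value, so fullName is no longer "null null" after retrieval.
    // Assumption: the first space separates the two names.
    void setFullName(String fullName) {
        def parts = (fullName ?: '').split(' ', 2)
        firstName = parts[0] ?: null
        lastName = parts.length > 1 ? parts[1] : null
    }

    static transients = ['fullName']
    static searchable = {
        only = ['fullName']
    }
}
```

With this setter in place, populating fullName with "Nikola Tesla" also fills firstName and lastName, so the getter returns the original value.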

Transients and collections

When transient properties are collections, the only way the plugin can define the correct ElasticSearch mapping during boot is if the element types are explicitly defined on the Grails domain object. For instances of Collection this can be achieved by defining the element type in the hasMany property (otherwise the ElasticSearch type will be defined as object). This is not required for arrays.

Some valid examples:

class Tweet {
    String message
    List getHashtags() { … }
    static transients = ['hashtags']
    static hasMany = [hashtags: String]
    static searchable = { only = 'hashtags' }
}

class FamilyGuy {
    String wife
    String son
    String daughter
    String baby
    String[] getRelatives() { … }
    static transients = ['relatives']
    static searchable = { only = 'relatives' }
}

3.2 Class Mapping

root

Determines whether the domain class will have its own index. Takes a boolean as a parameter and is set to true by default.
class Preference {
    static searchable = {
        root false
    }
    // …
}

class Tag {
    static searchable = true
    // …
}

class Tweet {
    static searchable = {
        message boost:2.0
    }
    // …
}

In this code, the classes Tweet and Tag will have their own index. The class Preference will not. This also means that a search request will never return a Preference-type hit, and the dynamic method search will not be injected into the Preference domain class. Domains that are not root-mapped can still be considered searchable, as they can be components of another domain which is root-mapped. For example, consider the following domain:
class User {
    static searchable = {
        userPreferences component:true
    }

    Preference userPreferences
}

When searching, any matches in the userPreferences property will be considered as a User match.

all

Set default analyzer for all domain class fields.

static searchable = {
  all = [analyzer: 'russian_morphology']
}

static searchable = {
  all = false
}

When disabling the all field, it is a good practice to set index.query.default_field to a different value (for example, if you have a main 'message' field in your data, set it to message).

3.3 Properties mapping

You can customize the mapping of each domain property using the mapping closure. The syntax is simple:
static searchable = {
    propertyName option1:value, option2:value, …
}

Available options

Option name | Values | Description
------------|--------|------------
boost | Number | A decimal boost value. With a positive value, promotes search results for hits in this property; with a negative value, demotes search results that hit this property.
component | true, false | To use only on a domain (or collection of domains). Makes the property a searchable component.
converter | A Class | A Class to use as a converter during the marshalling/unmarshalling process for that particular property. That class must extend the PropertyEditorSupport Java class.
excludeFromAll | true, false | Determines whether the property is appended to the "_all" field. Defaults to true.
index | "no", "not_analyzed", "analyzed" | How or if the property is made into a searchable element.
reference | true, false | To use only on a domain (or collection of domains). Makes the property a searchable reference.
parent | true, false | To be used in conjunction with the reference or component option. Set to true if the referenced field should be mapped as the parent of this document. Defaults to false.
multi_field | true, false | Maps the value of the field twice: once analyzed, and once not_analyzed under untouched. Defaults to false.
geoPoint | true, false | Maps the field to a geo_point. Defaults to false.
alias | String | The field marked with this parameter will be duplicated to an alias.
dynamic | true, false | Only available for String properties. Determines whether this field should be dynamically mapped by Elasticsearch.
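
To illustrate how options from the table combine, here is a short sketch that reuses the Tweet class from the earlier examples. The choice of options per property is illustrative only, not a recommended configuration.

```groovy
class Tweet {
    String message
    String userName

    static searchable = {
        // Promote hits found in the message text
        message boost: 2.0
        // Index the user name as a single, non-analyzed term
        userName index: 'not_analyzed'
    }
}
```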

3.3.1 Parent-Child

To map a parent/child relationship, the child element must either contain the parent element as a component or reference it as a referenced document. This component must be mapped as a parent in the child element.

Example

class ParentElement {
    …
}

class EmbeddingChild {
    ParentElement parentElement

    static searchable = {
        parentElement parent: true, component: true
    }
}

class ReferencingChild {
    ParentElement parentElement

    static searchable = {
        parentElement parent: true, reference: true
    }
}

3.3.2 GeoPoint

A geographic location can be mapped to a geo_point. The longitude field must be named lon and the latitude field must be named lat.

Example

class GeoPoint {
    Double lat
    Double lon

    static searchable = {
        root false
    }
}

class Building {
    String name
    GeoPoint location

    static searchable = {
        location geoPoint: true, component: true
    }
}

3.3.3 Alias

A field can be aliased. This is useful in situations where another service may expect certain tags. For example, Kibana uses an @timestamp field to filter report records by date.

Example

class Session {
    Date loginTime

    static searchable = {
        loginTime alias:'@timestamp'
    }
}

3.3.4 Dynamic

Elasticsearch can map field contents as dynamic objects. This is especially useful if you store JSON Strings in your database and want to make those objects searchable in elasticsearch.

Example

class Session {
    String jsonData

    static searchable = {
        // dynamic is a per-property option, so it is applied to jsonData
        jsonData dynamic: true
    }
}

Session session = new Session()
session.jsonData = ([foo: 'bar'] as JSON).toString()

With the default mapping, jsonData would be an escaped String field and a search for jsonData.foo = bar would return no results. Enabling dynamic mapping for this field turns on JSON handling for it and tells Elasticsearch to map the field dynamically, so a search for jsonData.foo=bar returns a hit. Attention: this only works on String fields and results in an error if the String is not valid JSON.

3.4 Searchable Component-Reference

The plugin supports searchable-component and searchable-reference behavior similar to that of Compass when dealing with domain associations. See below for the difference between the two mapping modes.

3.4.1 Searchable Reference

The searchable-reference mapping mode is the default mode used for association, and requires the searchable class of the association to be root-mapped in order to have its own index. With this mode, the associated domains are not completely marshalled in the resulting JSON document: only the id and the type of the instances are kept. When the document is retrieved from the index, the plugin will automatically rebuild the association from the indices using the stored id.

Example

class MyDomain {
    // odom is an association with the OtherDomain class, set as a reference
    OtherDomain odom

    static searchable = {
        odom reference:true
    }
}

// The OtherDomain definition, with default searchable configuration
class OtherDomain {
    static searchable = true

    String field1 = "val1"
    String field2 = "val2"
    String field3 = "val3"
    String field4 = "val4"
}

When indexing an instance of MyDomain, the following JSON documents will be sent to ElasticSearch:

{
    "mydomain": {
        "_id":1,
        "odom": { "id":1 }
    }
}

{
    "otherdomain": {
        "_id":1,
        "field1":"val1",
        "field2":"val2",
        "field3":"val3",
        "field4":"val4"
    }
}

3.4.2 Searchable Component

The searchable-component mapping mode must be explicitly set, and does not require the searchable class of the association to be root-mapped. With this mode, the associated domains are nested in the parent document.

Example

class MyDomain {
    // odom is an association with the OtherDomain class, set as a component
    OtherDomain odom

    static searchable = {
        odom component:true
    }
}

// The OtherDomain definition, with default searchable configuration
class OtherDomain {
    static searchable = true

    String field1 = "val1"
    String field2 = "val2"
    String field3 = "val3"
    String field4 = "val4"
}

When indexing an instance of MyDomain, the following JSON document will be sent to ElasticSearch:

{
    "mydomain": {
        "_id":1,
        "odom": {
            "_id":1,
            "field1":"val1",
            "field2":"val2",
            "field3":"val3",
            "field4":"val4"
        }
    }
}

If you'd rather that the reference object be mapped with type 'inner' rather than the default 'nested', set the 'component' key with a value of 'inner' rather than 'true':

class MyDomain {
    // odom is an association with the OtherDomain class, set as an 'inner' component
    OtherDomain odom

    static searchable = {
        odom component: 'inner'
    }
}

3.5 Mapping Migrations

During startup, the application attempts to create the needed indices on Elasticsearch and the type mappings defined by the user. If these indices and mappings already exist on the Elasticsearch cluster (i.e. an older version of the application was running against it) and the new mapping definitions differ from the existing ones, a mapping conflict can occur. This section describes how to configure the application to deal with this scenario.

Not all type mapping changes result in a conflict. For example, adding a new field to a mapping does not result in a conflict, whilst changing a property from component:'inner' to nested, or vice versa, does. The migration strategies are only needed and applied when a conflicting mapping is found.

Migration Strategies

The migration strategy is defined by the elasticSearch.migration.strategy configuration property and it accepts three values:
  • 'none'
  • 'delete'
  • 'alias'

The default strategy is 'alias', as it is the only strategy that can achieve zero-downtime migrations and is therefore the one recommended by Elasticsearch.

These values are described in more detail below.
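
As a sketch, assuming the property is set in a Groovy configuration file (the file location is an assumption about the project layout), selecting a strategy could look like this:

```groovy
// Assumed location: a Groovy config file such as grails-app/conf/Config.groovy
// Accepted values according to this section: 'none', 'delete', 'alias'
elasticSearch.migration.strategy = 'alias'
```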

Migration Strategy 'none'

This option keeps the behaviour the plugin had before the migration strategies were implemented. When a mapping merge conflict is identified, the event will be logged and an exception will be thrown. It is then the responsibility of the application administrator to fix the problem manually. This option was kept for backwards compatibility. Because it prevents the application from booting successfully, we discourage teams from using it.

Migration Strategy 'delete'

When this option is chosen and a conflict occurs while installing a mapping, the application deletes the existing mapping for the type, along with all content indexed on that index and type, and recreates the mapping. There are a few important details to note:

  • Only documents indexed on the conflicting mapping will be deleted; any other document on a different mapping on the same (or another) index will remain untouched.
  • Deleted documents can be automatically reindexed on startup by using the elasticSearch.bulkIndexOnStartup configuration property (see below).
  • With this configuration there will always be a time window (between deletion and reindexing) in which documents cannot be found by search, so this option cannot achieve a zero-downtime deployment.

See Dealing with deleted content below for more details on automatic indexing.

Migration Strategy 'alias'

This is the migration strategy recommended by Elasticsearch.

To better understand this strategy we will describe a typical 'alias' migration.

Elasticsearch contains
  index 'myapplication.store_v27' with types 'car' and 'motorbike'
  alias 'myapplication.store' pointing to 'myapplication.store_v27'
  'myapplication.store_v27/car' contains 520 documents
  'myapplication.store_v27/motorbike' contains 12 documents
  index 'myapplication.admin_v0' with type 'quote'
  alias 'myapplication.admin' pointing to 'myapplication.admin_v0'
  'myapplication.admin_v0/quote' contains 3200 documents

The application is configured to use indexes based on the package names 'myapplication.store' and 'myapplication.admin' (which, as already explained, are actually aliases that point to versioned indices).

The team introduced a change on the Motorbike domain that results in a conflict on the 'motorbike' mapping.

The application starts up
  Tries to install the mapping for 'motorbike', it detects the conflict
  Creates a new index called 'myapplication.store_v28'
  Creates mappings 'myapplication.store_v28/car' and 'myapplication.store_v28/motorbike'
  Points all indexing requests for Car and Motorbike to the new index, while queries still happen on 'myapplication.store'
  On Bootstrap (bulkIndexOnStartup)
    It indexes 520 cars into 'myapplication.store_v28/car'
    It indexes 12 motorbikes into 'myapplication.store_v28/motorbike'
  Switches the 'myapplication.store' alias to point to 'myapplication.store_v28'
    Now all cars are indexed according to the new mapping
    Now all motorbikes are indexed according to the new mapping

All content can be queried at all times. While bulkIndexOnStartup runs during Bootstrap, content is retrieved from the old index.

Even though there wasn't a conflict on 'car', all cars needed to be reindexed as they lived on the same index.

There are three potential scenarios when using the 'alias' strategy:

Scenario | Behaviour
---------|----------
The index (i.e. 'myapplication.store') does not exist | In this case there is no possibility of conflicts, as no previous mapping exists. However, the application will behave slightly differently than in the other two scenarios. Instead of creating the index (i.e. 'myapplication.store'), it will create version 0 of it (i.e. 'myapplication.store_v0') and an alias pointing to it. This is to facilitate the creation of future versions in case of conflict.
Alias exists pointing to a version (i.e. 'myapplication.store' -> 'myapplication.store_v27') | If there's a conflict on a mapping on the index, it will create a new version (i.e. 'myapplication.store_v28'), reindex the content or not depending on the value of the elasticSearch.bulkIndexOnStartup configuration property, and point the alias to the new version once done.
Index already exists (i.e. 'myapplication.store') | Elasticsearch cannot rename an index or create an alias with the same name as an index. The two alternatives here are to delete the index or fail the migration. This is controlled by the elasticSearch.migration.aliasReplacesIndex configuration property. If set to true, it will delete the index and proceed the same way as when the index did not exist. The deleted documents will be reindexed or not depending on the value of elasticSearch.bulkIndexOnStartup. This is the only scenario where there is content loss/downtime using the 'alias' strategy.
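
The version naming used in the first two scenarios can be sketched in plain Groovy. The helper below is hypothetical and is not part of the plugin's API; it only illustrates the naming rule described above (version 0 when no versioned index exists, otherwise the highest existing version plus one). It does not cover the third scenario, where a plain index already uses the alias name.

```groovy
import java.util.regex.Pattern

// Hypothetical helper for illustration only; not part of the plugin's API.
String nextVersionedIndex(String alias, Collection<String> existingIndices) {
    // Matches names of the form <alias>_v<N> and captures N
    def pattern = Pattern.quote(alias) + '_v(\\d+)'
    def versions = existingIndices.findResults { String name ->
        def matcher = name =~ pattern
        matcher.matches() ? matcher.group(1).toInteger() : null
    }
    // No versioned index yet: start at 0; otherwise increment the highest version
    int next = versions ? versions.max() + 1 : 0
    return "${alias}_v${next}"
}
```

Given the indices from the walkthrough above ('myapplication.store_v27' and 'myapplication.admin_v0'), calling the helper with 'myapplication.store' yields 'myapplication.store_v28'; with no existing indices it yields 'myapplication.store_v0'.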

If you want to create a new version of an index without changing where the alias points (i.e. for testing, or to perform extra tasks on the index before updating the alias), use the elasticSearch.migration.disableAliasChange configuration property.
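
A sketch of the two 'alias' strategy switches described in this section, assuming they are set in a Groovy configuration file and take boolean values (the text only states that aliasReplacesIndex can be set to true; the boolean value for disableAliasChange is an assumption):

```groovy
// Assumed location: a Groovy config file such as grails-app/conf/Config.groovy
// Delete an existing index that has the alias name, then proceed as if it did not exist
elasticSearch.migration.aliasReplacesIndex = true
// Assumption: a boolean that keeps the alias on the old index version
elasticSearch.migration.disableAliasChange = true
```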

Aliases will only point to the new version of the index once all content is reindexed (if chosen to). Meanwhile, all index requests, either by elasticSearchService or using dynamic finders will go to the new version of the index, whilst queries will go to the old version of the index.

See Dealing with deleted content below for more details on automatic indexing.

Dealing with deleted content

Using the 'delete' or 'alias' strategy may lead to deleting content stored on Elasticsearch. This content can be automatically reindexed using the elasticSearch.bulkIndexOnStartup configuration property. The duration of this process will depend on the amount of content to index.

When this property is set to true, all content will be indexed. When set to 'deleted', only the domain classes whose documents were deleted will be indexed. In either case, when using the 'alias' strategy, once all content is indexed all aliases will point to the latest version of the index.
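
For example, assuming a Groovy configuration file, the 'delete' strategy could be combined with selective reindexing like this:

```groovy
// Assumed location: a Groovy config file such as grails-app/conf/Config.groovy
elasticSearch.migration.strategy = 'delete'
// true: index all content on startup
// 'deleted': index only the domain classes whose documents were deleted
elasticSearch.bulkIndexOnStartup = 'deleted'
```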