Elasticsearch Publishing

The Web UI Search Screen offers a modern faceted search experience. This requires STEP data to be published to an Elasticsearch cluster that is accessible from the STEP application servers. For the end user, the Search Screen provides search results that can be refined and modified on the fly.

Note: Elasticsearch can display product, classification, and/or asset data only, based on configuration.

Important: Before starting the configuration outlined in this topic, contact your Stibo Systems account manager or partner manager for assistance. Activation and configuration for the faceted Search screen, Elasticsearch, and corresponding components / functionality should not be done without the assistance of Stibo Systems.

Important:  

  • To ensure current STEP data is displayed on the Web UI faceted Search Screen, verify that the event processor indicated by the active Elasticsearch configuration is enabled, is set to Read Events, and is running on a reasonable schedule.

  • Data displayed in workbench is fully aligned with data available from the faceted Search Screen only when all events are processed.

  • The amount of data being published directly impacts the amount of time required for the events to be processed. When publishing large amounts of data, it is recommended to invoke the event processor during user downtime, such as overnight or on the weekend.

  • Calculated attributes have a noticeable negative effect on publishing performance. It is recommended to avoid publishing calculated attributes and to instead use a business action to set the values prior to publishing. When it is not possible to avoid calculated attributes completely, publish no more than five (5) calculated attributes to reduce the performance impact.

Elasticsearch Event Processor Troubleshooting

Examples of event processor errors and suggested resolutions are included below:

  • Maximum shards open - The execution log on the event processor includes an error message similar to:

    ...this action would add [4] total shards, but this cluster currently has [997]/[1000] maximum shards open;]]'

    Each index in Elasticsearch is divided into one or more shards to protect against hardware failures. A single Elasticsearch configuration in workbench creates at least one index for each workspace / context pair in the Elasticsearch database. For easy identification and management of indexes in Kibana, each index name begins with the prefix defined in the Elasticsearch configuration.

    Making a change to the prefix and/or creating a new Elasticsearch configuration and then publishing to Elasticsearch creates 2 indexes for each context (one for the main workspace and one for the approved workspace). For example, on a system with 20 contexts, each time an Elasticsearch configuration is published, 40 indexes are created. All indexes consume shards and removing the unused indexes can increase the number of available shards.

    Resolution

    Reduce the number of shards required based on your system setup.

    1. Review and possibly update the sharedconfig.properties file as follows:

      For a multiple server setup, keep the default sharedconfig.properties settings: Elasticsearch.NumberOfReplicas=1 and Elasticsearch.NumberOfShards=2 allow for multiple shards per index.

      For a single server setup, update the default sharedconfig.properties settings to Elasticsearch.NumberOfReplicas=0 and Elasticsearch.NumberOfShards=1.

      Changes to the properties file, outlined above, are implemented when the server is restarted.

    2. Delete unused indexes in Kibana as defined in the section below.

    3. Routinely review indexes in Kibana and remove any not required.
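
The shard arithmetic behind the 'maximum shards open' error can be sketched in a few lines. This is a rough estimate only, not a STEP or Elasticsearch API: the function name and parameters are hypothetical. It assumes exactly one index per workspace / context pair (the text above says "at least one") and that the cluster's open-shard limit counts replica shards as well as primary shards.

```python
def estimate_shards(contexts, workspaces=2, number_of_shards=2, number_of_replicas=1):
    """Estimate the shards consumed by one published Elasticsearch configuration.

    Assumption: one index per workspace / context pair. Each index has
    number_of_shards primary shards, and each primary has
    number_of_replicas replica copies.
    """
    indexes = contexts * workspaces
    shards_per_index = number_of_shards * (1 + number_of_replicas)
    return indexes * shards_per_index


# 20 contexts with the multiple server defaults
# (Elasticsearch.NumberOfShards=2, Elasticsearch.NumberOfReplicas=1)
print(estimate_shards(20))  # 160

# Same system with the single server settings
# (Elasticsearch.NumberOfShards=1, Elasticsearch.NumberOfReplicas=0)
print(estimate_shards(20, number_of_shards=1, number_of_replicas=0))  # 40
```

Under these assumptions, every unused prefix left behind on a 20-context system holds on to 160 shards with the default settings, which is why removing unused indexes frees capacity quickly.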

Publish to the Elasticsearch Database

Follow the steps below to publish data from STEP to the Elasticsearch database for use in a faceted search:

  1. Generate events for the STEP data required in the Search Screen using the republish action as defined in the Event-Based OIEP Forward, Rewind, Purge, and Republish topic of the Data Exchange documentation here. The republish background process generates events for the configured products, classifications, and/or assets.
  2. Publish STEP data to the Elasticsearch database by invoking the event processor running the Elasticsearch Configuration as defined in the Running an Event Processor topic of the System Setup documentation here. The event processor background process creates indexes and publishes products, classifications, and/or assets to the Elasticsearch database.

Reindex the Elasticsearch Database

For an existing Elasticsearch database, use one of the following sets of steps to reindex Elasticsearch when required, for example, when Stibo Systems changes the Elasticsearch schema. Both methods remove all data from the Elasticsearch database and then publish the current STEP data to Elasticsearch based on the active configuration.

Under rare circumstances, a system can be safely reverted to the previous configuration and index, for example, when a schema change involves functionality not implemented by the system. In such a scenario, the 'Create New Indexes and Republish' method can be used.

Important: When you need to republish a large number of objects, or all objects, first review the number of unread events on the OIEP looking specifically for delete events (refer to the Event-Based OIEP Queued Events topic in the Data Exchange documentation here).

When there are no unread delete events or old indexes have been deleted, consider skipping the current unread events (via the Forward option) to reduce the time it takes for the processor to finish and avoid processing multiple events for each object (refer to the Event-Based OIEP Forward, Rewind, Purge, and Republish topic in the Data Exchange documentation here). Then republish all objects to generate one event for each object.

Be aware that if there are unread delete events and you keep old indexes, the Elasticsearch index will contain items that should have been deleted but were not because the delete events were skipped with the Forward action.

For example, publishing to an Elasticsearch database typically involves all products. Before republishing to Elasticsearch, first process delete events, Forward the remaining unread events, and then Republish all by selecting the top-level product node.
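
The order of operations described above can be summarized as a small decision function. This is an illustrative sketch of the guidance only: the function name, parameters, and step labels are hypothetical and do not correspond to any STEP API. The actual actions are performed manually on the OIEP.

```python
def plan_republish(unread_delete_events, old_indexes_kept):
    """Return the ordered manual steps suggested by the guidance above."""
    steps = []
    if unread_delete_events > 0 and old_indexes_kept:
        # Forwarding past delete events would leave items in the kept
        # indexes that should have been removed, so process them first.
        steps.append("Process unread delete events")
    # With deletes handled (or irrelevant), skip the remaining backlog...
    steps.append("Forward remaining unread events")
    # ...and generate exactly one new event per object.
    steps.append("Republish all objects")
    return steps


print(plan_republish(unread_delete_events=12, old_indexes_kept=True))
print(plan_republish(unread_delete_events=12, old_indexes_kept=False))
```

The first call includes the extra 'Process unread delete events' step. The second does not, because deleting the old indexes removes the stale items anyway.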

Delete Indexes and Republish

This method removes all existing indexes for faceted search and reuses the active Index Prefix. Neither the selected object types nor the data specifications are changed, but additional configuration may be required.

Note: When updating an active Elasticsearch Configuration to include the Asset Content and/or Asset Reference node fields as default facets, you must delete the indexes and republish to view the Asset Content and/or Asset Reference node fields on the Web UI Search Screen. If you are creating a new configuration, all configuration settings are implemented when you Publish to the Elasticsearch Database (refer to the section above).

  1. Identify the Index Prefix of your active Elasticsearch Configuration as defined in the Initial Configuration step of the wizard. For more information, refer to the Creating an Elasticsearch Configuration topic here.

  2. Choose a method to delete indexes:

    • Elasticsearch API: Refer to the directions at https://www.elastic.co/guide/en/elasticsearch/reference//current/indices-delete-index.html

    • Kibana – Dev Tools: Update the following query with your index prefix and run it to manually remove indexes for all context and workspace combinations. In this example, all context and workspace combinations with the prefix of ‘my-index-000001’ are removed.

      DELETE /my-index-000001*

    • Kibana – Manage Indices: On the Kibana Index Management screen, select the ‘Delete indices’ option from the ‘Manage Indices’ menu.

  3. Generate events for the STEP data required in the Search Screen using the republish action as defined in the Event-Based OIEP Forward, Rewind, Purge, and Republish topic of the Data Exchange documentation here.

  4. Publish STEP data to the Elasticsearch database by invoking the event processor running the Elasticsearch Configuration as defined in the Running an Event Processor topic of the System Setup documentation here. New indexes are created using the Index Prefix in the configuration.
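
For the Elasticsearch API option in step 2, a minimal sketch using only the Python standard library is shown below. It is not part of STEP. It assumes the cluster is reachable at a base URL such as http://localhost:9200 without authentication or TLS (both assumptions), and that every index to remove has a name starting with the Index Prefix. The helper names are hypothetical. The sketch lists the indexes through the cat indices API and deletes each one by its explicit name, because some clusters are configured (via the action.destructive_requires_name setting) to reject wildcard deletes.

```python
import json
import urllib.request


def matching_indexes(index_names, prefix):
    """Return the index names that start with the given Index Prefix."""
    return sorted(name for name in index_names if name.startswith(prefix))


def delete_request(base_url, index_name):
    """Build (but do not send) a DELETE request for a single index."""
    return urllib.request.Request(f"{base_url}/{index_name}", method="DELETE")


def delete_by_prefix(base_url, prefix):
    """List all indexes, then delete those matching the prefix one by one."""
    # GET /_cat/indices?format=json&h=index returns [{"index": "<name>"}, ...]
    cat_url = f"{base_url}/_cat/indices?format=json&h=index"
    with urllib.request.urlopen(cat_url) as resp:
        names = [row["index"] for row in json.load(resp)]
    for name in matching_indexes(names, prefix):
        with urllib.request.urlopen(delete_request(base_url, name)):
            print(f"deleted {name}")


# Example call (destructive, so it is left commented out):
# delete_by_prefix("http://localhost:9200", "my-index-000001")
```

Review the output of matching_indexes before sending any delete request, since a short prefix may also match indexes that belong to another configuration.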

Create New Indexes and Republish

This method allows you to keep the current index for faceted search, although it will no longer be active. Additional faceted search indexes are created with a new Index Prefix. Neither the selected object types nor the data specifications are changed, but additional configuration may be required.

  1. In STEP, use the Duplicate option on the Maintain menu to duplicate your active Elasticsearch Configuration. Alternatively, export the configuration using STEPXML, modify the ID of the configuration, and reimport the STEPXML.
  2. On the new configuration, change the Index Prefix (copied from your active Elasticsearch Configuration) in the Initial Configuration step of the wizard. For more information, refer to the Creating an Elasticsearch Configuration topic here.
  3. Generate events for the STEP data required in the Search Screen using the republish action as defined in the Event-Based OIEP Forward, Rewind, Purge, and Republish topic of the Data Exchange documentation here.
  4. Publish STEP data to the Elasticsearch database by invoking the event processor running the Elasticsearch Configuration as defined in the Running an Event Processor topic of the System Setup documentation here. New indexes are created using the Index Prefix in the configuration.
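
Because this method keeps the old indexes alongside the new ones, both sets continue to consume shards. The sketch below counts index names per prefix so that the retained set stays visible during routine reviews. It is an illustration only: the function name is hypothetical, and the sample index names are invented, since the naming pattern after the prefix is not specified here.

```python
def count_by_prefix(index_names, prefixes):
    """Count how many index names start with each given Index Prefix.

    Note: if one prefix is itself the start of another (for example
    'search' and 'search-v2'), the shorter prefix also counts the
    indexes of the longer one.
    """
    return {p: sum(1 for n in index_names if n.startswith(p)) for p in prefixes}


# Hypothetical index names for one context with two workspaces each
names = [
    "search-v1-main-context1",
    "search-v1-approved-context1",
    "search-v2-main-context1",
    "search-v2-approved-context1",
]
print(count_by_prefix(names, ["search-v1", "search-v2"]))
# {'search-v1': 2, 'search-v2': 2}
```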