Healthcheck Test Index

Healthchecks help users identify and resolve configuration and data issues that can negatively affect system performance.

Healthchecks are executed or skipped depending on the database in use and on whether in-memory is enabled on the STEP system, so not all healthchecks run on every system. Where available, healthchecks can be reviewed and run from the following locations:

  • In the Admin Portal on the Healthcheck tab, users can run tests and review detected problems as needed. For more information on the Admin Portal Healthcheck tab, refer to the Healthcheck topic.

  • For on-premise systems, healthcheck information is stored on the application server at [STEPHOME]/diag/healthcheck (for example, opt/stibo/step/diag/healthcheck). This information is automatically included when sending a diagnostics package to Stibo Systems Support.

  • From the Start Page, the STEP Performance Analysis link displays 12 weeks of results from all scheduled healthchecks. Some unscheduled healthchecks are long-running or are only useful for Stibo Systems Support, so they are not available in the Performance Analysis tools.

Healthcheck Tests

The following tables include all available Configuration, Data Error, and Performance healthchecks.

Note: Not all healthchecks are applicable for all STEP systems. On your system, only the healthchecks that are valid are displayed in the Admin Portal, on the application server, and in the Performance Analysis tools.

  • The 'Automated Fix' column indicates if a script is available to resolve the reported issue. To apply a scripted fix, contact Stibo Systems Support for assistance. If no automated fix is available, manually update the reported data or configuration.

  • The 'Runs on Schedule' column indicates whether the test runs on the schedule defined in the sharedconfig.properties file on the application server, as described in the STEP Performance Analysis topic.

Performance Healthcheck Name Severity Description Automated Fix Runs on Schedule

Business Rule Execution Time Too Long High Performance can be impacted when business rules run too long. Business rules may run too long when, for example, too many operations are combined into one rule, the rule accesses too many objects or objects with too many revisions, or configurations call external services with a slow response time. This is not always a problem. For example, if an IEP using the Business Rule Based Message Processor runs too long when processing batches, performance is not necessarily impacted if large transactions are not being written to the database. Most often, business rules taking longer than one minute require examination of the rule itself for performance improvements or of the objects involved for data management. By default, business rules that run longer than five (5) minutes are reported as a healthcheck warning, and business rules that run longer than 15 minutes are stopped and generate an error. No Yes

Change Log Entries Per Node Low When certain objects in STEP are modified, a change is written to the change log of the object. You can set up event queues in STEP to monitor these events so that, for example, an integration endpoint can export information about the change. An event is put on the queue for each interested event queue. Every time you modify an object that generates an event, STEP tries to limit the number of log entries for that object. If more than 20,000 changes are logged for the object, it attempts to delete the old events. However, the attempt only succeeds if there are no events for the change. If you have more than 20,000 change log entries for an object, determine why the events are not being processed. Yes, in most cases No
Change Log Total Size Critical Checks if the change log has grown too large. This can cause Oracle to perform poorly. The maximum number of rows allowed in the table is 100,000,000. Yes No
Check for Common Web UI Configuration Errors Medium Checks Web UI for some of the most common configuration errors that can cause performance problems. No No
Data cleanup tools High To maintain a system that performs well, regularly scheduled data clean-up is highly recommended. This healthcheck detects problems with the configuration of a scheduled background process to empty the recycle bin and/or an event processor to purge old revisions. Configure 'Schedule Empty Recycle Bin', running at least monthly and including all contexts in use by your system, to purge items in the Tree recycle bin. Enable an event processor of type 'Revision Management' scheduled to run frequently, with Purge Across Workspaces set to Yes and Number to Keep set to not much more than 100, to reduce the number of unnecessary revisions. No Yes
Hard Evicts High A hard evict is a forced attempt to remove persistent objects from their cache. Hard evicts can happen when a task is holding many persistent objects for a long time without committing the transaction. In such a case, a hard evict may be executed to make room for caching other persistent objects. This can negatively affect performance since the cache of persistent objects may become less effective. A cause of hard evicts could be one or more business rules accessing too many objects. No Yes
Large Commit High For Cassandra systems, large commits decrease system performance and increase the risk of concurrency problems. Very large commits may fail. No Yes
Leaked Changelog Rows High Checks if there are leaked rows in the change log table. If there are many leaked data rows, performance will be negatively affected. Yes Yes
Optimistic Lock Recovery High Reports that optimistic locking errors were detected when flushing to the data store. This indicates that some objects were concurrently modified in another transaction, or a constraint error occurred. This can negatively impact performance. Repeated occurrences of this may cause the transaction to eventually fail. Resolve this by avoiding and/or minimizing concurrent modifications of the same data. Yes Yes
Too Many Associated Objects High When there are too many associated objects, degradation of performance is possible because the amount of data exceeds the threshold for caching the given relation. This could be due to too many children, references, referenced by values, or multi-valued attributes. No Yes
Too Many Attributes Linked (Directly Not Via Inheritance) to a Product/Classification Medium Finds all products / classifications that are directly linked (not inherited) to more than 1,000 valid attributes. More than 1,000 links can cause performance issues when opening the References Editor in workbench. No Yes
Too Many Background Processes for an Integration Endpoint High Checks if there are too many background processes for an integration endpoint. Too many BGPs for an IEP can degrade performance. Clean-up of old BGP files and folders is required to resolve this issue. No Yes
Too Many Manually Sorted Attribute Groups Medium Checks that no manually sorted attribute group has more than 10,000 children. Only the front revisions are considered and children in all workspaces are counted. No Yes
Too Many Manually Sorted Products and Classifications Medium Checks that no manually sorted product or classification has more than 10,000 children. Only the front revisions are considered and children in all workspaces are counted. No Yes
Too Many Qualifier Relations Low Finds all qualifiers that are used in too many pseudo qualifiers. Performance problems can result from having a large number of pseudo qualifiers if a real qualifier is linked to a large number of pseudo qualifiers because, by default, the application cache only caches 10,000. Refer to the property: Install.DataCache.MaxRelationSize=10000. This plugin cannot remove the duplicates, but another plugin can remove the unused pseudo qualifiers. No No
Too Many Revisions for a Node High Checks if there are too many revisions for an object. More than 10,000 revisions can cause performance issues because the amount of data exceeds the threshold for caching. No Yes
Too Many Valid Values for List of Values Medium Checks that no list of values has more than 5,000 valid values. Large lists of values (LOVs) make it difficult to find, search, select, and filter on values. No No
Too Many Values for a Node Medium Checks if there are nodes with too many values, which can cause performance issues. No Yes
Too Many Workspace Relations Low Finds all workspaces that are used in too many pseudo workspaces. If, for example, a node is visible in the Main, Approved, and Staging workspaces, a pseudo workspace representing these three workspaces is created. Performance problems can result from having a large number of pseudo workspaces if a real workspace is linked to a large number of pseudo workspaces. The application cache, by default, only caches 10,000 pseudo workspaces. Refer to the property: Install.DataCache.MaxRelationSize=10000. While this plugin cannot remove the duplicates, another plugin can remove the unused pseudo workspaces. No No
Unused Pseudo Qualifiers Low Finds all pseudo qualifiers that are not used. Performance problems can result from having a large number of pseudo qualifiers if a real qualifier is linked to a large number of pseudo qualifiers. The application cache, by default, only caches 10,000. Refer to the property: Install.DataCache.MaxRelationSize=10000. Missing qualifiers are only reported when at least 5,000 unused qualifiers exist. Yes No
Unused Pseudo Workspaces Low Finds all pseudo workspaces that are not used. If, for example, a node is visible in the Main, Approved, and Staging workspaces, a pseudo workspace representing these three workspaces is created. If you create new workspaces, many new pseudo workspaces can be created for the many resulting combinations. In this case, the result is a large number of pseudo workspaces, even though not all of these combinations are used. Performance problems can result from having a large number of pseudo workspaces if a real workspace is linked to a large number of pseudo workspaces. The application cache, by default, only caches 10,000 of these. Refer to the property: Install.DataCache.MaxRelationSize=10000. Yes No
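
The numeric limits quoted in the Performance table above can be collected in one place for quick reference. The following is a minimal sketch only, not part of STEP: the key names, the findExceeded and classifyRuleDuration functions, and the sample counts are all hypothetical and chosen for illustration. Only the numbers themselves come from the table.

```javascript
// Thresholds quoted in the Performance table. The key names are
// hypothetical labels for this illustration, not STEP identifiers.
var THRESHOLDS = {
  changeLogEntriesPerNode: 20000,
  changeLogTotalRows: 100000000,
  directAttributeLinks: 1000,
  manuallySortedChildren: 10000,
  revisionsPerNode: 10000,
  validValuesPerLov: 5000
};

// Returns the names of all supplied counts that are strictly greater
// than the matching threshold ("more than N" in the table).
function findExceeded(counts) {
  var exceeded = [];
  Object.keys(THRESHOLDS).forEach(function (name) {
    if (typeof counts[name] === 'number' && counts[name] > THRESHOLDS[name]) {
      exceeded.push(name);
    }
  });
  return exceeded;
}

// Classifies a business rule run time using the default limits from the
// 'Business Rule Execution Time Too Long' row: longer than 5 minutes is
// a warning, longer than 15 minutes is an error.
function classifyRuleDuration(minutes) {
  if (minutes > 15) {
    return 'error';
  }
  if (minutes > 5) {
    return 'warning';
  }
  return 'ok';
}

// Hypothetical sample counts for a single object.
var sample = {
  revisionsPerNode: 12500,
  validValuesPerLov: 4200,
  directAttributeLinks: 1000
};

console.log(findExceeded(sample));      // [ 'revisionsPerNode' ]
console.log(classifyRuleDuration(7));   // warning
```

A count equal to a limit is not reported in this sketch because the table describes each limit as "more than" the stated value.
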
Data Error Healthcheck Name Severity Description Automated Fix Runs on Schedule
Assets Without a History Entry Medium Assets cannot be found or viewed in the workbench or in Web UI because there are no visible entries in the history. No No
Attributes / Products Which Cannot Be Approved Medium These attributes and/or products cannot be approved. No Yes
Attributes That Have Both Revised and Not Revised (Externally Maintained / Not Externally Maintained) Medium Finds attribute values where the workspaces are not in agreement. For example, attribute values that are visible in all workspaces and are not externally maintained, or attribute values that are not visible in all workspaces and are externally maintained. Yes Yes
Background Processes Incorrectly Linked to Integration Endpoint High Verifies if BGPs are correctly linked to an integration endpoint. If not correctly linked, this can cause multiple pollers to be started for IEPs which can cause issues. Yes Yes
Check LOV Used for Status by BGPs High Checks for duplicate values in the LOVs used for the status of BGPs. Duplicates can cause locking errors when setting the status of the BGP. Yes Yes
Check Sequences High Identifies when a production database has been incorrectly copied to a Test / QA system, without including Oracle sequences. Other elements like views might also be missing. No No
Cycles in a Translation Graph Medium Finds nodes where the translations are caught in an infinite loop, such as when the source and target of the translation are the same language. Yes Yes
Dual Visibility Values With Different Values Low Generates a list which can help to decide which values should be deleted / kept. Same as 'Values with Dual Visibility' except only duplicates with different values are listed. This list should be viewed by the customer before deletions. No No
Duplicated Contexts Low Finds duplicated contexts. Yes No
Duplicated Contexts, Their Matching Names and Workspaces Low Finds duplicated contexts, including their names / IDs, and workspaces. It can be time-consuming to include the names and workspaces. No No
Duplicated Edges Medium Searches for duplicated references / links with identical parents and children in the same context. No Yes
Duplicated History Entries Medium Checks if there are duplicated entries in the history table. No No
Duplicated PrivilegeRule Ownership Critical Finds Privilege Rules that are erroneously shared between multiple User Groups. Inconsistent behavior can result when editing the Privilege Rule, and it prevents an In-Memory system from starting up. Yes No
Duplicated Workspaces High Finds all pseudo workspaces that are duplicated. If, for example, a node is visible in the Main, Approved, and Staging workspaces, a pseudo workspace representing these three workspaces is created. However, if another pseudo workspace represents the same three workspaces, it is a duplicate. Performance problems can result from having a large number of pseudo workspaces if a real workspace is linked to a large number of pseudo workspaces. The application cache, by default, only caches 10,000. Refer to the property: Install.DataCache.MaxRelationSize=10000. While this plugin cannot remove the duplicates, another plugin can remove the unused pseudo workspaces. No No
Edges With Invalid Revisability Medium Finds link types that are marked as revisable but where unrevised links exist or link types that are marked as unrevised but where revisable links exist. Yes No
GDSN Subscription Not Linked to a Datapool High Checks if there are subscriptions that are not linked to a datapool. No No
GDSN Subscription With Invalid GPC High Checks if there are subscriptions with invalid GPC codes. For example, GPC codes for which there is no matching classification in STEP. No No
Invalid Previous and Maximum History Revisions Medium Finds history rows where the previous max of the new revision row is not equal to the max revision of the previous revision row. On Cassandra systems, this healthcheck identifies data inconsistency issues. Data inconsistency can be caused by JavaScript that catches an exception but does not rethrow it, so that partially committed actions are written to the database instead of being rolled back. Refer to the healthcheck 'JavaScript Catch Without Rethrow' for possible identified issues. No Yes
Invisible Deleted Products Where the Deletion Cannot be Approved Medium Identifies when there is a mismatch between two internal tables in STEP for history and references / links. If there is a mismatch, approval will fail and a message such as 'Cannot subtract Main from PWSpace...local to 203, pseudo does not contain the workspace.' is displayed. No Yes
Node Collections From Missing Parents Medium Finds collections that are missing a link to the parent. This issue has been found with temporary collections that are used by BGPs. When the user attempts to delete the BGP, files are out of sync and an error is thrown. Yes Yes
Nodes Having Multiple Parents Medium Identifies nodes with multiple parents. Yes No
Orphan Products Medium Finds products that exist outside of the product hierarchy. Yes No
Packaging Hierarchy Loop Critical Identifies when a circular reference is created in the packaging hierarchy, which can cause performance issues or throw an unhandled exception. This report identifies where the user can correct the data issue, preventing future performance issues. No Yes
Pollers Started by a Different User Than the One Configured in the IEP High Checks if there are pollers started by a different user than the one configured in the IEP. This can cause new revisions to be generated each time the IEP is invoked, which can in turn cause performance problems. Yes No
Revised Node Missing a Front Revision Medium Finds all revised attributes that have unrevised values. This occurs when an attribute is changed and the update to values is interrupted, leaving some values revised and other values unrevised. Yes No
Revised Values Should be Unrevised High Finds all revised attributes that have unrevised values. This occurs when an attribute is changed and the update to values is interrupted, leaving some values revised and other values unrevised. Yes Yes
Search for Duplicated Qualifiers Critical Searches for duplicated qualifiers for modified values used for export and/or publication. If the value inconsistency exists for a long period of time, data can become corrupt. No Yes
Softvalues With Dual Visibility Medium Searches for duplicated softvalues. No No
The Object Has One or More (LOV) Attribute Values in a Deleted Context Medium Searches for objects with LOV attribute values present in a context that has been deleted. This may cause an approval to fail. No Yes
The Object Has One or More Attribute Values in a Deleted Context Medium Searches for objects with attribute values present in a context that has been deleted. This may cause an approval to fail. No Yes
The Object Has One or More Dimension Dependent Attribute Values in a Deleted Context Medium Searches for objects with dimension dependent attribute values present in a context that has been deleted. This may cause an approval to fail. No Yes
The Object is Named in a Deleted Context Medium Searches for objects with name(s) present in a context that has been deleted. This may cause an approval to fail. No Yes
Unrevised Values should be Revised High Finds all unrevised attributes that have revised values. This occurs when an attribute is changed and the update to values is interrupted, leaving some values unrevised and other values revised. Yes Yes
Value Link With No Owner Medium A data inconsistency for values using either List Of Values validation or older soft values, such as some used for system attributes. Applying the fix deletes value links that have a non-existing node. The fix operation does not support fixing individually selected problems - all fixable problems are fixed if the fix is started regardless of which problems are selected. Yes Yes
Value Link With No Value Model Medium A data inconsistency for values using either List Of Values validation or older soft values, such as some used for system attributes. Objects with this problem fail to load. Applying the fix recreates the missing value model objects where possible. The fix operation does not support fixing individually selected problems - all fixable problems are fixed if the fix is started regardless of which problems are selected. Yes Yes
Value Missing Content Medium This check finds and fixes problems with missing entries in BLOB tables. Yes Yes
Values Have Not Been Marked Correctly Deleted Low Finds all nodes / attribute combinations with values that are marked as deleted but that are still visible. Yes No
Values With Dual Visibility High Identifies if some of the duplicates have different values which can lead to unexpected behavior in the workbench or Web UI. The two values are randomly displayed and STEP appears unstable. No No
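
The 'Cycles in a Translation Graph' and 'Packaging Hierarchy Loop' rows above both describe circular references. The following generic sketch shows how a loop of this kind can be found in principle with a depth-first search. It does not show how STEP implements these healthchecks: the findCycle function, the shape of the links object, and the sample IDs are hypothetical.

```javascript
// Searches a directed graph for a circular reference.
// `links` maps a node ID to an array of the node IDs it points to
// (a hypothetical data shape used only for this illustration).
// Returns the IDs along the first loop found, or null if there is none.
function findCycle(links) {
  var state = {}; // undefined = not visited, 1 = on current path, 2 = finished
  var path = [];

  function visit(node) {
    if (state[node] === 1) {
      // The node is already on the current path, so a loop was found.
      return path.slice(path.indexOf(node)).concat(node);
    }
    if (state[node] === 2) {
      return null;
    }
    state[node] = 1;
    path.push(node);
    var targets = links[node] || [];
    for (var i = 0; i < targets.length; i++) {
      var found = visit(targets[i]);
      if (found) {
        return found;
      }
    }
    path.pop();
    state[node] = 2;
    return null;
  }

  var nodes = Object.keys(links);
  for (var j = 0; j < nodes.length; j++) {
    var cycle = visit(nodes[j]);
    if (cycle) {
      return cycle;
    }
  }
  return null;
}

// Hypothetical packaging hierarchy in which the unit points back to the pallet.
var packaging = { pallet: ['carton'], carton: ['unit'], unit: ['pallet'] };
console.log(findCycle(packaging)); // [ 'pallet', 'carton', 'unit', 'pallet' ]

// A translation whose source and target are the same language is a loop of length one.
console.log(findCycle({ en: ['en'] })); // [ 'en', 'en' ]
```
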
Configuration Healthcheck Name Severity Description Automated Fix Runs on Schedule
Hidden Oracle Parameters With Non-Default Values Medium Lists hidden Oracle parameters with a changed default value. The default value of a hidden parameter should only be changed when recommended by Oracle or Stibo Systems Support. No No

JavaScript Catch Without Rethrow High Identifies business rules that do not correctly handle exceptions in try-catch statements. When catching an exception in JavaScript business rules using try-catch, only checked exceptions that have been declared in the Stibo Systems Scripting API are safe to catch without a rethrow of the same or another exception. All runtime exceptions should be rethrown. For some runtime exceptions, this is strictly enforced: if the business rule completes successfully even though the rethrow was omitted in JavaScript, the framework rethrows the exception. This protects against possible database inconsistencies that occur when the rethrow is omitted. If an API method partially completed a change when the exception occurred, the database transaction needs to roll back by letting the exception fall through the execution scope of the transaction. When issues are reported in this healthcheck, the reported business rules need to be revised to include the missing rethrow of the same or another exception. No Yes
Non-Compacted Attributes Medium Identifies attributes that are not using the compact storage model. Compacted attributes (excluding LOVs) take up less storage space than non-compacted attributes, which results in reduced I/O during read and write, improving the response time of the system. Additionally, for customers migrating to Stibo Systems SaaS (Cassandra), compacting attributes is a prerequisite, and the migration may take multiple days to complete. When issues are detected in this healthcheck, review the attributes reported and start the migration to compact soft values, or convert the attributes to LOVs, where feasible (many usages and few distinct values). Refer to the Attribute Value Migration topic for prerequisites and technical migration details. No Yes
Reflection usage in business rules High This healthcheck detects the use of reflection in business rules when they are executed and a warning with the text 'Attempted to call reflection API...' is written to the step.0.log. A future STEP release will block the use of reflection. For any necessary methods or functions that are not covered in the Public JavaScript API, enter an enhancement request in the Stibo Systems Service Portal for review and approval. Also, plan to rewrite the business rules identified by this healthcheck when preparing for a future upgrade. No Yes

Residual Events for a Queue Medium Identifies event queues with events not being processed. When a queue-based event processor or outbound integration endpoint is set to 'Read events' but is not enabled, or is stopped in error, or is enabled without a schedule being run, large numbers of events can build up, which can negatively affect performance. If this issue is detected, inspect the objects in the report and make configuration changes to either consume the events or discard them. No Yes
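
The 'JavaScript Catch Without Rethrow' row above recommends rethrowing runtime exceptions caught in try-catch statements. The following minimal sketch contrasts the two patterns in plain JavaScript. It makes no calls to the Stibo Systems Scripting API: updateObject is a hypothetical stand-in for an API method that fails after partially completing a change, and swallowException and catchAndRethrow are illustrative names only.

```javascript
// Hypothetical stand-in for an API method that fails partway through a change.
function updateObject(record) {
  record.name = 'partially updated';
  throw new Error('simulated runtime failure');
}

// Pattern the healthcheck reports: the exception is caught and dropped,
// so the caller sees a normal return even though the change is incomplete.
function swallowException(record) {
  try {
    updateObject(record);
  } catch (e) {
    // Nothing is rethrown here.
  }
  return 'completed';
}

// Recommended pattern: note the failure, then rethrow so the exception
// leaves this function and reaches the caller.
function catchAndRethrow(record, log) {
  try {
    updateObject(record);
  } catch (e) {
    log.push('update failed: ' + e.message);
    throw e;
  }
  return 'completed';
}

var first = {};
console.log(swallowException(first)); // completed
console.log(first.name);              // partially updated

var messages = [];
try {
  catchAndRethrow({}, messages);
} catch (e) {
  console.log('caller saw: ' + e.message); // caller saw: simulated runtime failure
}
console.log(messages.length); // 1
```

In the first pattern the caller cannot tell that the change was left incomplete. In the second, the exception leaves the function, which is the behavior the healthcheck description relies on for the surrounding transaction to roll back.
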