Clustering

This topic is one of several that describe the architecture of the STEP solution. The full list is given in the STEP Architecture topic.

Clustering creates a STEP system with high availability (HA) and scalability. In a clustered setup, the load is distributed so that each member of the cluster takes a fair share. A successful cluster setup requires accurate configuration of many parameters, such as the number of servers, bandwidth, and network latency. A fail-over strategy must be available for all of the essential components of the system.

Homogeneous cluster nodes are recommended in a cluster setup. The simplest type of cluster is created by adding an application server that does everything the first application server does. It is recommended that clustered application servers are set up as clones and run on identical hardware.

Important: Clustering a STEP system is handled by the STEP application itself. It does not use clustering technology provided by a commercial application server.

Architecture

The following comments apply to the illustration below, which shows a STEP system running as a cluster with three application servers.

  • Application Servers are physical servers that run STEP either standalone (using Oracle Java) or under a WebSphere server. Each application server runs one instance of STEP, which supports all three types of clients mentioned earlier (workbench clients, web clients, and DTP clients). Each application server has its own cache, which is synchronized whenever write operations take place.
  • Oracle Database Server provides the primary storage for all information to be stored persistently by the STEP system. Optionally, this can be an Oracle Real Application Cluster setup to compensate for the risk of failure in the underlying hardware.
  • Shared Storage is a file system that is shared by all nodes in the cluster. The 'step/workarea' folder contains, for example, an image cache where thumbnails of images are generated. It also stores intermediate files generated by the STEP Workflow component and files used by background and batch processes. The hotfolders are dynamic folders that automatically process files based on a hotfolder configuration. A good example is uploading assets, where the assets are dropped into a folder and automatically imported into STEP. The hotfolders can be exposed directly on the internet, which makes it possible to deliver content for import remotely, for example, via FTP (not shown).
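
The hotfolder behavior described above (files dropped into a folder are picked up and imported automatically) can be illustrated with a minimal sketch. This is not STEP code: the function name process_hotfolder, the import_file callback, and the "processed" target folder are hypothetical names chosen for this example, and a real hotfolder is driven by its STEP hotfolder configuration.

```python
from pathlib import Path
import shutil


def process_hotfolder(hotfolder, processed, import_file):
    """Import every file currently in `hotfolder`, then move it to `processed`.

    `import_file` is a hypothetical callback standing in for the real import step.
    Returns the names of the files handled, in alphabetical order.
    """
    processed.mkdir(parents=True, exist_ok=True)
    handled = []
    for path in sorted(hotfolder.iterdir()):
        if not path.is_file():
            continue  # skip subfolders; only dropped files are imported
        import_file(path)
        # Moving the file out of the hotfolder prevents a second import.
        shutil.move(str(path), str(processed / path.name))
        handled.append(path.name)
    return handled
```

Calling the function on a schedule (for example, every few seconds) approximates the "drop and import" behavior. Because the folder lives on shared storage, only one node should poll a given hotfolder at a time in this simplified model.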

Application Server Roles

Application Servers in the cluster can be configured as a server for any of the following services:

  • Background services
  • Workbench client services
  • Web client services

The preferred setup is that all servers handling a specific type of service handle the same set of services. For instance, one server should not handle both background processing and workbench client processing while another server handles only workbench client processing. In such a mixed setup, the load balancer may not recognize the difference between the two servers and may assign equal numbers of workbench clients to both, even though one of them also serves background services.
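
The rule above can be checked mechanically against a planned cluster layout. The following is a minimal sketch, assuming the layout is available as a simple mapping from server name to the set of services it runs. The function name find_mixed_role_servers and the server and service names are hypothetical and are not part of STEP.

```python
from collections import defaultdict


def find_mixed_role_servers(server_roles):
    """Return {service: [servers]} for services whose servers differ in role set.

    `server_roles` maps a server name to the set of services it handles.
    An empty result means every service is handled by identically configured servers.
    """
    by_service = defaultdict(list)
    for server, roles in server_roles.items():
        for role in roles:
            by_service[role].append(server)

    problems = {}
    for service, servers in by_service.items():
        # All servers offering this service should have exactly the same role set.
        distinct_role_sets = {frozenset(server_roles[s]) for s in servers}
        if len(distinct_role_sets) > 1:
            problems[service] = sorted(servers)
    return problems


# Hypothetical layout: app1 mixes background and workbench services.
cluster = {
    "app1": {"background", "workbench"},
    "app2": {"workbench"},
    "app3": {"web"},
}
mixed = find_mixed_role_servers(cluster)
```

Here mixed flags the workbench service, because app1 and app2 both serve workbench clients but only app1 also runs background services.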

Implementation

Running STEP in a cluster means that one instance of the application is running on each application server. Each application instance maintains an in-memory cache on its application server in order to minimize the number of database requests for repeated read operations. Whenever a client or a background batch process changes data via one clustered application instance, all of the other application instances must be notified and update their caches accordingly. This cache synchronization mechanism is implemented in the JDO (Java Data Objects) layer of the application and relies on the same implementation regardless of which application server software is used. In other words, the cluster implementation used by STEP does not use any of the application-server-specific clustering facilities provided by WebSphere.
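
The idea of per-instance caches that are kept consistent on writes can be illustrated with a small model. This sketch is an assumption-laden simplification, not the STEP implementation: it uses invalidation (peers drop the stale entry and reload it on the next read), whereas the actual JDO-layer mechanism is not described in this topic. The class names Database and AppInstance are hypothetical.

```python
class Database:
    """Stand-in for the shared database: a plain key/value store."""

    def __init__(self):
        self.rows = {}


class AppInstance:
    """One application instance with its own in-memory cache (illustrative only)."""

    def __init__(self, name, db, cluster):
        self.name = name
        self.db = db
        self.cache = {}
        self.db_reads = 0       # counts cache misses that hit the database
        self.cluster = cluster  # shared list of all instances in the cluster
        cluster.append(self)

    def read(self, key):
        if key not in self.cache:
            self.db_reads += 1
            self.cache[key] = self.db.rows.get(key)
        return self.cache[key]

    def write(self, key, value):
        self.db.rows[key] = value
        self.cache[key] = value
        # Tell every other instance that its cached copy is now stale.
        for peer in self.cluster:
            if peer is not self:
                peer.invalidate(key)

    def invalidate(self, key):
        self.cache.pop(key, None)
```

In this model, repeated reads on one instance are served from its cache, and a write through any instance forces the others to fetch the new value from the database on their next read.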

Load Balancing

As shown in the architecture diagram in the Architecture Layout topic, the web client is load-balanced through a hardware load balancer. The load balancer must support session affinity so that all requests in a session go to the same server until the session times out. This is important because session state is not replicated among servers, so a user directed to a different server is asked to log on again.
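
Session affinity can be modeled in a few lines. The sketch below is illustrative only and does not represent any particular load balancer product: the class name StickyBalancer, the round-robin choice for new sessions, and the timeout value are all assumptions made for this example.

```python
import itertools


class StickyBalancer:
    """Routes each session to one fixed server until the session times out."""

    def __init__(self, servers, session_timeout_s):
        self._next_server = itertools.cycle(servers)  # round-robin for new sessions
        self._timeout = session_timeout_s
        self._sessions = {}  # session id -> (server, time of last request)

    def route(self, session_id, now):
        entry = self._sessions.get(session_id)
        if entry is not None and now - entry[1] <= self._timeout:
            server = entry[0]  # affinity: reuse the server that holds the session state
        else:
            server = next(self._next_server)  # new or expired session
        self._sessions[session_id] = (server, now)
        return server
```

Without the affinity branch, every request could land on a different server, and because session state is not replicated, the user would have to log on again each time.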

In contrast, the workbench client has built-in load balancing capabilities based on the CPU load on each server. Once connected to a server, the client keeps using that server unless the client GUI has been idle for 15 minutes. The standard configuration is suitable for most setups. The list of server names, as known by the clients, must be provided in the STEP configuration file.
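
The client-side behavior described above can be sketched as follows. This is a simplified model under stated assumptions, not the workbench implementation: it assumes the client can obtain a CPU load figure per server, picks the least-loaded server, and selects again only after 15 minutes of idle time. The names pick_server and WorkbenchClient are hypothetical.

```python
IDLE_LIMIT_S = 15 * 60  # idle period after which the client may switch servers


def pick_server(cpu_load):
    """Return the server with the lowest CPU load (ties broken by name)."""
    return min(sorted(cpu_load), key=cpu_load.get)


class WorkbenchClient:
    def __init__(self):
        self.server = None
        self.last_activity = None

    def server_for_request(self, cpu_load, now):
        """Return the server to use for a request made at time `now` (seconds)."""
        idle_too_long = (
            self.last_activity is not None
            and now - self.last_activity >= IDLE_LIMIT_S
        )
        if self.server is None or idle_too_long:
            self.server = pick_server(cpu_load)
        self.last_activity = now
        return self.server
```

Staying on the same server while the user is active keeps requests on the instance whose cache is already warm. The load is only re-evaluated after an idle period, when that benefit is smaller.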

The workbench client has no session state on the server, so moving from one server to another is not a problem, except for performance reasons: the cache on the server the user last accessed is most likely to contain the data the user will request next. The system ensures that any writes to the database are viewed consistently across all application servers in the cluster.

Scalability

STEP is designed to support both large initial deployments and the growth of smaller systems through horizontal scalability. A cornerstone of the design is to provide a cost-effective solution that supports both. Regardless of whether the number of users, the amount of data, or both grow significantly, STEP is not likely to be the system that limits activities.

Extensive, resource-demanding tests have been run, and continue to be run, to verify that the scalability of STEP meets these goals.