Data Preparation

Data preparation is the process of exploring, combining, cleansing, deduplicating, and transforming data before it is used or consumed by the business.

Stibo Systems Data Preparation provides the ability to preprocess data, in isolation, before it becomes master data, in support of several use cases, including:

  • Indirect customer processing – Resolve indirect customer master data entities from transactional data provided by direct customers (distributors), enabling Consumer Packaged Goods (CPG) customers and manufacturers to gain insight into their customers' customers.
  • Prospect list processing optimization – Allow marketers to qualify and prepare prospect lists, thus reducing marketing campaign costs and speeding time to market. For more information, refer to the List Processing topic.
  • Mergers & Acquisitions (M&A) support – Manage the deluge of data inherent in M&A activity.
  • Optimal master data ingestion – Prepare master data prior to onboarding into STEP.

Data Preparation provides the following capabilities and benefits:

  • Explore and examine – Process data to derive business insight and assess data quality by understanding what problems exist and what corrective actions need to be taken. Data profiling provides useful insights, such as which values are frequent, missing, or rare, along with metric scores, patterns in the data, and other points of interest. These data points help users better understand the data and what needs to be done to prepare it for ingestion.
  • Clean and standardize – Correct data errors and validate data against authoritative sources (for example, through address verification), ensuring data is accurate, up to date, and fit for purpose.
  • Eradicate duplicates – Remove duplicates within a single data set and across multiple data sets, and compare data against master data.
  • Resolve master data entities – Reconcile and augment master data entities.
  • Transform and enhance – Filter, transform, and enrich lists.
  • Business user ready – Enable business users to leverage STEP data quality capabilities without the involvement of IT.
  • High performance – Uses heavy parallel processing; processing is lightweight to delete and simple to retry.
  • Optimal architecture – Process data in isolation from master data. Because data is stored on the app server file system (not in memory), the architecture helps drive performance.
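
The profiling and duplicate-removal capabilities listed above can be illustrated outside of STEP with a minimal Python sketch. It does not use any Stibo Systems API and is not how Data Preparation is implemented. The function names (profile_column, normalize, deduplicate), the rare_threshold parameter, and the sample rows are all hypothetical and exist only to show the general idea: profiling reports frequent, missing, and rare values, and deduplication compares records on a standardized key.

```python
from collections import Counter


def profile_column(values, rare_threshold=1):
    """Summarize one column: missing count, most frequent values, rare values.

    rare_threshold is a hypothetical cutoff: values seen this many times
    or fewer are reported as rare.
    """
    present = [v.strip() for v in values if v is not None and v.strip()]
    counts = Counter(present)
    return {
        "total": len(values),
        "missing": len(values) - len(present),
        "most_common": counts.most_common(3),
        "rare": sorted(v for v, n in counts.items() if n <= rare_threshold),
    }


def normalize(record):
    """Build a comparison key: lowercase each field and collapse whitespace."""
    return tuple(" ".join((field or "").lower().split()) for field in record)


def deduplicate(records):
    """Keep the first record seen for each normalized key."""
    seen = set()
    unique = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical sample list: (company name, city)
rows = [
    ("Acme Corp", "Denver"),
    ("acme  corp", "denver"),
    ("Globex", None),
    ("Initech", "Austin"),
    ("Initech", "Austin"),
]

profile = profile_column([city for _, city in rows])
unique_rows = deduplicate(rows)

print(profile)
print(unique_rows)
```

In this sketch the profile flags one missing city, and standardizing case and whitespace before comparison lets "Acme Corp" and "acme  corp" be treated as the same record. Real matching against master data would typically need fuzzier comparison than an exact key.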
