A look at a specific example of how a POS (point-of-sale) system created data quality problems, adding complexity to the company’s reporting and operations.

Whenever we work with a new client, either with our software product, The iBLeague™, or with our custom application development services, one of our first tasks is to profile their data.  The data is always the part of the project that carries the most risk and the most unknowns, and invariably we find more than a few data quality problems.

We recently activated a multi-location company on The iBLeague™ that provides both retail services and retail products.  We started by dumping the entire list of products and services from their POS.  This quickly revealed several data quality issues we would need to deal with:

  • A separate instance of the POS system exists per store, requiring every data extract and every other activity to be replicated for each store.
  • Unique product and service codes are enforced only within each POS instance, meaning duplicate product and service keys exist when combining data from multiple stores.
  • The POS could only accommodate a single price for each service code, requiring our client to create a separate service within the POS for each employee who performs that same service.  This client currently has more than 250 employees, and every time a new employee starts, another complete set of distinct service codes needs to be created.
  • The first two digits of the product and service code represent its item class.  Codes are often difficult to change in POS systems, meaning the “item class” isn’t always what it was intended to be.
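
The second and fourth issues above can be illustrated with a minimal Python sketch.  It assumes each store's extract is available as rows of (store, code, description); the store IDs, codes, and descriptions are invented for illustration and are not the client's data.  It finds codes that collide across stores, builds a composite key that stays unique after the extracts are combined, and reads the item class from the first two digits of the code.

```python
from collections import defaultdict

# Hypothetical combined extract: (store_id, code, description).
# All values here are invented for illustration.
rows = [
    ("store_a", "100123", "Haircut - Pat"),
    ("store_b", "100123", "Shampoo 12oz"),
    ("store_b", "200456", "Color - Sam"),
]


def item_class(code):
    """Return the item class, taken from the first two digits of the code."""
    return code[:2]


def find_colliding_codes(rows):
    """Return {code: [stores]} for codes that appear in more than one store."""
    stores_by_code = defaultdict(set)
    for store_id, code, _description in rows:
        stores_by_code[code].add(store_id)
    return {
        code: sorted(stores)
        for code, stores in stores_by_code.items()
        if len(stores) > 1
    }


def composite_key(store_id, code):
    """A code is unique only within a store, so key on (store, code)."""
    return (store_id, code)


collisions = find_colliding_codes(rows)
keys = {composite_key(store_id, code) for store_id, code, _ in rows}
```

In this sample, code 100123 means something different in each store, so the raw code alone cannot serve as a key once the data is combined, while the (store, code) pair remains unique.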

We see these kinds of data quality problems over and over, and one set of issues often creates a whole new level of data quality challenges.  For example, the constraints of the POS system, and the workarounds this client needs to cope with them, limit the reports available from the POS, forcing a lot of manual reporting in Excel every month.  This process is time-consuming and error-prone, which delays getting critical information to decision-makers.  What started as a constraint in the POS system became a data quality problem that creates complexity in the business every day.

We started with more than 15,300 service codes pulled from the multiple instances of a POS system.  Our “data scientists,” who love poor data quality, scrubbed these codes down to just 17 item codes with 1,400 service codes (still a high number due to differences in spelling we didn’t scrub).  We then moved the employees into their own employee hierarchy.  Now every transaction is extracted from the POS system and loaded into The iBLeague™, at which point it is associated with a distinct employee and a clean, nonreplicated service code.  This provides several advantages:

  1. Metric calculations are greatly simplified, allowing the client to include or exclude specific services very quickly.
  2. Reports are easier to create, filter, and run (e.g., the ability to report on a single service, comparing one employee to another).
  3. Scorecards can quickly be replicated across the organization for every location, department, and employee.
  4. Data is easier to understand, providing clarity into the business.
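
As a rough sketch of why this separation simplifies reporting, the Python example below assumes a lookup table, built during scrubbing, that maps each (store, raw POS code) pair to a clean service code and an employee.  The table name `CODE_MAP`, the codes, the employee names, and the amounts are all hypothetical; this is not how The iBLeague™ is implemented.

```python
from collections import defaultdict

# Hypothetical lookup: (store, raw POS code) -> (clean service, employee).
CODE_MAP = {
    ("store_a", "100123"): ("HAIRCUT", "pat"),
    ("store_a", "100124"): ("HAIRCUT", "sam"),
    ("store_b", "100123"): ("COLOR", "lee"),
}

# Hypothetical transactions as extracted from the POS: (store, raw code, amount).
transactions = [
    ("store_a", "100123", 40.0),
    ("store_a", "100124", 45.0),
    ("store_a", "100123", 40.0),
    ("store_b", "100123", 90.0),
]


def revenue_by_employee(transactions, service):
    """Total revenue per employee for one clean service code."""
    totals = defaultdict(float)
    for store, raw_code, amount in transactions:
        clean_service, employee = CODE_MAP[(store, raw_code)]
        if clean_service == service:
            totals[employee] += amount
    return dict(totals)


haircut_revenue = revenue_by_employee(transactions, "HAIRCUT")
```

Because the employee is no longer baked into the service code, comparing employees on a single service becomes one filter on the clean code rather than a hand-maintained list of per-employee codes.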

Identifying the state of data quality is a critical first step in every project, system, and set of data. Eliminating data quality issues is what creates value for others who are blinded or paralyzed by them.

Question: How have data quality issues created business complexity for your company?  Please leave a comment below.