
Improving Data Quality: Selecting the Right Software - Tuesday, August 10, 2004



When you turn on the car ignition and the engine belches out black exhaust smoke, you know you have dirt in the engine. When you turn on the faucet and see rusty water gushing out, you know the pipes are rusty.
When you see your customer information spread out 14 columns wide on a crisp computer printout, do you know if the data is accurate, complete, consistent, or if "dirty" data will obfuscate decision making?

As companies seek information-based or knowledge-based decision making, they need to acquire, invest in and manage data quality on an ongoing basis.

Data quality is something a company typically "backs into". It is not aware of data quality issues at the start of its transformation from an ad hoc information management style, where the data variables are few, to an automated, complex, information-based management style that relies on a number of information variables linked across several internal systems and external lists.

Some confuse "data availability" with "quality of data" and assume that because they are collecting data at various operational points of the business, this data is valuable. For example, they assume that if they collect data on who is buying what from the company, where, and how much the customer has spent on the product, the resulting data can be leveraged to understand the customer better and to act for higher profits.

As a company succeeds, expands its product lines, and attracts customers in larger numbers from a greater variety of distribution channels, customer data integration and quality become major issues. Suddenly, the company discovers that the next product or service, designed using this data, fails to draw raves from the same customers, and the company is at a loss to understand what went wrong.

Data has become unreliable and unpredictable. Upon examination, the data on any customer no longer provides a full 360-degree view, as some of the new product systems have simply failed to link their information reliably to the flagship product.

Success has introduced additional data complications, as new categories of customers have been attracted, and products once used by mid- to large-size businesses are now being used by small proprietary businesses and by individuals in home offices. Data quality enhancement initiatives become paramount for the company’s success as it seeks to make efficient, knowledge-based decisions that reduce costs and maximise profits.

Data quality is at risk as the diversity of sources and the complexity of data grow. Data quality can be said to indicate a value we can assign to the data for its consistency and completeness, i.e., how accurately the data has been identified, classified and categorised, and how dependable the data is in helping decision-makers and in predicting the results of knowledge-based actions such as loyalty marketing or fraud management. Lack of data quality will result in bad decisions, higher cost, poor service, and missed revenue opportunities.
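As a rough illustration of assigning such a value, the sketch below scores a handful of customer records for completeness (the share of fields that are filled in) and consistency (the share of filled values that belong to an agreed set of categories). The records, the field names, the allowed segment codes and the two scoring functions are all hypothetical and chosen only for this example; they are one simple way to put a number on the idea, not an established standard.

```python
from typing import Dict, List, Set

Record = Dict[str, str]

# Hypothetical customer records; every name and value is made up for illustration.
RECORDS: List[Record] = [
    {"name": "A. Sen", "city": "Kolkata", "segment": "small_business"},
    {"name": "R. Gupta", "city": "", "segment": "SMB"},
    {"name": "", "city": "Austin", "segment": "home_office"},
    {"name": "M. Diaz", "city": "New York", "segment": "mid_large"},
]
FIELDS: List[str] = ["name", "city", "segment"]

# Assumed list of agreed category codes for the "segment" field.
ALLOWED_SEGMENTS: Set[str] = {"small_business", "home_office", "mid_large"}


def completeness(records: List[Record], fields: List[str]) -> float:
    """Fraction of field values that are present and non-blank."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for r in records for f in fields if r.get(f, "").strip())
    return filled / total


def consistency(records: List[Record], field: str, allowed: Set[str]) -> float:
    """Fraction of non-blank values in `field` that use an agreed code."""
    values = [r.get(field, "").strip() for r in records]
    values = [v for v in values if v]
    if not values:
        return 0.0
    return sum(1 for v in values if v in allowed) / len(values)


completeness_score = completeness(RECORDS, FIELDS)
consistency_score = consistency(RECORDS, "segment", ALLOWED_SEGMENTS)
print(f"completeness: {completeness_score:.0%}, consistency: {consistency_score:.0%}")
```

In this made-up sample, two of the twelve field values are blank and one segment code ("SMB") falls outside the agreed list, so neither score reaches 100 per cent.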

Data Quality For Pvt Sector & Govt
When there is only one customer and one product, it is easy to say we are not worried about data quality. The behaviour of a customer is well known by all within a company and the bond between customer and business is strong.

As the number of customers grows and the number of products grows, the complexity of information increases exponentially and cannot be handled efficiently without automation and computerised systems. Automation here is not about cost reduction. It is about reliable, high-quality, predictable operations for a complex set of business information.

To quote Ted Friedman, principal analyst at Gartner, "Many organisations worry about the plumbing (i.e. the tools), but tend not to think about the quality."

The costs of such negligence can be staggering. It has been estimated by The Data Warehousing Institute (TDWI) that poor data quality costs US businesses more than $600 billion a year.

Indian companies that seek to make the most of such lessons, and to position themselves to lead in managing world-class operations at home or in executing profitable business process outsourcing (BPO) operations for customers in the USA or the UK, will need to understand information-based business processes and the gaps in their data and in its quality. They will then need to move aggressively up the curve: becoming adept at collecting better information; cleaning, standardising, classifying and collating it; and finally using it to make smart business decisions.

As countries become more open to an inter-dependent world and competition heats up in a global marketplace, not just profit-seeking companies but governments and societies at large should be concerned with data quality. Data quality’s role in good governance and society cannot be ignored, least of all in a world where citizens are increasingly concerned about public safety, anti-terrorism watchlists at ports and borders, and anti-money laundering initiatives, and where the ability to match and categorise name, address and other identification data with 100 per cent accuracy is of paramount importance.

Data Quality & E-governance
For smarter governance, imagine a state electricity board that knows every one of its customers, how much energy each one uses, and at what time of day or season.

With such data, future demands can be predicted, people who steal energy can be apprehended and the business environment can be made more dependable. Areas like taxation and customs or excise duty collection can be made fairer with better data collection and analysis, as tax defaulters are identified and payments are better enforced.

Data quality is not a one-off exercise, yet many organisations around the world treat it as one.

They spend a lot of money to clean up their data but do not establish an ongoing process to keep the data continually clean and organised. So what starts as a great success soon deteriorates.

Data Decay & Cost Of Neglecting Data Quality
Data decay is a term used to denote how quickly data becomes unusable for an organisation.

For example, if a customer moves and his contact and other details are not updated, he cannot be contacted for future products, payment of bills, etc.

The rate of data decay can be surprisingly rapid. In the US, in business-to-business marketing, data decays at a rate of 2.5 to 3.5 per cent a month. Within a year, one-third of the information in a database can easily become outdated.
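The step from a monthly rate to the yearly figure can be checked with a short calculation. The sketch below assumes, purely for illustration, that the same percentage of the still-valid records goes stale each month (compound decay); the function name is hypothetical, and the only inputs taken from the article are the 2.5 and 3.5 per cent monthly rates.

```python
def fraction_outdated(monthly_decay_rate: float, months: int = 12) -> float:
    """Share of records expected to be stale after `months`.

    Assumes a constant monthly rate applied to the records that are
    still valid at the start of each month (compound decay).
    """
    still_valid = (1.0 - monthly_decay_rate) ** months
    return 1.0 - still_valid


low = fraction_outdated(0.025)   # 2.5 per cent a month
high = fraction_outdated(0.035)  # 3.5 per cent a month
print(f"Outdated after a year: {low:.1%} to {high:.1%}")
```

Under this assumption, roughly 26 to 35 per cent of the records are outdated after twelve months, which is consistent with the one-third figure.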

Information on upwardly mobile professionals and consumer segments similarly decays at a rapid rate, as people change jobs.

It is well known that retaining an existing customer can be as much as six times more profitable than acquiring a new one.

In many businesses, the lack of a loyalty marketing programme that gives the company intimate knowledge of a customer’s behaviour results in attrition and customer churn.

Failing to address data quality issues not only hurts revenue initiatives but also results in higher costs, fraud, and poorer customer service.

Investments in CRM tools and business intelligence do not have the desired payback as the data continues to be "dirty".

Call centres are still unable to link the entire household and family information when any member of the household calls in. Risk managers still require an army of analysts, with unpredictable end results, because their fraud management systems throw up too many “false positives” when the data quality of both the watchlist database and the application data cannot be enhanced.

In short, IT and CRM investments do not see their payback, and business process re-engineering fails when data quality continues to plague business and public organisations, often with deadly silence and managerial inattentiveness.

(The author, Sanjib Mallik, is chief architect of CIANT Corporation with operations in Texas, New York and Kolkata and customers in USA, Mexico, Brazil, and India.)