Frequently Asked Questions
Data management is a collection of concepts and practices used to manage the data that organizations produce. It has grown more complex as the notion of data as a valuable resource to be refined has grown in popularity. In many cases, running enterprise business operations without data analytics is simply not a feasible way to remain competitive.
A general view of data management shows two layers: a technical layer, where computer processes collect, transform, and analyze raw data, many of them automated; and a non-technical layer, where non-technical personnel increasingly engage with data to draw insights that assist in their everyday tasks.
Data management as a discipline has many subcategories, many of which require their own specialization.
These subcategories address the forces and factors that influence an organization’s data management strategy:
- Data Governance — Data governance addresses the political nature of data.
- Data Architecture — Data architecture addresses which data is collected and stored.
- Data Modeling and Design — Data modeling and design addresses the technical structuring of a data model and designing databases that fit data to that model.
- Database and Storage Management — Database and storage management addresses the active management, maintenance, and assurance of database infrastructure.
- Data Security — Data security addresses guarding data assets against bad actors who can steal, change, and destroy data.
- Reference and Master Data — Reference and master data addresses the consistency and continuity of disparate data sources across the organization.
- Data Integration and Interoperability — Data integration and interoperability addresses the technical movement and transformation of data from multiple sources into a unified view.
- Document and Content Management — Document and content management addresses the cataloging and managing of information held in various business documents.
- Data Warehousing and Business Intelligence — Data warehousing and business intelligence addresses data mining, data analytics, and data warehousing.
- Metadata — Metadata addresses the discovery and management of data about other larger data sets.
- Data Quality — Data quality addresses the distillation of raw data into usable forms.
Data management is the practice of collecting, storing, analyzing, and using data securely, efficiently, and cost-effectively. From a consumer protection standpoint, data governance refers to the efforts of businesses that collect and use data to remain compliant with regulations that protect the personally identifiable information of consumers. From a political point of view, data governance becomes both a national and international point of concern in protecting citizen information.
Data governance is a data management concern that focuses on preparing and keeping data that is regulated by various laws aimed at protecting consumer information. The GDPR (EU General Data Protection Regulation) and CCPA (California Consumer Privacy Act) are two sets of laws that regulate organizations that collect and use data.
The GDPR is a European Union regulation, but U.S. companies must also adhere to it when collecting and using data from EU residents.
The GDPR, which came into effect in 2018, regulates data controllers and processors and protects the personal data of EU residents, essentially any information that can be used to identify a person. Although the legislation protects only EU residents, it applies to foreign companies that interact with them, and it has provided a roadmap for many other countries and regions to protect personal information.
The CCPA is similar to the GDPR but is more specific about what data counts as personal information, including, for example, information at a household or device level. A second significant distinction is who is regulated. The CCPA is narrower in who falls under its requirements. For example, a for-profit entity must meet one of three requirements to be regulated.
Data management has many facets, as defined by its subcategories, but the overarching benefit is increased control over and visibility into an organization’s data assets. Without any form of data management, companies flounder in operations and growth, fall behind competitors, and eventually drop out of the market. Modern data management systems, though, come ready-made for many organizational cases and provide several benefits that stem from improved data control.
- Improved Data Quality and Accuracy — Impurity is a characteristic of most natural resources, one that data emulates. Using effective data management helps to clean and distill raw data and improve quality, ensuring that analysis can begin with the most accurate and usable data.
- Reduced Time and Cost — Companies burn resources managing their data, and burn more time and money recovering from blind spots that quality data insights could have revealed. Modern data management software reduces the time and cost of collecting, managing, protecting, and analyzing data. The insights drawn from data analysis bolster other processes and systems, improving performance and efficiency and making better use of the time and money spent.
- Eliminated Data Redundancy — Redundant data introduces risks and weaknesses into an organization’s data system, typically in the form of data inconsistencies. Data inconsistency arises when data operations, such as inserting, deleting, or modifying entries, produce anomalies that degrade data integrity.
- Guaranteed Data Compliance — Regulation of personally identifiable information has made data compliance a top priority for many businesses. Failure to comply can be met with penalties as well as reputational damage to the brand. Data management equipped with data governance functionality can help track and guarantee data compliance and protect the business.
- Informed Business Decisions — A key design principle in data management software is the unification of disparate data and information for analytics. By bringing the management of data and data sources under central monitoring and control, organizations gain a holistic, up-to-date view of their data that is usable in real-time operations.
- Single Source of Truth — All the features of data management come together to provide a single source of truth (SSOT) for the company. SSOT architecture is structured so that every data point is managed and controlled from a central point, where data can be normalized, or made canonical. Changes made to this canonical version of the data then ripple throughout the entire data system, providing the latest version to every connected business system.
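The redundancy and single-source-of-truth points above can be illustrated with a minimal Python sketch. The record layouts, field names, and the `invoice_email` helper are hypothetical and chosen only for illustration; the sketch is not tied to any particular data management product.

```python
# Hypothetical data: the same customer email copied into two systems.
crm_rows = [{"customer_id": 1, "email": "ana@example.com"}]
billing_rows = [{"invoice_id": 100, "customer_id": 1, "email": "ana@example.com"}]

# Redundant design: updating one copy leaves the other stale (an update anomaly).
crm_rows[0]["email"] = "ana@new.example.com"
stale = billing_rows[0]["email"] != crm_rows[0]["email"]

# Single-source design: other systems store only a key and look up
# the canonical record whenever they need the value.
customers = {1: {"email": "ana@example.com"}}        # canonical records
invoices = [{"invoice_id": 100, "customer_id": 1}]   # holds a reference only


def invoice_email(invoice: dict) -> str:
    """Resolve the email through the canonical customer record."""
    return customers[invoice["customer_id"]]["email"]


# One change to the canonical record is visible to every consumer.
customers[1]["email"] = "ana@new.example.com"
current = invoice_email(invoices[0])

print(stale)    # True: the redundant billing copy was not updated
print(current)  # ana@new.example.com
```

In the redundant design, every copy has to be found and updated separately. In the single-source design, one update to the canonical record is enough.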
Data management systems are inherently domain-specific. In many cases, an organization will maintain multiple data management systems, each serving a specific domain of its business. Having multiple systems leads to data siloing, which can reduce data transparency because data is locked up in these silos. However, data architects can also exploit siloing to better protect certain sensitive data.
The following array of data management examples illustrates the need for master data management practices and system integrations.
- Product Data Management (PDM) — PDM software is designed to manage the design and engineering process data for developers through a unified dashboard. Product data information can include product specs, version control, change orders, bills of materials, vendors/suppliers, schematics, etc. PDM can and often does integrate with product information management (PIM) systems.
- Product Information Management (PIM) — PIM manages a data set complementary to what is typically tracked in product data management systems. Whereas PDM tracks data about the product, its development, and its manufacturing, PIM systems use select portions of that data in marketing operations, such as delivering product marketing materials to websites, advertising channels, marketplaces, social media, and partner platforms.
- Customer Relationship Management (CRM) — CRMs are data management systems designed to store and organize customer data including personal data, sales leads, sales conversions, revenue data, offers and subscriptions, renewals, etc. Sophisticated CRMs can also track relationships with clients, communications, historical information, and employ data analysis to study buying patterns and other large data sets that pertain to sales and marketing.
- Master Data Management (MDM) — Master data management addresses the growing number of data management systems found under one roof, and the ensuing data inconsistencies inherent in aligning different data sources. MDMs act as an umbrella management system that provides the tools and processes to unify disparate data and eliminate inconsistencies.
Cloud data management combines the advantages of cloud services with the power to manage data across cloud platforms. Cloud advantages include resource scaling, disaster recovery, anytime-anywhere access, backup and long-term storage, and cost controls. Cost controls are particularly beneficial for businesses of any size, from small to enterprise, because they can pay for resources only as needed. Typically, cloud providers are responsible for maintenance in the cloud, relieving businesses of that burden.
In other instances, organizations can integrate their own resources.
Multi-environment compatibility is another important capability offered by most cloud data management vendors. Data can be shared and integrated across private and public clouds, with access to on-premises storage as well.
Master data management (MDM) is a solution intended to bridge the gaps between multiple domain-specific data management applications within an organization. Today, businesses from small to enterprise can use tens to hundreds of these data applications, with little common ground between them for making meaningful connections easily. MDM platforms do the work of tying these applications together.
Essentially, MDM platforms do this by describing the core entities of a business as master records, which other data management applications can draw upon knowing they hold the most relevant and accurate data. While core entities are chosen to fit the business profile, some of the most commonly described are customers, prospects, suppliers, products, and locations. A master record of each core entity helps ensure accurate data throughout every system.
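As a rough illustration of how a master record might be assembled, the Python sketch below merges two hypothetical source records for the same customer. The field names and the merge rule (the most recently updated non-empty value wins) are assumptions made for this example; real MDM platforms offer their own configurable matching and merge rules.

```python
from datetime import date

# Hypothetical records for one customer, as held by two source systems.
crm_record = {
    "customer_id": "C-1", "name": "Ana Diaz",
    "email": "ana@old.example.com", "phone": None,
    "updated": date(2023, 1, 10),
}
billing_record = {
    "customer_id": "C-1", "name": "Ana Diaz",
    "email": "ana@new.example.com", "phone": "555-0100",
    "updated": date(2024, 3, 2),
}


def build_master_record(records: list[dict]) -> dict:
    """Merge source records field by field.

    Assumed rule: the most recently updated non-empty value wins.
    """
    master: dict = {}
    # Process oldest first so newer non-empty values overwrite older ones.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if value is not None:
                master[field] = value
    return master


master = build_master_record([crm_record, billing_record])
print(master["email"])  # ana@new.example.com
print(master["phone"])  # 555-0100
```

Other systems would then read the customer's details from `master` instead of keeping their own copies.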
Data management requires a thorough look at the data requirements of a specific domain. However, general best practices, like the ones below, can help circumvent potential challenges.
- Establish a data discovery layer — A data discovery layer enables the searchability of all data sets.
- Streamline data science tasks — Data transformation overhead should be streamlined as much as possible to ensure real-time data analysis.
- Automate and use AI — Automation and AI can help teams proactively monitor database queries and optimize indexes. Automation frees up team schedules and improves efficiency.
- Leverage data discovery to maintain compliance — Data discovery enables teams to maintain up-to-date compliance with multiple jurisdictions. As regulations increase globally, this auditing task will become more vital.
- Use converged databases — Converged databases allow native support of multiple disparate data types and platforms.
- Align your database platform with business needs — The main goal of analyzing data together is to see a larger picture of the business. Speed, accuracy, and reach need to match those of the business. Enterprises may need split-second analysis, while smaller businesses may not.
- Use a common query layer — As with converged databases, a common query layer enables data scientists and others to access data from any source without needing to know its location.
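The ideas of a data discovery layer and a common query layer can be sketched in a few lines of Python. Everything here is hypothetical: the `CATALOG` mapping, the loader functions, and the `query` helper stand in for whatever catalog and query tooling an organization actually uses, and the in-memory lists stand in for real databases, files, or APIs.

```python
from typing import Callable, Iterable


# Hypothetical sources; in practice these would be databases, files, or APIs.
def load_orders() -> Iterable[dict]:
    return [{"order_id": 1, "total": 40.0}, {"order_id": 2, "total": 125.0}]


def load_customers() -> Iterable[dict]:
    return [{"customer_id": 1, "region": "EU"}, {"customer_id": 2, "region": "US"}]


# The catalog doubles as a minimal discovery layer: it lists every known data set.
CATALOG: dict[str, Callable[[], Iterable[dict]]] = {
    "orders": load_orders,
    "customers": load_customers,
}


def query(dataset: str, where: Callable[[dict], bool] = lambda row: True) -> list[dict]:
    """Return matching rows from a named data set without exposing where it lives."""
    if dataset not in CATALOG:
        raise KeyError(f"unknown data set: {dataset!r}; known: {sorted(CATALOG)}")
    return [row for row in CATALOG[dataset]() if where(row)]


large_orders = query("orders", lambda row: row["total"] > 100)
eu_customers = query("customers", lambda row: row["region"] == "EU")
print(sorted(CATALOG))  # ['customers', 'orders']
print(large_orders)     # [{'order_id': 2, 'total': 125.0}]
```

Callers name a data set and a filter, and the layer decides how to reach the underlying source. Moving a source then means changing one catalog entry, not every query.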
Data management is the foundational step in effective data analysis. Organizations implement data management techniques when they view data as an asset to the business and an opportunity to find actionable insights. To this end, proper data management is important for several reasons:
- Visibility into organizational data
- Data analysis and business insights
- Reliability supported by established processes and policies
- Data security and protection
- Scalability as data accumulates
In retail marketing, this may mean combining mobile, online, and offline sources to build customer profiles and automate remarketing initiatives. In manufacturing, this may mean combining layers of IIoT technologies with ERP and PDM systems to automate whole factories. In most cases, data platforms solve a systemic enterprise problem: the multiplication of data systems within the company, which creates data silos that hinder innovation and cause inefficiencies.