Data architecture definition 

Data architecture describes the structure of an organization’s logical and physical data assets and data management resources, according to The Open Group Architecture Framework (TOGAF). It is an offshoot of enterprise architecture that comprises the models, policies, rules, and standards that govern the collection, storage, arrangement, integration, and use of data in organizations. An organization’s data architecture is the purview of data architects.

Data architecture goals

The goal of data architecture is to translate business needs into data and system requirements and to manage data and its flow through the enterprise.

Data architecture principles

According to Joshua Klahr, vice president of product management, core products, at Splunk, and formerly vice president of product management at AtScale, six principles form the foundation of modern data architecture:

  1. Data is a shared asset. A modern data architecture needs to eliminate departmental data silos and give all stakeholders a complete view of the company.
  2. Users require adequate access to data. Beyond breaking down silos, modern data architectures need to provide interfaces that make it easy for users to consume data using tools fit for their jobs.
  3. Security is essential. Modern data architectures must be designed for security, and they must support data policies and access controls directly on the raw data.
  4. Common vocabularies ensure common understanding. Shared data assets, such as product catalogs, fiscal calendar dimensions, and KPI definitions, require a common vocabulary to help avoid disputes during analysis.
  5. Data should be curated. Invest in core functions that perform data curation (modeling important relationships, cleansing raw data, and curating key dimensions and measures).
  6. Data flows should be optimized for agility. Reduce the number of times data must be moved to reduce cost, increase data freshness, and optimize enterprise agility.
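
To make the fourth principle concrete, here is a minimal Python sketch of a shared KPI catalog, so that every team computes a metric from the same definition rather than from its own spreadsheet formula. The names used here (KpiDefinition, KPI_CATALOG, evaluate_kpi, and the gross margin formula itself) are hypothetical and chosen only for illustration; they do not come from Klahr or from any particular product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class KpiDefinition:
    """One agreed-upon definition of a KPI (hypothetical structure)."""
    name: str
    description: str
    compute: Callable[[List[dict]], float]


def _gross_margin(orders: List[dict]) -> float:
    # Assumed formula for illustration: (revenue - cost) / revenue.
    revenue = sum(o["revenue"] for o in orders)
    cost = sum(o["cost"] for o in orders)
    return (revenue - cost) / revenue if revenue else 0.0


# Single shared catalog: analysts look definitions up here
# instead of re-deriving them per department.
KPI_CATALOG: Dict[str, KpiDefinition] = {
    "gross_margin": KpiDefinition(
        name="gross_margin",
        description="(revenue - cost) / revenue over the given orders",
        compute=_gross_margin,
    ),
}


def evaluate_kpi(name: str, rows: List[dict]) -> float:
    """Compute a KPI using the shared definition; raises KeyError if unknown."""
    return KPI_CATALOG[name].compute(rows)


if __name__ == "__main__":
    orders = [
        {"revenue": 100.0, "cost": 60.0},
        {"revenue": 50.0, "cost": 30.0},
    ]
    print(evaluate_kpi("gross_margin", orders))
```

Because finance and sales both call evaluate_kpi("gross_margin", ...), a dispute over whose number is right becomes a discussion about one definition in one place.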

Data architecture components

Dataversity says data architecture can be synthesized into three overall components:

  • Data architecture outcomes. These are the models, definitions, and data flows often referred to as data architecture artifacts.
  • Data architecture activities. These are the activities that form, deploy, and fulfill data architecture intentions.
  • Data architecture behaviors. These are the collaborations, mindsets, and skills of the various roles that affect an enterprise’s data architecture.

Data architecture vs. data modeling 

According to the Data Management Body of Knowledge (DMBOK 2), data architecture defines the blueprint for managing data assets by aligning with organizational strategy to establish strategic data requirements and designs to meet those requirements. On the other hand, DMBOK 2 defines data modeling as “the process of discovering, analyzing, representing, and communicating data requirements in a precise form called the data model.”

While both data architecture and data modeling seek to bridge the gap between business goals and technology, data architecture is about the macro view that seeks to understand and support the relationships between an organization’s functions, technology, and data types. Data modeling takes a more focused view of specific systems or business cases.
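
As a rough illustration of the narrower scope of data modeling, the following Python sketch expresses a tiny data model for a single business case: customers and their orders. The entities, field names, and the find_orphan_orders helper are assumptions made up for this example, not anything defined by DMBOK 2.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Customer:
    customer_id: int  # primary key
    name: str


@dataclass(frozen=True)
class Order:
    order_id: int     # primary key
    customer_id: int  # foreign key referencing Customer.customer_id
    order_date: date
    total: float


def find_orphan_orders(customers: List[Customer], orders: List[Order]) -> List[Order]:
    """Return orders whose customer_id matches no known customer."""
    known_ids = {c.customer_id for c in customers}
    return [o for o in orders if o.customer_id not in known_ids]


if __name__ == "__main__":
    customers = [Customer(1, "Acme")]
    orders = [
        Order(10, 1, date(2020, 1, 1), 5.0),
        Order(11, 2, date(2020, 1, 2), 7.0),  # customer 2 does not exist
    ]
    for order in find_orphan_orders(customers, orders):
        print(order.order_id)
```

The model states precisely what a customer and an order are and how they relate. Which systems hold these records, how they move between those systems, and who may read them are questions for the data architecture.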

Data architecture frameworks 

There are several enterprise architecture frameworks that commonly serve as the foundation for building an organization’s data architecture framework.

Characteristics of modern data architecture 

Modern data architectures must be designed to take advantage of emerging technologies such as artificial intelligence (AI), automation, internet of things (IoT), and blockchain. Dan Sutherland, distinguished engineer and CTO, data platforms, at IBM, says modern data architectures should hold the following characteristics in common:

Copyright © 2020 IDG Communications, Inc.
