Data Integration Context
A data integration may be architected as part of a Solution or Capability Architecture development, or as part of an Enterprise Architecture development. The TOGAF® Standard (see References) defines an Architecture Development Method (ADM). Its Enterprise Continuum provides an overall context for architectures and solutions, and classifies assets that apply across the entire scope of the enterprise.
One of the key aspects of Enterprise Architecture implementation is data migration from the existing environment to the target environment. The target environment is generally rationalized and streamlined, so the migration usually entails a major data integration effort.
Most data integration architectures are in the organization-specific part of the Enterprise Continuum, where they guide and support organization-specific data integration solutions. Portals, integrated information environments (see An Information Architecture Vision: Moving from Data Rich to Information Smart, in References), information sharing environments, and emerging data fabric/data mesh concepts are common systems architecture concepts. They guide and support the use of vendor and open source data platforms, which are common systems solutions.
A data integration is typically preceded by planning, discovery, and extraction, and followed by verification before the integrated data “goes live”. From that point, the integrated data is subject to lifecycle management, which determines its storage, use, and eventual disposal. These activities are all carried out in accordance with information governance, as illustrated in Figure 1:
• Plan: includes representing information holdings and identifying shortfalls, such as a lack of requisite quality
• Discover & Extract: includes analyzing the shortfalls, finding sources and determining their information quality (often through metadata), assessing the privacy and legality of extraction, performing cost-benefit analysis, and obtaining the required information directly or indirectly (e.g., through data as a service)
• Transform & Integrate: includes transforming data into a standard enterprise format, assessing information loss and resultant quality, information sub-integration (e.g., multi-sensor), creating metadata for integrated data, and integration with existing information holdings
• Verify: includes establishing the legal/privacy/policy framework, determining whether the integration is legal, determining whether the integration is in line with enterprise values and brand, and determining the need to transform (e.g., anonymize) integrated data
• Use: includes operations, decision support, and analytics; for example, Business Intelligence (BI) and Machine Learning (ML)
<Figure 1: Data Integration Context>
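The activities listed above can be sketched as a chain of small functions. The following is a minimal illustration only, not a prescribed implementation: the Dataset container, the function names, the "id" join key, and the redaction rule are all hypothetical assumptions introduced for this example, and real governance checks would be far richer than a list of restricted fields.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical container: records plus descriptive metadata."""
    records: list
    metadata: dict = field(default_factory=dict)


def plan(required_fields: set, holdings: Dataset) -> set:
    """Plan: report required fields that the current holdings lack."""
    present = set()
    for record in holdings.records:
        present |= set(record.keys())
    return required_fields - present


def discover_and_extract(shortfalls: set, source: Dataset) -> Dataset:
    """Discover & Extract: take only the missing fields (plus the assumed 'id' key)."""
    wanted = shortfalls | {"id"}
    records = [{k: v for k, v in r.items() if k in wanted} for r in source.records]
    return Dataset(records, {"source": source.metadata.get("name", "unknown")})


def transform_and_integrate(holdings: Dataset, extracted: Dataset) -> Dataset:
    """Transform & Integrate: merge by 'id' and record lineage metadata."""
    by_id = {r["id"]: dict(r) for r in holdings.records}
    for r in extracted.records:
        by_id.setdefault(r["id"], {"id": r["id"]}).update(r)
    metadata = {**holdings.metadata, "integrated_from": extracted.metadata.get("source")}
    return Dataset(list(by_id.values()), metadata)


def verify(integrated: Dataset, restricted_fields: set) -> Dataset:
    """Verify: redact fields that (by assumption) policy does not allow to go live."""
    records = [
        {k: ("<redacted>" if k in restricted_fields else v) for k, v in r.items()}
        for r in integrated.records
    ]
    return Dataset(records, {**integrated.metadata, "verified": True})


# Example run with made-up data.
holdings = Dataset([{"id": 1, "name": "Acme"}], {"name": "crm"})
source = Dataset([{"id": 1, "email": "a@example.com", "phone": "555"}], {"name": "billing"})

shortfalls = plan({"id", "name", "email"}, holdings)
extracted = discover_and_extract(shortfalls, source)
integrated = transform_and_integrate(holdings, extracted)
live = verify(integrated, restricted_fields={"email"})
```

Keeping each activity as a separate function mirrors the separation shown in the figure: a governance rule can be attached to one step (here, Verify) without changing the others.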
In a complex case – for example, following a merger – there may be many data integrations carried out in parallel, and the integration may not be a “one-off” event, but a continuous process operating on data that is continually changing. This process may be partly or entirely automated. Establishing such processes can be a significant part of a Solution or Enterprise Architecture.
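One common way to automate a continuous integration of changing data is to apply only the changes that arrived since the last run. The sketch below assumes a hypothetical change feed in which every change carries an increasing "version" number, an "id", and either a "data" payload or a "deleted" flag; none of these names come from this document, and other approaches (for example, full periodic reloads) are equally valid.

```python
def incremental_integrate(target: dict, changes: list, watermark: int) -> int:
    """Apply changes newer than `watermark` to `target`; return the new watermark.

    `target` maps record id to record data and is updated in place.
    """
    new_watermark = watermark
    for change in sorted(changes, key=lambda c: c["version"]):
        if change["version"] <= watermark:
            continue  # already integrated in an earlier run
        if change.get("deleted"):
            target.pop(change["id"], None)  # source record was removed
        else:
            target[change["id"]] = change["data"]  # insert or overwrite
        new_watermark = change["version"]
    return new_watermark
```

Because changes at or below the stored watermark are skipped, the same feed can be replayed after a failure without applying a change twice.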
The Technology Architecture phase of a TOGAF architecture development includes the identification of appropriate technical standards. The standards described in this document are candidates for inclusion in the Technology Architecture of an architecture development that covers data integration.