Thursday, May 27, 2004

Semantic integration

Another article from David McGoveran[1] (gotta admit he writes some good stuff) notes that the primary types of semantic transformation are:



* Combining two or more fields having different data types
* Decomposing a field into two or more new fields
* Aggregating multiple values
* Disaggregating aggregate values
* Consolidation
* Synchronization
* Generalization
* Sub-typing

I include them here so I don't lose them.
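
To make the first four transformation types in the list concrete, here is a minimal Python sketch. The function names, field names, and sample data are all my own hypothetical choices for illustration; none of them come from McGoveran's article, and real integration code would need far more care (name parsing in particular is much messier than a split on a space).

```python
from collections import defaultdict
from datetime import date


def combine(year: str, month: int, day: int) -> date:
    """Combine fields of different data types (string year, integer month/day) into one field."""
    return date(int(year), month, day)


def decompose(full_name: str) -> dict:
    """Decompose one field into two new fields (naive split on the first space)."""
    first, _, last = full_name.partition(" ")
    return {"first_name": first, "last_name": last}


def aggregate(rows):
    """Aggregate multiple values: total amount per customer."""
    totals = defaultdict(int)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


def disaggregate(total: int, parts: int):
    """Disaggregate an aggregate value into near-equal integer shares that sum to the total."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]
```

Note that disaggregation needs an allocation rule (here, near-equal shares) because the original detail values cannot be recovered from the aggregate alone. That rule is exactly the kind of semantic information that would need to be recorded somewhere.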

He goes on to state something that fits in with my thoughts regarding TDQM (Total Data Quality Management). My thoughts extend outside of a single enterprise, though, and really drive towards a supporting infrastructure for TDQM rather than the process and responsibility questions that TDQM itself addresses.

Here are the elements of TDQM:


1. Fill the repository incrementally, never “upfront.” This is a pragmatic, not academic, effort.
2. Ensure that the repository supports a theory of semantic types. It should define data semantics by capturing constraints and not just data syntax, and relate them to existing types through dependencies.
3. Don’t accept application software unless data semantics has been defined in an importable data model.
4. Use versioning, partitioning, and type relationships to organize metadata. Never delete it.
5. Commit to driving application development and integration projects from the repository.
6. Use data integration tasks as opportunities to use, refine and validate the repository.
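
As a rough illustration of points 1, 2 and 4, here is a minimal Python sketch of a repository that is filled incrementally, defines each semantic type by constraints, relates it to an existing type through a dependency, and versions definitions without ever deleting them. The `SemanticType` and `Repository` classes and the example type names are hypothetical and of my own invention; this is a sketch under those assumptions, not a description of any real repository product or of McGoveran's design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Constraint = Callable[[object], bool]


@dataclass(frozen=True)
class SemanticType:
    name: str
    version: int
    base: Optional[str]                  # dependency on an existing type, if any
    constraints: Tuple[Constraint, ...]  # semantics captured as predicates, not just syntax


class Repository:
    """Append-only store of semantic types (hypothetical design)."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[SemanticType]] = {}

    def register(self, name, constraints, base=None) -> SemanticType:
        """Add a type, or a new version of one. Older versions are kept, never deleted."""
        if base is not None and base not in self._versions:
            raise KeyError(f"unknown base type: {base}")
        history = self._versions.setdefault(name, [])
        new_type = SemanticType(name, len(history) + 1, base, tuple(constraints))
        history.append(new_type)
        return new_type

    def latest(self, name) -> SemanticType:
        return self._versions[name][-1]

    def history(self, name) -> List[SemanticType]:
        return list(self._versions[name])

    def validate(self, name, value) -> bool:
        """Check a value against the latest version of a type and, first, its base type."""
        current = self.latest(name)
        if current.base is not None and not self.validate(current.base, value):
            return False
        return all(check(value) for check in current.constraints)


repo = Repository()
repo.register("Quantity", [lambda v: isinstance(v, int), lambda v: v >= 0])
repo.register("OrderQuantity", [lambda v: v <= 1000], base="Quantity")
repo.register("OrderQuantity", [lambda v: v <= 500], base="Quantity")  # version 2
```

With this setup, `repo.validate("OrderQuantity", 250)` passes, while `750` fails against the tightened version 2 limit. Version 1 is still available through `repo.history("OrderQuantity")`.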

These are all very good points and certainly make a lot of sense. #3 is a tough one for companies to enforce, and we probably need to think of compensating actions for those cases where an absolute rule is not an option.

I'd also take #5 a lot further: beyond driving application development and integration projects from the repository, a lot of the applications themselves can be made more valuable to their users by leveraging the repository. There is a lot of value in having this information available, and we need to find ways to put it to use.


[1] Data Integration, Part VIII, eAI Journal, January 2003








Kipp Jones - CTO
nuBridges, LLC - www.nubridges.com
eBusiness is Business
cell: 404.213.9293
work: 770.730.3722


