1. Data are the raw material for information that is exchanged between one or more 'senders' and one or more 'receivers'.
2. Whether data really results in information is up to the 'receiver', not the 'sender'. The receiver interprets the data within his or her own context.
3. The exchange of data always takes place at a particular time and place.
4. Direct exchange is the natural form of conversation. Nowadays, however, we use technology to decouple the time and place of sending from the time and place of receiving. However close to 'real-time' this exchange takes place, we need information technology (IT) to support it.
This last statement requires a clear definition of data that will be interpreted in the same way by both sender and receiver. There is a separate area of expertise in the IT discipline that focuses on data, data definition and data ownership, and that rightfully spends many hours on this topic.
When we take a closer look at the sustainability of the various components in an architecture, we can observe the following:
- the assignment of persons to the various roles in a process is the most sensitive to change
- next comes the design and implementation of a process
- the data is the least sensitive to change, because it is directly linked to the result of the process
There are several techniques and tools available for 'data management'; I will not go deeper into this subject.
What I do think is very relevant is the notion of 'structured' versus 'unstructured' data. To put it simply: a piece of text has a structure based on the grammar of the language it is written in; for our objective (assuring the integrity of Information Provisioning), however, we define this as 'unstructured' data.
A similar thing goes for information in a spreadsheet. A user (= sender) of a spreadsheet can create structure in the data through his or her use of rows and columns, but spreadsheet software does nothing to safeguard the integrity of the data. Typos in cell values lead to unreliable results.
Only when we use a relational database management system (DBMS) does the software itself help maintain the integrity of the data. 'Data normalization' and 'Referential Integrity' play a key role in using a DBMS properly. However, this implies a long and sometimes cumbersome process of data definition.
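To make the difference with a spreadsheet concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (product, sale), the columns and the product code 'GAS' are assumptions made purely for illustration; they do not come from any real system. The point is only that a declared reference lets the DBMS reject a typo that a spreadsheet would silently accept.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE product (code TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE sale ("
        " id INTEGER PRIMARY KEY,"
        " product_code TEXT NOT NULL REFERENCES product(code),"
        " amount REAL NOT NULL CHECK (amount >= 0))"
    )
    conn.execute("INSERT INTO product VALUES ('GAS', 'Natural gas')")
    return conn


def try_insert_sale(conn: sqlite3.Connection, product_code: str, amount: float) -> bool:
    """Return True if the DBMS accepted the row, False if it rejected it."""
    try:
        conn.execute(
            "INSERT INTO sale (product_code, amount) VALUES (?, ?)",
            (product_code, amount),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_demo_db()
print(try_insert_sale(conn, "GAS", 100.0))  # valid reference to an existing product
print(try_insert_sale(conn, "GSA", 100.0))  # typo in the code: no such product exists
```

In a spreadsheet the row with 'GSA' would simply sit there and distort every total per product. Here the reference to the product table is part of the data definition, so the second insert is refused.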
In today's practice many software developers ignore such an approach. They usually work along the lines of 'as long as the program provides what the user has asked for, it's OK'. The result of such an approach is 'oops, it doesn't work, let's do it again'. And the burden of maintaining such software keeps increasing.
What is so special about your data?
The answer to that question depends highly on the strategic role you perceive your IT to have. I elaborated on this in the articles on 'Strategy' on this site. When you consider IT to be an 'enabler', a unique Information Model will be a competitive advantage by which you can make lots of money. When you perceive IT to be a 'facilitator', you may be fine adopting the information model your software provider delivered implicitly in the (ERP) package you bought. But there is a lot of room between these two perceptions, and I can assure you it pays off to spend some time on this subject.
An Information Modelling effort leads to a common understanding with the business on what is key to the success of certain processes and what can be simplified and standardized. Such an activity leads to an Information Model consisting of an 'Entity Relationship Diagram (ERD)' with the definition and description of 'Entities' and their 'Relationships'. In addition, it distinguishes between Information Ownership and Information Usage by all business processes involved.
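As a rough illustration of what such a model records, the sketch below captures entities, their relationships and their owners as plain Python data. The class names, the two entities and the owning processes are all hypothetical; a real Information Model comes out of the modelling effort with the business, not out of code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    name: str
    description: str
    owner: str  # owning business process (hypothetical example values below)


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    cardinality: str  # for example "1:N" or "N:M"


@dataclass
class InformationModel:
    entities: List[Entity]
    relationships: List[Relationship]

    def dangling_relationships(self) -> List[Relationship]:
        """Relationships that point at an entity that has not been defined."""
        names = {e.name for e in self.entities}
        return [
            r for r in self.relationships
            if r.source not in names or r.target not in names
        ]


# Illustrative content only: two entities and one relationship between them.
model = InformationModel(
    entities=[
        Entity("Product", "Something the company offers", owner="Product Management"),
        Entity("Market", "A group of customers that is served", owner="Sales"),
    ],
    relationships=[Relationship("Product", "Market", "N:M")],
)
print(model.dangling_relationships())
```

Even a simple check like dangling_relationships shows the benefit of writing the model down explicitly: every relationship has to refer to an entity that has been defined and described.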
Most organizations, and companies in particular, act on 'Product / Market Combinations'. So a generic Top Level Information Model will look something like this:
For an Energy Trading Company such an Information Model looks like:
To define Information Ownership, we used colors in this picture. A more accurate way to do this is with a 'CRUD matrix' like the one below.
Along the vertical axis you see the data objects, along the horizontal axis you'll find the processes, and in the cells you capture the roles of the processes regarding the data objects (Create, Read, Update, Delete):
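A CRUD matrix can also be kept as data and checked automatically. The following is a minimal sketch; the data objects (Customer, Contract, Invoice) and processes (Sales, Billing) are invented for illustration, and treating the process that creates an object as its owner is an assumption of this sketch, not a rule from the matrix itself.

```python
from typing import Dict, List

# Hypothetical matrix: data object -> process -> roles (subset of "CRUD").
CRUD_MATRIX: Dict[str, Dict[str, str]] = {
    "Customer": {"Sales": "CRUD", "Billing": "R"},
    "Contract": {"Sales": "CRU", "Billing": "R"},
    "Invoice": {"Sales": "R", "Billing": "CRUD"},
}


def owners(matrix: Dict[str, Dict[str, str]], data_object: str) -> List[str]:
    """Processes that create the data object (assumed here to be its owners)."""
    return sorted(p for p, roles in matrix[data_object].items() if "C" in roles)


def objects_without_single_owner(matrix: Dict[str, Dict[str, str]]) -> List[str]:
    """Data objects that have no creating process, or more than one."""
    return sorted(obj for obj in matrix if len(owners(matrix, obj)) != 1)


def render(matrix: Dict[str, Dict[str, str]]) -> str:
    """Text table: data objects down the side, processes across the top."""
    processes = sorted({p for roles in matrix.values() for p in roles})
    header = f"{'':<10}" + "".join(f"{p:<10}" for p in processes)
    rows = [
        f"{obj:<10}" + "".join(f"{matrix[obj].get(p, '-'):<10}" for p in processes)
        for obj in matrix
    ]
    return "\n".join([header, *rows])


print(render(CRUD_MATRIX))
print(objects_without_single_owner(CRUD_MATRIX))
```

A data object that turns up in objects_without_single_owner is a candidate for discussion with the business: either nobody is responsible for creating it, or several processes claim it.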
Be aware: all pictures in this article are 'top level'; this is just a beginning. You need to dig at least one level deeper to define your structure in a relational DBMS.