Extraction Component
The extraction component copies data from existing data sources. The data to be copied must first be identified. When copying from operational data, the question is: at what point does that data merit inclusion in the data warehouse? Different people will give different answers:
- A salesman considers a sale done when an order is received.
- A stock manager considers a sale done when the item is ordered.
- An accountant considers a sale done when an invoice is raised against the order.
The order has a life cycle. Its status changes as it moves through the system:
- Received
- Picked
- Complete
- Packed
- Despatched
- Invoiced
- Paid
An application process can be changed to recognize the data and capture it in a temporary store for later loading into the data warehouse. Triggers can be created to examine the status of the order. When the required status is reached, the triggered statement copies the data about the order into a temporary table. This data forms the subject area facts, that is, the data about the sale.
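The trigger approach can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not a definitive implementation. The table names (orders, sales_staging), the column names and the trigger name are hypothetical, and the choice of 'Invoiced' as the qualifying status is an assumption that follows the accountant's view of a sale.

```python
import sqlite3

# In-memory database standing in for the operational system (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    product  TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    value    REAL NOT NULL,
    status   TEXT NOT NULL DEFAULT 'Received'
);

-- Temporary store holding facts that await loading into the warehouse.
CREATE TABLE sales_staging (
    order_id    INTEGER PRIMARY KEY,
    product     TEXT,
    quantity    INTEGER,
    value       REAL,
    captured_at TEXT
);

-- Fires only when the status changes to the qualifying value.
CREATE TRIGGER capture_sale
AFTER UPDATE OF status ON orders
WHEN NEW.status = 'Invoiced' AND OLD.status <> 'Invoiced'
BEGIN
    INSERT INTO sales_staging (order_id, product, quantity, value, captured_at)
    VALUES (NEW.order_id, NEW.product, NEW.quantity, NEW.value, datetime('now'));
END;
""")

# Walk one order through the earlier stages of its life cycle.
conn.execute(
    "INSERT INTO orders (order_id, product, quantity, value) "
    "VALUES (1, 'Widget', 3, 29.97)"
)
for status in ("Picked", "Complete", "Packed", "Despatched"):
    conn.execute("UPDATE orders SET status = ? WHERE order_id = 1", (status,))

# Nothing has been captured yet, because the order is not invoiced.
staged_before = conn.execute("SELECT COUNT(*) FROM sales_staging").fetchone()[0]

# Reaching 'Invoiced' causes the trigger to copy the order into staging.
conn.execute("UPDATE orders SET status = 'Invoiced' WHERE order_id = 1")
staged_after = conn.execute(
    "SELECT order_id, product, quantity, value FROM sales_staging"
).fetchall()
```

Changing the status named in the WHEN clause is all that is needed to capture the sale at a different point in the life cycle.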
Once the subject area facts are captured, attention can turn to the star points of the dimensional analysis. Membership data can be copied from a relational database using an export utility program. Any new or updated member data can be captured by a trigger that records the change (delta) in a sequential file for onward transmission to the data warehouse.
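Delta capture for member data might look like the sketch below, again using Python's sqlite3 module. It rests on several assumptions: SQLite triggers cannot write to files directly, so the triggers here log changes to a hypothetical member_delta table, and a separate step (the hypothetical export_deltas function) writes them out in sequence to a CSV file. All table, column and function names are illustrative.

```python
import csv
import os
import sqlite3
import tempfile

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (
    member_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    postcode  TEXT
);

-- Change log: one row per insert (I) or update (U), in sequence order.
CREATE TABLE member_delta (
    seq       INTEGER PRIMARY KEY AUTOINCREMENT,
    change    TEXT NOT NULL,
    member_id INTEGER NOT NULL,
    name      TEXT,
    postcode  TEXT
);

CREATE TRIGGER member_added AFTER INSERT ON members
BEGIN
    INSERT INTO member_delta (change, member_id, name, postcode)
    VALUES ('I', NEW.member_id, NEW.name, NEW.postcode);
END;

CREATE TRIGGER member_changed AFTER UPDATE ON members
BEGIN
    INSERT INTO member_delta (change, member_id, name, postcode)
    VALUES ('U', NEW.member_id, NEW.name, NEW.postcode);
END;
""")


def export_deltas(connection, path):
    """Write pending deltas to a sequential CSV file, then clear the log."""
    rows = connection.execute(
        "SELECT seq, change, member_id, name, postcode "
        "FROM member_delta ORDER BY seq"
    ).fetchall()
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    connection.execute("DELETE FROM member_delta")
    return len(rows)


# A new member followed by a change of postcode produces two deltas.
conn.execute("INSERT INTO members VALUES (1, 'A. Smith', 'LS1 4AP')")
conn.execute("UPDATE members SET postcode = 'LS2 9JT' WHERE member_id = 1")

delta_path = os.path.join(tempfile.mkdtemp(), "member_delta.csv")
exported = export_deltas(conn, delta_path)
```

Clearing the log after each export means the next file contains only changes made since the previous transmission.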
Product information needs to be copied. Arrangements may have to be made to enable the software to capture the data needed.
Area information can be acquired from outside agencies supplying geographic/demographic information. An alternative is to incorporate a program based upon major towns, counties or postcodes.
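A postcode-based program of the kind mentioned above could start from something like this sketch. It assumes UK-style postcodes, where the leading letters identify the postcode area. The AREA_NAMES lookup and the area_for_postcode function are hypothetical, and the three entries shown are only examples; a real lookup would be far larger and would normally come from a geographic data supplier.

```python
import re

# Hypothetical lookup from postcode area to town (illustrative entries only).
AREA_NAMES = {"LS": "Leeds", "M": "Manchester", "YO": "York"}

# Leading one or two letters, then the district digits (optionally a letter).
OUTWARD_PATTERN = re.compile(r"^([A-Z]{1,2})(\d[A-Z\d]?)")


def area_for_postcode(postcode):
    """Derive area dimension attributes from a UK-style postcode."""
    match = OUTWARD_PATTERN.match(postcode.strip().upper())
    if match is None:
        return {"area_code": None, "district": None, "town": "Unknown"}
    area_code = match.group(1)
    return {
        "area_code": area_code,
        "district": area_code + match.group(2),
        "town": AREA_NAMES.get(area_code, "Unknown"),
    }


leeds = area_for_postcode("ls1 4ap")
york = area_for_postcode("YO10 5DD")
invalid = area_for_postcode("12345")
```

Postcodes that do not match the expected shape are mapped to an "Unknown" row rather than rejected, so that the related facts can still be loaded.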
Finally, do not forget time, which is an absolute prerequisite for a data warehouse!
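Unlike the other dimensions, time does not need to be extracted from a source system; it can simply be generated. The sketch below builds one row per calendar day. The function name build_time_dimension, the column names and the YYYYMMDD integer key are assumptions chosen for illustration, not a prescribed layout.

```python
from datetime import date, timedelta


def build_time_dimension(start, end):
    """Return one dimension row per calendar day from start to end inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),  # e.g. 20240101
            "date": current.isoformat(),
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day": current.day,
            "weekday": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        })
        current += timedelta(days=1)
    return rows


time_rows = build_time_dimension(date(2024, 1, 1), date(2024, 12, 31))
```

Further attributes, such as financial periods or public holidays, could be added as extra columns if the analysis calls for them.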
Comments, suggestions, ideas to
Stuart Banner
