Storage Structures
Handling blocks
Each block corresponds to a particular disk sector.
Block headers and footers contain information for use by the file manager for structuring the block. The header contains the block number of the block. The footer contains row location information in the form of a row identifier (RID).
A DBA may reserve free space within a block so that new data can be inserted later. Where a DBA reserves no free space, this may reflect an assumption that no new data will ever be added to that block.
Each row is contained in a storage record. The record has a prefix containing control information, including the row identifier. The file manager removes this prefix before the data is passed to the DBMS. Conversely, the file manager attaches a prefix to any data passed back from the DBMS.
If a row is deleted, the storage record remains. The prefix then contains information indicating that the space is available for re-use.
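The prefix handling described above can be sketched as follows. This is only an illustration: the prefix layout (block number, slot number, deleted flag) and the names PREFIX, attach_prefix, strip_prefix and mark_deleted are assumptions, since the notes do not specify how a prefix is laid out.

```python
import struct

# Hypothetical prefix layout (not specified in the notes):
# block number (4 bytes), slot number (2 bytes), deleted flag (1 byte).
PREFIX = struct.Struct("<IHB")

def attach_prefix(block_no, slot_no, row_bytes, deleted=False):
    """Wrap row data coming from the DBMS in a storage record."""
    return PREFIX.pack(block_no, slot_no, int(deleted)) + row_bytes

def strip_prefix(record):
    """Remove the prefix before the row data is handed to the DBMS."""
    block_no, slot_no, deleted = PREFIX.unpack_from(record)
    return (block_no, slot_no), bool(deleted), record[PREFIX.size:]

def mark_deleted(record):
    """Keep the storage record but flag its space as available for re-use."""
    block_no, slot_no, _ = PREFIX.unpack_from(record)
    return PREFIX.pack(block_no, slot_no, 1) + record[PREFIX.size:]

record = attach_prefix(7, 2, b"Smith,Jane")
rid, deleted, row = strip_prefix(record)
deleted_record = mark_deleted(record)
```

Note that the deleted record is the same length as the original: the row data stays in place and only the flag in the prefix changes.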
The internal structure of a block can be specified. Fixed-length storage records tend to occupy more disk space. Variable-length storage records can allow more records to be placed in a block. However, if a record is deleted, a new record may be too large to fit into the space left behind. Although a home can be found for the new record elsewhere, dead space remains where the old record used to be. In addition, the record prefix must state how many bytes a variable-length record occupies.
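The dead-space problem with variable-length records can be illustrated with a small sketch. The first-fit placement policy, the function name place_record and the byte counts are assumptions made for the example; the notes do not name a placement policy.

```python
def place_record(holes, free_at_end, size):
    """Place a record of `size` bytes; return (where, holes, free_at_end).

    holes is a list of (offset, length) gaps left by deleted records.
    First-fit re-use of holes is an assumption for this sketch.
    """
    for i, (offset, length) in enumerate(holes):
        if size <= length:
            remaining = holes[:i] + holes[i + 1:]
            if length > size:
                # Whatever is left of the hole stays as a smaller gap.
                remaining.append((offset + size, length - size))
            return ("hole", offset), remaining, free_at_end
    if size <= free_at_end:
        return ("tail", None), holes, free_at_end - size
    return ("no room", None), holes, free_at_end

holes = [(40, 20)]  # a 20-byte record at offset 40 was deleted
where_big, holes_after_big, free_after_big = place_record(holes, 50, 30)
where_small, holes_after_small, free_after_small = place_record(
    holes_after_big, free_after_big, 15
)
```

The 30-byte record cannot use the 20-byte hole, so it goes into the free space at the end of the block and the hole remains as dead space. A later 15-byte record does fit, but still leaves a 5-byte fragment behind.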
Indirect addressing of a block
The first component of an RID is the block number. The second component is an offset from the end of the block that locates a field in the block footer. This field contains an offset from the beginning of the block. This offset specifies the location of the record within the block. This indirect addressing allows for the movement of a record up and down within the block.
If a record is moved up or down, the RID does not change. Only the offset held in the block footer changes.
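A minimal sketch of this indirect addressing follows. The 64-byte block size, the 2-byte footer fields and the function names are assumptions for illustration, and the record length is passed in directly, whereas in practice it would come from the record prefix.

```python
import struct

BLOCK_SIZE = 64  # hypothetical block size, kept small for illustration

def read_record(block, footer_offset, length):
    """Follow the RID's footer offset to the record's current position."""
    field_pos = BLOCK_SIZE - footer_offset  # footer field, counted from the end
    (record_offset,) = struct.unpack_from("<H", block, field_pos)
    return bytes(block[record_offset:record_offset + length])

def move_record(block, footer_offset, length, new_offset):
    """Move a record inside the block and update only its footer field."""
    field_pos = BLOCK_SIZE - footer_offset
    (old_offset,) = struct.unpack_from("<H", block, field_pos)
    data = bytes(block[old_offset:old_offset + length])
    block[old_offset:old_offset + length] = bytes(length)  # clear the old space
    block[new_offset:new_offset + length] = data
    struct.pack_into("<H", block, field_pos, new_offset)

block = bytearray(BLOCK_SIZE)
block[10:15] = b"hello"
struct.pack_into("<H", block, BLOCK_SIZE - 2, 10)  # footer field 2 bytes from the end
rid = (3, 2)  # (block number, offset from the end of the block)

before = read_record(block, rid[1], 5)
move_record(block, rid[1], 5, 30)
after = read_record(block, rid[1], 5)
```

The same RID reaches the record both before and after the move, because the only value that changed is the offset stored in the footer field.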
Pointer Chains
A pointer chain is a means of locating the rows of a table. If a database contains more than one table, the rows of these tables may become interleaved within the same block(s) when stored. An RID identifies a row. A pointer chain is formed by placing, within each record prefix, the RID of the next row in that table. The first RID of a table is held within a contents block.
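Following a pointer chain can be sketched as below. The in-memory dictionaries, the table names and the function rows_of are hypothetical stand-ins for blocks on disk; each stored entry holds the next RID for its table, as the record prefix would.

```python
# Hypothetical storage: RID -> (table name, next RID in the same table, row data).
# Rows of the two tables are interleaved across blocks 1 and 2.
records = {
    (1, 0): ("student", (1, 2), "Ann"),
    (1, 1): ("counsellor", (2, 0), "Mr Lee"),
    (1, 2): ("student", (2, 1), "Bob"),
    (2, 0): ("counsellor", None, "Ms Roy"),
    (2, 1): ("student", None, "Cy"),
}

# The contents block holds the first RID of each table.
contents_block = {"student": (1, 0), "counsellor": (1, 1)}

def rows_of(table):
    """Walk the chain from the contents block until there is no next RID."""
    rid = contents_block[table]
    rows = []
    while rid is not None:
        _table, next_rid, data = records[rid]
        rows.append(data)
        rid = next_rid
    return rows

student_rows = rows_of("student")
counsellor_rows = rows_of("counsellor")
```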
Clustering
Clustering is designed to minimize disk access. It allows logically related data to be stored so that rows from different tables are held in the same block. Remember that tables are an abstraction; they do not exist physically.
An example of clustering would be placing student rows in the same block as the row for those students' counsellor.
Databases can only be clustered for one scheme at a time. A DBMS should allow the clustering to be changed to suit a different scheme.
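The student and counsellor example can be sketched as follows. The sample names, the one-block-per-counsellor rule and the function cluster_by_counsellor are assumptions for illustration, and block capacity is ignored.

```python
from collections import defaultdict

counsellors = ["Lee", "Roy"]
students = [("Ann", "Lee"), ("Bob", "Roy"), ("Cy", "Lee")]  # (student, counsellor)

def cluster_by_counsellor(counsellors, students):
    """Give each counsellor a block and place that counsellor's students in it."""
    block_of = {name: i for i, name in enumerate(counsellors)}
    blocks = defaultdict(list)
    for name in counsellors:
        blocks[block_of[name]].append(("counsellor", name))
    for student, counsellor in students:
        blocks[block_of[counsellor]].append(("student", student))
    return dict(blocks)

blocks = cluster_by_counsellor(counsellors, students)
```

With this layout, reading a counsellor together with all of their students touches a single block. The layout serves only this one scheme; a query grouping students some other way would gain nothing from it.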
Comments, suggestions, ideas to
Stuart Banner
