R_Ch06- Data Modelling Advanced Concepts.ppt
Database Principles: Fundamentals of Design, Implementations and Management
Lecture 6 - CHAPTER 6: Data Modelling Advanced Concepts
Objectives
In this chapter, you will learn:
About the extended entity relationship (EER) model’s main constructs
How entity clusters are used to represent multiple entities and relationships
The characteristics of good primary keys and how to select them
How to use flexible solutions for special data modeling cases
What issues to check for when developing data models based on EER diagrams
The Extended Entity Relationship Model
Result of adding more semantic constructs to original entity relationship (ER) model
Diagram using this model is called an EER diagram (EERD)
Entity Supertypes and Subtypes
Entity supertype
Generic entity type that is related to one or more entity subtypes
Contains common characteristics
Entity subtype
Contains the characteristics unique to that subtype
Specialization Hierarchy
Depicts an arrangement of higher-level entity supertypes and lower-level entity subtypes
Relationships are described in terms of “IS-A” relationships
Subtype exists only within context of supertype
Every subtype has only one supertype to which it is directly related
Can have many levels of supertype/subtype relationships
Figure 6.2 in your book as well
Specialization Hierarchy (cont..)
A specialization hierarchy provides the means to:
Support attribute inheritance
Define special supertype attribute known as subtype discriminator
Define disjoint/overlapping constraints and complete/partial constraints
Inheritance
Enables entity subtype to inherit attributes and relationships of supertype
All entity subtypes inherit their primary key attribute from their supertype
At implementation level, supertype and its subtype(s) maintain a 1:1 relationship
Entity subtypes inherit all relationships in which supertype entity participates
Lower-level subtypes inherit all attributes and relationships from all upper-level supertypes
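The inheritance rules above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the book's implementation: the EMPLOYEE and PILOT tables, their columns, and the sample rows are hypothetical names chosen for the example. The subtype reuses the supertype's primary key, which gives the 1:1 relationship at implementation level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE EMPLOYEE (              -- supertype: common attributes (hypothetical)
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL
);
CREATE TABLE PILOT (                 -- subtype: PK inherited from the supertype
    EMP_NUM     INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMP_NUM),
    PIL_LICENSE TEXT NOT NULL        -- attribute unique to the subtype
);
INSERT INTO EMPLOYEE VALUES (100, 'Smith'), (101, 'Lewis');
INSERT INTO PILOT VALUES (101, 'ATP');
""")

# A pilot's full description joins the subtype row to its supertype row.
pilots = conn.execute("""
    SELECT E.EMP_NUM, E.EMP_LNAME, P.PIL_LICENSE
    FROM PILOT P JOIN EMPLOYEE E ON E.EMP_NUM = P.EMP_NUM
""").fetchall()
print(pilots)

# A subtype row cannot exist without its supertype row.
try:
    conn.execute("INSERT INTO PILOT VALUES (999, 'COM')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

Because PILOT.EMP_NUM is both primary key and foreign key, each EMPLOYEE row can match at most one PILOT row.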
Natural Keys and Primary Keys
Natural key is a real-world identifier used to uniquely identify real-world objects
Familiar to end users and forms part of their day-to-day business vocabulary
Generally data modeler uses natural identifier as primary key of entity being modeled
May instead use composite primary key or surrogate key
Primary Key Guidelines
A primary key is an attribute or combination of attributes that uniquely identifies entity instances in an entity set
Main function is to uniquely identify an entity instance or row within a table
Its purpose is to guarantee entity integrity, not to “describe” the entity
Primary keys and foreign keys implement relationships among entities
This implementation happens behind the scenes, hidden from the user
Entity Integrity: Selecting Primary Keys
Primary key is the most important characteristic of an entity
Single attribute or some combination of attributes
Primary key’s function is to guarantee entity integrity
Primary keys and foreign keys work together to implement relationships
Properly selecting primary key has direct bearing on efficiency and effectiveness
When to Use Composite Primary Keys
Composite primary keys are useful in two cases:
As identifiers of composite entities
Where each primary key combination is allowed only once in the M:N relationship
As identifiers of weak entities
Where weak entity has a strong identifying relationship with the parent entity
Automatically provides benefit of ensuring that there cannot be duplicate values
Figure 6.7 in your book
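As a rough sketch of the first case, the composite (bridge) entity below resolves an M:N relationship and uses the two foreign keys together as its primary key. The STUDENT, CLASS, and ENROLL names and the sample data are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY);
CREATE TABLE CLASS (CLASS_CODE TEXT PRIMARY KEY);
CREATE TABLE ENROLL (                -- composite entity for STUDENT M:N CLASS
    STU_NUM      INTEGER NOT NULL REFERENCES STUDENT (STU_NUM),
    CLASS_CODE   TEXT NOT NULL REFERENCES CLASS (CLASS_CODE),
    ENROLL_GRADE TEXT,
    PRIMARY KEY (STU_NUM, CLASS_CODE)  -- each combination allowed only once
);
INSERT INTO STUDENT VALUES (1), (2);
INSERT INTO CLASS VALUES ('10014');
INSERT INTO ENROLL VALUES (1, '10014', 'A'), (2, '10014', 'B');
""")

# Enrolling the same student in the same class twice violates the composite PK.
try:
    conn.execute("INSERT INTO ENROLL VALUES (1, '10014', 'C')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

enroll_count = conn.execute("SELECT COUNT(*) FROM ENROLL").fetchone()[0]
print(duplicate_rejected, enroll_count)
```

No separate uniqueness rule is needed here: the composite primary key itself prevents duplicate enrollments.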
When to Use Composite Primary Keys (cont..)
When used as identifiers of weak entities, composite primary keys normally represent:
Real-world object that is existence-dependent on another real-world object
Real-world object that is represented in data model as two separate entities in strong identifying relationship
Dependent entity exists only when it is related to parent entity
When To Use Surrogate Primary Keys
Especially helpful when there is:
No natural key
Selected candidate key has embedded semantic contents
Selected candidate key is too long or cumbersome
If you use surrogate key
Ensure that candidate key of entity in question performs properly
Use “unique index” and “not null” constraints
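The advice above might look like the following sketch, where a meaningless surrogate key identifies each row while a unique index and NOT NULL constraints keep the natural candidate key behaving properly. The EVENT table, its columns, and the bookings are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EVENT (
    EVENT_ID   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, no meaning
    EVENT_DATE TEXT NOT NULL,                      -- candidate key, part 1
    EVENT_ROOM TEXT NOT NULL,                      -- candidate key, part 2
    EVENT_NAME TEXT NOT NULL
);
-- The unique index keeps the natural candidate key free of duplicates.
CREATE UNIQUE INDEX EVENT_CANDIDATE_KEY ON EVENT (EVENT_DATE, EVENT_ROOM);
""")

insert = "INSERT INTO EVENT (EVENT_DATE, EVENT_ROOM, EVENT_NAME) VALUES (?, ?, ?)"
first_id = conn.execute(insert, ("2024-06-17", "Room A", "Wedding")).lastrowid

# Without the unique index, this double booking would get a fresh EVENT_ID
# and be accepted silently.
try:
    conn.execute(insert, ("2024-06-17", "Room A", "Birthday"))
    double_booking_rejected = False
except sqlite3.IntegrityError:
    double_booking_rejected = True
print(first_id, double_booking_rejected)
```

The surrogate key alone guarantees only that EVENT_ID is unique; the index is what protects the real-world identifier.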
Design Cases: Learning Flexible Database Design
Data modeling and design requires skills acquired through experience
Experience acquired through practice
Four special design cases that highlight:
Importance of flexible design
Proper identification of primary keys
Placement of foreign keys
Design Case #1: Implementing 1:1 Relationships
Foreign keys work with primary keys to properly implement relationships in relational model
Basic rule for 1:M relationships: put the primary key of the “one” side (parent entity) on the “many” side (dependent entity) as a foreign key
Primary key: parent entity
Foreign key: dependent entity
Design Case #1: Implementing 1:1 Relationships (cont..)
In a 1:1 relationship there are two options:
Place a foreign key in both entities (not recommended)
Place a foreign key in one of the entities
Primary key of one of the two entities appears as foreign key of other
Figure 6.9 in your book
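A minimal sketch of the second option follows, assuming a hypothetical "professor chairs department" relationship: the foreign key goes in one entity only, and a UNIQUE constraint keeps the relationship 1:1. The table names, columns, and rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE PROFESSOR (
    EMP_NUM   INTEGER PRIMARY KEY,
    PROF_NAME TEXT NOT NULL
);
CREATE TABLE DEPARTMENT (
    DEPT_CODE TEXT PRIMARY KEY,
    -- FK placed here only; UNIQUE stops one professor chairing two departments
    EMP_NUM   INTEGER NOT NULL UNIQUE REFERENCES PROFESSOR (EMP_NUM)
);
INSERT INTO PROFESSOR VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edgar');
INSERT INTO DEPARTMENT VALUES ('CIS', 1);
""")

try:
    conn.execute("INSERT INTO DEPARTMENT VALUES ('MATH', 1)")
    second_chair_rejected = False
except sqlite3.IntegrityError:
    second_chair_rejected = True

# No nulls are needed: professors who chair nothing simply have no DEPARTMENT row.
null_fk_count = conn.execute(
    "SELECT COUNT(*) FROM DEPARTMENT WHERE EMP_NUM IS NULL"
).fetchone()[0]
print(second_chair_rejected, null_fk_count)
```

Placing the key in PROFESSOR instead would leave a null DEPT_CODE for every professor who is not a chair, which is why the side that causes the fewest nulls is usually preferred.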
Design Case #2: Maintaining History of Time-Variant Data
Normally, existing attribute values replaced with new value without regard to previous value
Time-variant data:
Values change over time
Must keep a history of data changes
Keeping history of time-variant data is equivalent to having a multivalued attribute in your entity
Must create a new entity in a 1:M relationship with the original entity
New entity contains new value, date of change
Figure 6.10 in your book
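One possible shape for such a history entity is sketched below. The EMPLOYEE and SALARY_HIST tables, their columns, and the salary figures are hypothetical; dates are stored as ISO-8601 text so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY);
CREATE TABLE SALARY_HIST (           -- "many" side: one row per change
    EMP_NUM       INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),
    SAL_FROM_DATE TEXT NOT NULL,     -- date of change
    SAL_AMOUNT    INTEGER NOT NULL,  -- new value
    PRIMARY KEY (EMP_NUM, SAL_FROM_DATE)
);
INSERT INTO EMPLOYEE VALUES (100);
INSERT INTO SALARY_HIST VALUES (100, '2022-01-01', 50000);
INSERT INTO SALARY_HIST VALUES (100, '2023-06-01', 55000);
""")

# Full history: every value the attribute has held, oldest first.
history = conn.execute("""
    SELECT SAL_FROM_DATE, SAL_AMOUNT FROM SALARY_HIST
    WHERE EMP_NUM = ? ORDER BY SAL_FROM_DATE
""", (100,)).fetchall()

# Current value: the most recent change.
current_salary = conn.execute("""
    SELECT SAL_AMOUNT FROM SALARY_HIST
    WHERE EMP_NUM = ? ORDER BY SAL_FROM_DATE DESC LIMIT 1
""", (100,)).fetchone()[0]
print(history, current_salary)
```

A new salary is recorded by inserting a row instead of overwriting the old value, so earlier values remain queryable.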
Figure 6.11 in your book
Design Case #3: Fan Traps
Design trap occurs when a relationship is improperly or incompletely identified
The relationship is then represented in a way not consistent with the real world
Most common design trap is known as fan trap
Fan trap occurs when one entity is in two 1:M relationships to other entities
Produces an association among other entities not expressed in the model
Figure 6.12 in your book
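The effect of a fan trap can be demonstrated with a small sketch. The DIVISION, TEAM, and PLAYER names and the sample data are assumptions for illustration. In the trapped design both TEAM and PLAYER hang off DIVISION, so the model cannot say which player is on which team; the repaired design chains the relationships as DIVISION 1:M TEAM 1:M PLAYER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DIVISION (DIV_ID INTEGER PRIMARY KEY);
CREATE TABLE TEAM (
    TEAM_ID INTEGER PRIMARY KEY,
    DIV_ID  INTEGER NOT NULL REFERENCES DIVISION (DIV_ID)
);
-- Fan trap: PLAYER is related to DIVISION, so the player's team is not recorded.
CREATE TABLE PLAYER_TRAP (
    PLAYER_ID INTEGER PRIMARY KEY,
    DIV_ID    INTEGER NOT NULL REFERENCES DIVISION (DIV_ID)
);
-- Repaired design: PLAYER is related to TEAM, which is related to DIVISION.
CREATE TABLE PLAYER (
    PLAYER_ID INTEGER PRIMARY KEY,
    TEAM_ID   INTEGER NOT NULL REFERENCES TEAM (TEAM_ID)
);
INSERT INTO DIVISION VALUES (1);
INSERT INTO TEAM VALUES (10, 1), (20, 1);
INSERT INTO PLAYER_TRAP VALUES (1, 1), (2, 1);
INSERT INTO PLAYER VALUES (1, 10), (2, 20);
""")

# Going through DIVISION pairs every player with every team in the division.
trap_pairs = conn.execute("""
    SELECT T.TEAM_ID, P.PLAYER_ID
    FROM TEAM T JOIN PLAYER_TRAP P ON P.DIV_ID = T.DIV_ID
    ORDER BY T.TEAM_ID, P.PLAYER_ID
""").fetchall()

# The repaired design records the real team membership directly.
real_pairs = conn.execute(
    "SELECT TEAM_ID, PLAYER_ID FROM PLAYER ORDER BY TEAM_ID, PLAYER_ID"
).fetchall()
print(trap_pairs)
print(real_pairs)
```

The trapped join returns four team/player pairs for two players, which is the unexpressed association the slide warns about.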
Design Case #4: Redundant Relationships
Redundancy is seldom a good thing in database environment
Redundant relationships occur when there are multiple relationship paths between related entities
Main concern is that redundant relationships remain consistent across model
Some designs use redundant relationships to simplify the design
Figure 6.13 in your book
Figure 6.14 in your book.
Data Modeling Checklist
Data modeling translates specific real-world environment into data model
Represents real-world data, users, processes, interactions
EERM (Extended Entity Relationship Model) enables the designer to add more semantic content to the model
Data modeling checklist helps ensure data modeling tasks successfully performed
Based on concepts and tools learned since Chapter 3
Summary
Extended entity relationship (EER) model adds semantics to ER model
Adds semantics via entity supertypes, subtypes, and clusters
Entity supertype is a generic entity type related to one or more entity subtypes
Specialization hierarchy
Depicts arrangement and relationships between entity supertypes and entity subtypes
Inheritance means an entity subtype inherits attributes and relationships of supertype
Summary (cont..)
Subtype discriminator determines which entity subtype each supertype occurrence is related to
Subtypes can exhibit partial or total completeness
A specialization hierarchy is developed through specialization (top-down) or generalization (bottom-up)
Entity cluster is “virtual” entity type
Represents multiple entities and relationships in ERD
Formed by combining multiple interrelated entities and relationships into a single object
Summary (cont..)
Natural keys are identifiers that exist in real world
Sometimes make good primary keys
Characteristics of primary keys:
Must have unique values
Should be nonintelligent
Must not change over time
Preferably numeric or composed of single attribute
Summary (cont..)
Composite keys are useful to represent
M:N relationships
Weak (strong-identifying) entities
Surrogate primary keys are useful when there is no natural key that makes a suitable primary key
In a 1:1 relationship, place the PK of the mandatory entity in one of these ways:
As FK in the optional entity
As FK in the entity that causes the least number of nulls
As FK where the role is played
Summary (cont..)
Time-variant data
Data whose values change over time
Requires keeping a history of changes
To maintain history of time-variant data:
Create entity containing the new value, date of change, other time-relevant data
Entity maintains 1:M relationship with entity for which history maintained
Summary (cont..)
Fan trap occurs when you have:
One entity in two 1:M relationships to other entities, and
An association among the other entities that is not expressed in the model
Redundant relationships occur when there are multiple relationship paths between related entities
Main concern is that they remain consistent across the model
Data modeling checklist provides way to check that the ERD meets minimum requirements
ADDITIONAL SLIDES
The following additional slides are provided for reference.
Subtype Discriminator
Attribute in supertype entity
Determines to which entity subtype each supertype occurrence is related
Default comparison condition for subtype discriminator attribute is equality comparison
Subtype discriminator may be based on other comparison condition
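A small sketch of a subtype discriminator using the default equality comparison follows. The EMP_TYPE attribute, its codes ('P', 'M'), and the subtype names are hypothetical; a NULL discriminator is used here to mean "not a member of any subtype".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM  INTEGER PRIMARY KEY,
    -- Subtype discriminator lives in the supertype; NULL passes the CHECK.
    EMP_TYPE TEXT CHECK (EMP_TYPE IN ('P', 'M'))
);
INSERT INTO EMPLOYEE VALUES (100, 'P'), (101, 'M'), (102, NULL);
""")

# Equality comparison: EMP_TYPE = 'P' means the occurrence relates to PILOT, etc.
SUBTYPE_BY_CODE = {"P": "PILOT", "M": "MECHANIC"}

rows = conn.execute(
    "SELECT EMP_NUM, EMP_TYPE FROM EMPLOYEE ORDER BY EMP_NUM"
).fetchall()
subtype_of = {emp_num: SUBTYPE_BY_CODE.get(emp_type) for emp_num, emp_type in rows}
print(subtype_of)
```

A discriminator based on another comparison condition would replace the dictionary lookup with a predicate, for example a range test on a numeric attribute.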
Disjoint and Overlapping Constraints
Disjoint subtypes
Also known as non-overlapping subtypes
Subtypes that contain unique subset of supertype entity set
Overlapping subtypes
Subtypes that contain nonunique subsets of supertype entity set
Figure 6.4 Same as in your book
Completeness Constraint
Specifies whether entity supertype occurrence must be a member of at least one subtype
Can be partial or total
Partial completeness
Symbolized by a circle over a single line
Some supertype occurrences are not members of any subtype
Total completeness
Symbolized by a circle over a double line
Every supertype occurrence must be member of at least one subtype
Table 6.2 same as in your book..
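The disjoint/overlapping and partial/total definitions can be expressed as simple set checks. The sketch below is illustrative only: the function names, the subtype names, and the membership data are assumptions, with each supertype occurrence represented by its key value.

```python
from typing import Dict, Set


def is_disjoint(subtypes: Dict[str, Set[int]]) -> bool:
    """True when no supertype occurrence belongs to more than one subtype."""
    seen: Set[int] = set()
    for members in subtypes.values():
        if seen & members:       # an occurrence already seen in another subtype
            return False
        seen |= members
    return True


def is_total(supertype: Set[int], subtypes: Dict[str, Set[int]]) -> bool:
    """True when every supertype occurrence is in at least one subtype."""
    covered: Set[int] = set().union(*subtypes.values())
    return supertype <= covered


employees = {1, 2, 3, 4}
by_job = {"PILOT": {1}, "MECHANIC": {2, 3}}           # employee 4 has no subtype
people = {1, 2, 3}
by_role = {"STUDENT": {1, 2}, "PROFESSOR": {2, 3}}    # person 2 is in both

print(is_disjoint(by_job), is_total(employees, by_job))
print(is_disjoint(by_role), is_total(people, by_role))
```

The first pair of subtypes is disjoint with partial completeness; the second is overlapping with total completeness.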
Specialization and Generalization
Specialization
Identifies more specific entity subtypes from higher-level entity supertype
Top-down process of identifying lower-level, more specific entity subtypes from higher-level entity supertype
Based on grouping unique characteristics and relationships of the subtypes
Specialization and Generalization (cont..)
Generalization
Identifies more generic entity supertype from lower-level entity subtypes
Bottom-up process of identifying higher-level, more generic entity supertype from lower-level entity subtypes
Based on grouping common characteristics and relationships of the subtypes
Composition and Aggregation
Aggregation
A larger entity can be composed of smaller entities
Composition
Special case of aggregation
When the parent entity instance is deleted, all child entity instances are automatically deleted
Using Aggregation and Composition
An aggregation construct is used when an entity is composed of (or is formed by) a collection of other entities, but the entities are independent of each other.
The relationship can be classified as a ‘has_a’ relationship type.
A composition construct is used when two entities are associated in an aggregation association with a strong identifying relationship.
Deleting the parent deletes the children instances.
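The difference in delete behaviour can be sketched in SQLite as follows. The tables are hypothetical, and mapping aggregation to ON DELETE SET NULL is one reasonable implementation choice assumed for this example, not something the slides prescribe. Composition is mapped to ON DELETE CASCADE with the parent key as part of the child's primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE actions
conn.executescript("""
-- Composition: LINE is identified by its INVOICE and dies with it.
CREATE TABLE INVOICE (INV_NUM INTEGER PRIMARY KEY);
CREATE TABLE LINE (
    INV_NUM  INTEGER NOT NULL REFERENCES INVOICE (INV_NUM) ON DELETE CASCADE,
    LINE_NUM INTEGER NOT NULL,
    PRIMARY KEY (INV_NUM, LINE_NUM)
);
-- Aggregation: a TEAM has PLAYERs, but players exist independently.
CREATE TABLE TEAM (TEAM_ID INTEGER PRIMARY KEY);
CREATE TABLE PLAYER (
    PLAYER_ID INTEGER PRIMARY KEY,
    TEAM_ID   INTEGER REFERENCES TEAM (TEAM_ID) ON DELETE SET NULL
);
INSERT INTO INVOICE VALUES (1);
INSERT INTO LINE VALUES (1, 1), (1, 2);
INSERT INTO TEAM VALUES (10);
INSERT INTO PLAYER VALUES (1, 10), (2, 10);
DELETE FROM INVOICE WHERE INV_NUM = 1;
DELETE FROM TEAM WHERE TEAM_ID = 10;
""")

lines_left = conn.execute("SELECT COUNT(*) FROM LINE").fetchone()[0]
players_left = conn.execute("SELECT COUNT(*) FROM PLAYER").fetchone()[0]
print(lines_left, players_left)
```

Deleting the invoice removes its lines, while deleting the team leaves both players in place with no team assigned.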
Entity Clustering
An entity cluster is a “virtual” entity type used to represent multiple entities and relationships in ERD
Considered “virtual” or “abstract” because it is not actually an entity in final ERD
Temporary entity used to represent multiple entities and relationships
To eliminate undesirable consequences, avoid displaying attributes when entity clusters are used
Figure 6.6 in your book