1 The Database Design Process
2 STEP #1: Collection and analysis of requirements
3 STEP #2: Conceptual design of the database with the ER model
4 STEP #3: Logical design
5 STEP #4: Physical design
1 The Database Design Process
- the first step of the database design process is collection and analysis of requirements
- user requirements are written to specify data requirements and functional requirements
- functional requirements consist of operations or transactions applied to the database
- software engineer represents functional requirements with data flow diagrams, sequence diagrams and scenarios
- the second step is the conceptual design of the database
- database designer creates a conceptual schema with a high-level conceptual data model
- the conceptual schema describes the data requirements
- the conceptual schema includes entity types, relationships and constraints
- the conceptual schema is easy to comunicate to end users
- note: the conceptual schema may be used as a summary of the users' data requirements
- functional requirements that form the core data model operations are used to specify the user queries
- this proves that the conceptual schema meets the functional requirements
- the next step is called logical design or data model mapping
- here the database is implemented with a DBMS
- commercial DBMS use an implementation data model, such as the relational data model
- the database schema is transformed from the high-level data model into the implementation data model
- the logical design phase produces a database schema expressed in terms of the implementation data model of the DBMS
- the last step is the physical design phase
- here you specify the internal storage structures, indexes ecc. for the database files
2. STEP #1: Collection and analysis of requirements
The COMPANY application is a database application that keeps track of company's employees, departments and projects
Suppose that the database designers, after the analysis phase, write the following description of the miniworld: the data requirements of the application.
- the company consists of several departments
- department
- a department has a name and a number
- a department controls a number of projects
- a departement may have one or more locations
- a department is managed by one employee
- the database must keep track of the start date when the employee begins to manage the department
- project
- a project has a name, a number and a single location
- employee
- an employee has a name, social security number, address, email, salary, sex and birth date
- each employee is assigned to a department
- an employee may work on multiple projects controlled by different department
- the database keeps track of current number of hours per week the employee works for each project
- the database keeps track of the supervisor of each employee, which is another employee
- the database must keep track of the dependents of each employee: dependent's name, sex, birth_date and relationship to the employee
3. STEP #2 Conceptual design of the database with the ER model
ER models describe data using terms such as entities, relationships and attributes.
3.1 Initial Conceptual Design of the COMPANY Database
Here, the entities of the COMPANY database are illustrated, then the design of the database is refined introducting relationships
Four entity types for the COMPANY database are identified:
- DEPARTMENT (Name, Number, Locations, Manager, ManagerStartDate)
- name, number and manager are single valued attributes, locations is a multivalued attribute
- you can specify Name or Number as key attribute
- PROJECT (Name, Number, Location, ControllingDepartment)
- both Name or Number are separate key attributes
- EMPLOYEE (SSN, Name(FName,Middle,LName), Sex, Address, Salary, BirthDate, Department, Supervisor, WorkOn(Project, Hours))
- you may model the name and address as composite attribute, if the user refer to individual components of the attributes. In this case, only Name is modelled as composite attribute, because database users refer to its component elements
- the database keeps track of the numbers of hours an employee works on each project by using the EMPLOYEE's WorkOn multivalued composite attribute with single components (Project, Hours)
- DEPENDENT (Employee, Name, Sex, BirthDate, Relationship)
3.2 Introducing relationships into the ER design
In the preliminary design, there is a relationship beween two entity types, when an attribute of an entity type refer to another entity type:
- for example, the attribute Department of Employee refers to the department for which the employee works
- in the final version of the ER model, the schema represent the above references not as attributes but as relationships
- use cardinality ratio and partecipation constraints for each relationship types expressed in the requirements
The following relationship types are specified:
- MANAGES as 1:1 relationship type between EMPLOYEE and DEPARTMENT
- EMPLOYEE −−−1−−− manages ===1=== DEPARTMENT
- EMPLOYEE partecipation is partial
- DEPARTMENT partecipation is total
- WORKS_FOR as 1:N relationship type between DEPARTMENT and EMPLOYEE
- EMPLOYEE ===N=== works for ===1=== DEPARTMENT
- both partecipations are total
- WORKS_ON as M:N relationship type between EMPLOYEE and PROJECT, with attribute Hours
- EMPLOYEE ===N=== works on ===M=== PROJECT
- both partecipations are total
- CONTROLS as 1:N relationship type between DEPARTMENT and PROJECT
- DEPARTMENT −−−1−−− controls ===N=== PROJECT
- PROJECT partecipation is total
- DEPARTMENT partecipation is partial
- SUPERVISION as 1:N relationship type between an EMPLOYEE instance with supervisor role and an EMPLOYEE instance with supersee role
- EMPLOYEE supervisor_role −−−1−−− supervises −−−N−−− EMPLOYEE supersee_role
- both partecipations are partial
- DEPENDENTS_OF as 1:N relationship type between EMPLOYEE and DEPENDENT
- the relationship is the signifying relationship for the weak entity type DEPENDENT
- the partecipation of EMPLOYEE is partial
- the partecipation of DEPENDENT is total
After speficifying the six relationship type, the attributes that are converted into relationship types are removed from the entity types.
The DEPENDENT weak entity type
- the DEPENDENT entity is a weak entity type: it does not have a key attribute
- the DEPENDENT entity has attributes (Name, BirthDate, Sex, Relationship)
- the DEPENDENT weak entity type is used to keep track of the dependents of each employee through the 1:N DEPENDENT_OF relationship
The DEPENDENT_OF relationship
- the EMPLOYEE entity type is the owner entity type
- the DEPENDENT_OF relationship type is the identifying relationship of the DEPENDENT entity type
- note that a weak entity type always has a total partecipation constraint with respect to its identifying relationship type
The Name partial key
- a partial key is the attribute that uniquely identifies weak entities that are related to the same owner entity
- the DEPENDENT entity type is identified by being related to the EMPLOYEE entity type through the DEPENDENT's Name attribute:
- the Name attribute is the partial key of the DEPENDENT weak entity: there are no two DEPENDENT entities with the same Name related to the same employee
3.3 Alternative Notations for ER Diagrams
3.3.1 The ER diagram with (min,max) notation
An alternative ER notation associates a pair of integer numbers (min, max) with a partecipating entity type E in a relationship type R:
- (min, max) numbers notation specifies structural constraints on relationships
- the numbers mean that each entity e in the entity type, must partecipate in at least min and at most max relationship instances in R at any point in time
- where 0 ≤ min ≤ max and max ≥ 1
- where min = 0 implies partial partecipation and min > 0 implies total partecipation
- note that the (min,max) number notation reverses the values:
EMPLOYEE === N === works for === 1 === DEPARTMENT // an employee works for one dept.
EMPLOYEE −−−(1,1)−−− works for −−−(1,N)−−− DEPARTMENT // an employee instance partecipates at least one and at most one in the work_for relationship
3.3.2 The UML class diagram
The UML modeling language class diagrams can be considered an alternative notation to ER diagrams
- you can show the COMPANY database ER schema as an UML class diagram
- the ER entity types are modelled as UML classes and the ER entities are modelled as UML objects
UML Classes
- In UML class diagrams, a class is displayed as a box that contains three sections: the top section contains the class name, the middle section holds attributes and the bottom section includes operations.
- the EMPLOYEE class has the attribute: Name, SSN, BirtDate, Sex, Address and Salary
- the domain or data type of the attributes may be optionally specified after the attribute, by placing a : and the data type name
- multivalued attributes are modelled as separate classes, as showed with LOCATION class
Modelling ER relationship types in UML
- ER relationship types are called association and relationship instances are called links, in UML terminology
- a binary association is represented as a line that connects the two partecipating classes and may optionally have a name
- relationship attributes are placed in association classes connected to the association's line by a dashed line.
- the (min, max) notation is used in ER diagram to describe relationship constraints,
- ER relationship constraints are named multiplicities in UML
- the ER (min, max) notation is replaced by the min..max notation in UML and * indicated no maximum limit on partecipation
- the min..max numbers are placed on the other ends of the line compared with the (min, max) numbers notation
- ER (1, 1) notation is translated into 1 in UML, and ER 0..* notation is translated into * in UML
- a ER recursive relationship is an UML reflexive association
In UML there are different types of relationships between objects:
- dependency exists if an object locally uses other objects as by parameter or return type.
- association exists if an object systematically uses other objects as in class attributes
- aggregation indicates the part-of relationship between the whole and its parts
- composition indicates the the part-of relationship, where the parts cannot exist independently from its owner
In the company database the locations of a department and the single location of a project are modeled as aggregations
UML distinguishes between unidirectional and bidirectional relationships:
- unidirectional relationships have a line, with an arrow, connecting the two classes
- there is only a direction to access the related object
- bidirectional relationships have a line, without arrow, connecting the two classes
- there are two directions to access, these associations are navigable from both ends
- bidirection associations are default
UML relationships names and attributes
- relationship names are optional in UML
- relationship attributes are displayed in a class attached to line representing the association
Operations names shown in each class are derived from functional requirements of the applications
- during the design phase, the software developer complete the description of the operations adding argument types and the definition of the function the operation should implement
Weak entities are modeled with a qualified association construct in UML
- the partial key is placed in a box attached to the owner class as illustrated by the DEPENDENT class and its qualified aggregation to EMPLOYEE
- in UML the partial key attribute Dependent_name is called the discriminator
- qualified association in UML can be used to model other situations than weak entities
4. STEP #3: Logical design, mapping the conceptual model to the relational model
In the relational model, a database is seen as a collection of relations. Informally, each relation is similar to a table of values or a flat file of records.
A Relation Schema R, denoted as R(A1, A2, ..., An), is made up of a relation name R and a list of attributes A1, A2, ..., An
A Relational database schema S is a set of relation schemas S = {R1, R2,... , Rm} and a set of integrity constraints IC
The COMPANY's relational database schema is the following set of relation schemas:
COMPANY = {EMPLOYEE, DEPARTMENT, DEPT_LOCATIONS, PROJECT, WORKS_ON, DEPENDENT}
Note: attributes' name
- attributes that represent the same concept may or may not have identical names in different relations
- on the contrary, attributes that represent different concepts may have the same name in different relations
Referential integrity constraints on the Company database
- you specify a referential integrity constraint between two relations to maintain the consistency among tuples: a tuple in the former relation that refers to another relation must refer to an existing tuple in latter relation
- for example, in the COMPANY database schema, the attribute DepNo of EMPLOYEE is the number of the department where each employee works. Hence, a value of DepNo in any tuple t1 of the EMPLOYEE relation either occurs as a value of DNumber, primary key of DEPARTMENT, in a tuple t2 of the DEPARTMENT relation or is NULL, if the employee has not been assigned to a department yet.
5. STEP #4: Physical design
Using a CASE tool, you can design the phisical schema of the COMPANY database.
Crow's foot notation
The Chen-style notations have symbols for entity types and relationship types, these notations are best suited for the logical design; the crow's foot notation is useful for the physical design, where database structure is described in terms of tables and relationships or foreign keys constraints. Crow's foot diagrams represent tables as rectangular boxes and relationships as lines between the boxes. Rectangular boxes represent tables, after the ER model's relationships have been transformed into table attributes or into a new table.
Constraints on relationship are represented by how lines's end are decorated:
───o┼ cardinality one, optional partecipation (0,1)
───┼┼ cardinality one, total partecipation (1,1)
──o─< cardinality many, optional partecipation (0,N)
───┼< cardinality many, total partecipation (1,N)
From the physical schema, the database CASE tool is able to generate the SQL DDL script for you.

No comments:
Post a Comment