Normalizing with Entity Relationship Diagramming | index-art.info
An Entity Relationship Diagram showing 3NF. Normalization is the process of splitting a large relation (entity or table) into smaller ones. The entity relationship (ER) data model has existed for over 35 years. If we refer back to our COMPANY database, examples of an independent entity include...
Every piece of information you see here is important. How can we capture this information in a database?

[Figure: Invoice]

Those of us who have an ordered mind but aren't quite aware of relational databases might decide to use a spreadsheet, such as Microsoft Excel. But what if you start to ask complicated questions, such as: What are the total sales of 56" Blue Freens in the state of Texas? What items were sold on July 14, ? As your collection of invoices grows, it becomes increasingly difficult to ask the spreadsheet these questions.
In an attempt to put the data into a state where we can reasonably expect to answer such questions, we begin the normalization process. These represent all the data we have for a single invoice. Never mind the fact that one database row is made up here of three spreadsheet rows; it's an unfortunate ambiguity of language. Academic database theoreticians have a special word that helps a bit with the ambiguity. We're not going to use that word here, and if you're lucky, you'll never hear it again for the rest of your life.
Here, we will refer to this thing as a row. Again we turn our attention to the first invoice in Figure A. This is a column within our first database row. You will notice that each of these columns contains a list of values. It is precisely these lists that NF1 objects to: NF1 abhors lists or arrays within a single database column.
Therefore it is clear that we have to do something about the repeating item data within the invoice row (the item cells in Figure A-1). We can satisfy NF1's need for atomicity quite simply, by giving each item in those lists a row of its own. But wait: we were trying to reduce the amount of duplication, and here we have introduced more!
Just look at all that duplicated customer data! The kind of duplication that we introduce at this stage will be addressed when we get to the Third Normal Form.
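The step just described can be sketched in a few lines of Python. This is only an illustration: the invoice values and the column names (`order_id`, `customer`, `item_id`, `item_qty`) are made up for the example and do not come from the figures.

```python
# Hypothetical invoice row: the item columns hold lists, which NF1 forbids.
invoice = {
    "order_id": 125,
    "customer": "Example Customer",
    "item_id": [563, 851, 652],
    "item_qty": [4, 32, 5],
}

def to_first_normal_form(row, list_columns):
    """Return one flat row per list position, copying the scalar columns."""
    length = len(row[list_columns[0]])
    if any(len(row[c]) != length for c in list_columns):
        raise ValueError("list columns must all have the same length")
    scalars = {k: v for k, v in row.items() if k not in list_columns}
    return [
        {**scalars, **{c: row[c][i] for c in list_columns}}
        for i in range(length)
    ]

flat_rows = to_first_normal_form(invoice, ["item_id", "item_qty"])
for flat in flat_rows:
    print(flat)
```

Every resulting row holds a single value per column, and the customer value is repeated on each of them, which is exactly the duplication noted above.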
We have actually only told half the story of NF1. Strictly speaking, NF1 addresses two issues: a row of data cannot contain repeating groups of similar data (atomicity); and each row of data must have a unique identifier (or Primary Key).
We have already dealt with atomicity. But to make the point about Primary Keys, we shall bid farewell to the spreadsheet and move our data into a relational database management system (RDBMS). A primary key is a column or group of columns that uniquely identifies each row. As you can see from Figure B, there is no single column that uniquely identifies each row. However, if we put a number of columns together, we can satisfy this requirement. Therefore, together they qualify to be used as the table's primary key.
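As a rough sketch of a multi-column primary key, here is how one might declare it in SQLite from Python. The table layout and the column names `order_id` and `item_id` are assumptions made for the example; the text above does not name the key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id   INTEGER NOT NULL,
        item_id    INTEGER NOT NULL,
        order_date TEXT,
        item_qty   INTEGER,
        PRIMARY KEY (order_id, item_id)  -- the two columns act as one key
    )
""")

# Same order, different items: the key values differ, so both rows are accepted.
conn.execute("INSERT INTO orders VALUES (125, 563, '2002-09-13', 4)")
conn.execute("INSERT INTO orders VALUES (125, 851, '2002-09-13', 32)")

# Repeating an (order_id, item_id) pair violates the primary key.
try:
    conn.execute("INSERT INTO orders VALUES (125, 563, '2002-09-13', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(duplicate_rejected, row_count)
```

Neither column is unique on its own here (order 125 appears twice), but the pair is, which is what lets the database reject the third insert.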
Even though they are in two different table columns, they are treated as a single thing. We call them concatenated. A value that uniquely identifies a row is called a primary key.
When this value is made up of two or more columns, it is referred to as a concatenated primary key. We identify the columns that make up the primary key with the PK notation. Our database schema now satisfies the two requirements of First Normal Form: atomicity and a unique primary key for each row. Thus it fulfills the most basic criterion of a relational database.

No Partial Dependencies on a Concatenated Key
Next we test each table for partial dependencies on a concatenated key.
This means that for a table that has a concatenated primary key, each column in the table that is not part of the primary key must depend upon the entire concatenated key for its existence. If any column only depends upon one part of the concatenated key, then we say that the entire table has failed Second Normal Form and we must create another table to rectify the failure. To try and understand this, let's take apart the orders table column by column. For each column we will ask the question, Can this column exist without one or the other part of the concatenated primary key?
First, recall the meaning of the two columns in the primary key. We don't analyze these columns since they are part of the primary key. Now consider the remaining columns, starting with the order date: can it exist without one or the other part of the key? The short answer is yes. Some of you might object, thinking that this means you could have a dated order with no items (an empty invoice, in effect).
But this is not what we are saying at all: All we are trying to establish here is whether a particular order on a particular date relies on a particular item.
Clearly, it does not. The problem of how to prevent empty orders falls under a discussion of "business rules" and could be resolved using check constraints or application logic; it is not an issue for Normalization to solve. But let's continue with testing the other columns. We have to find all the columns that fail the test, and then we do something special with them.
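The column-by-column test can be approximated mechanically. The sketch below checks, for sample data only, whether a non-key column is already determined by a proper subset of the key columns. The column names and rows are hypothetical, and agreement in a sample merely suggests a dependency; whether it really holds is a question about the meaning of the data.

```python
from itertools import combinations

def determines(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True

def partial_dependencies(rows, key, non_key):
    """Map each non-key column to the proper key subsets that determine it."""
    found = {}
    for col in non_key:
        for size in range(1, len(key)):
            for subset in combinations(key, size):
                if determines(rows, subset, col):
                    found.setdefault(col, []).append(subset)
    return found

# Hypothetical rows keyed by (order_id, item_id).
rows = [
    {"order_id": 125, "item_id": 563, "order_date": "2002-09-13",
     "item_description": "Blue Freen", "item_qty": 4},
    {"order_id": 125, "item_id": 851, "order_date": "2002-09-13",
     "item_description": "Spline End", "item_qty": 32},
    {"order_id": 126, "item_id": 563, "order_date": "2002-09-14",
     "item_description": "Blue Freen", "item_qty": 500},
]

failing = partial_dependencies(
    rows,
    key=("order_id", "item_id"),
    non_key=("order_date", "item_description", "item_qty"),
)
print(failing)
```

In this sample the order date follows from `order_id` alone and the description from `item_id` alone, so both would fail the test, while the quantity needs the whole key.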
What do we do with these columns?

A compound attribute contains multiple kinds of data.

Expand entity types into two entity types and a relationship. This transformation can be useful to record a finer level of detail about an entity.

Transform a weak entity type into a strong entity type. This transformation is most useful for associative entity types.

Add historical details to a data model. Historical details may be necessary for legal as well as strategic reporting requirements. This transformation can be applied to attributes and relationships.

Add generalization hierarchies by transforming entity types into a generalization hierarchy.

Application of normalization principles toward ERD development enhances these guidelines. To understand this application, (i) the representation of dependency concepts in an ERD is outlined, followed by (ii) the representation of normal forms toward the development of entity type structure.
Guidelines for identifying the various dependencies are omitted from the paper so as to focus more on their application. Only the first four normal forms and the Boyce-Codd normal form are considered. Each entity instance represents a set of values taken by the non-identifier attributes for each primary key (entity identifier) value.
So, in a way, an entity instance structure also reflects an application of the functional dependency concept.
Name, Street, City, Zip. [Figure 2] Each entity instance will now represent the functional dependency among the entity attributes, as shown in Figure 3. During requirement analysis, some entity types may be identified through functional dependencies, while others may be determined through database relationships.
Another important consideration is to distinguish when one attribute alone is the entity identifier versus a composite entity identifier. A composite entity identifier is an entity identifier with more than one attribute.
A functional dependency in which the determinant contains more than one attribute usually represents a many-to-many relationship, which is better addressed through higher normal forms. A composite entity identifier is not very common, and is often a matter of expediency rather than good entity structure or design. A transitive dependency occurs in an entity type if non-identifier attributes have dependencies among themselves.
For example, consider the modified Student entity type shown in Figure 4. In this entity type, suppose there is a functional dependency BuildingName → Fee. This dependency implies that the value assigned to the Fee attribute is fixed for each distinct BuildingName value.
In other words, the Fee attribute values are not specific to the SID value of a student, but rather to the BuildingName value. The entity instance with the transitive dependency is shown in Figure 5. The ERD equivalent of a multi-valued dependency occurs when attributes within an entity instance have more than one value.
This is the situation when some attributes within an entity instance have a maximum cardinality of N (more than 1). When an attribute has multiple values in an entity instance, it can be set up either as part of a composite entity identifier of the entity type, or split into a weak entity type.
For example, consider the entity type Student Details shown in Figure 6. The entity identifier is composite because a student has multiple MajorMinor values and is also involved in multiple activities.
The multi-valued dependency affects the key structure. This means that a SID value is associated with multiple values of the MajorMinor and Activity attributes, and together they determine the other attributes. The entity instance of the Student Details entity type is shown in Figure 7.
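To see how two multi-valued attributes inflate the key, consider the small sketch below. It assumes, purely for illustration, that a student's MajorMinor values and Activity values are independent of each other, so every pairing has to appear; the student number and the values are invented.

```python
from itertools import product

def student_detail_rows(sid, majors_minors, activities):
    """Build one (SID, MajorMinor, Activity) key per pairing of the two value lists."""
    return [(sid, mm, act) for mm, act in product(majors_minors, activities)]

majors_minors = ["Math", "Physics"]
activities = ["Chess", "Band", "Soccer"]

rows = student_detail_rows(101, majors_minors, activities)
for row in rows:
    print(row)

# Splitting each multi-valued attribute into its own weak entity type
# would need one row per value instead of one row per pairing.
rows_if_split = len(majors_minors) + len(activities)
```

With two MajorMinor values and three activities, the composite identifier produces six entity instances for a single student, against five rows if the two attributes were split out separately.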
Each normal form rule and its application are outlined.

First Normal Form (1NF)
The first normal form rule is that there should be no nesting or repeating groups in a table. An entity type that contains only one value for an attribute in an entity instance ensures the application of first normal form for the entity type.
So, in a way, any entity type with an entity identifier is by default in first normal form. For example, the entity type Student in Figure 2 is in first normal form.

Second Normal Form (2NF)
The second normal form rule is that the key attributes determine all non-key attributes. A violation of second normal form occurs when there is a composite key, and part of the key determines some non-key attributes.
The second normal form deals with the situation when the entity identifier contains two or more attributes, and the non-key attribute depends on part of the entity identifier. For example, consider the modified entity type Student as shown in Figure 8.
The entity type has a composite entity identifier of SID and City attributes (Figure 8). An entity instance of this entity type is shown in Figure 9. Now, if there is a functional dependency City → Status, then the entity type structure will violate the second normal form.
To resolve the violation of the second normal form, a separate entity type City with a one-to-many relationship is created, as shown in the accompanying figure. The relationship cardinalities can be further modified to reflect how the organization works. In general, a second normal form violation can be avoided by ensuring that the entity identifier consists of only one attribute.
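The resolution described above might look as follows in SQLite. This is a sketch under assumptions: the Name column, the sample values, and the choice to keep SID alone as the Student identifier after the split are illustrative and are not taken from the figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Status now lives with the attribute that determines it.
    CREATE TABLE City (
        City   TEXT PRIMARY KEY,
        Status TEXT NOT NULL
    );
    -- One City row can be referenced by many Student rows (one-to-many).
    CREATE TABLE Student (
        SID  INTEGER PRIMARY KEY,
        Name TEXT,
        City TEXT NOT NULL REFERENCES City(City)
    );
""")

conn.execute("INSERT INTO City VALUES ('Springfield', 'Urban')")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Ann", "Springfield"), (2, "Bob", "Springfield")],
)

# Status is stored once and recovered through the relationship.
student_status = conn.execute("""
    SELECT s.SID, c.Status
    FROM Student AS s JOIN City AS c ON c.City = s.City
    ORDER BY s.SID
""").fetchall()
print(student_status)
```

Because Status is recorded once per city, changing it touches a single City row rather than every Student row for that city.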
The third normal form is violated when there exists a dependency among non-key attributes in the form of a transitive dependency. For example, consider the entity type Student as shown in Figure 4. In this entity type, there is a functional dependency BuildingName → Fee that violates the third normal form.
Transitive dependency is resolved by moving the dependent attributes to a new entity type with a one-to-many relationship. In the new entity type, the determinant of the dependency becomes the entity identifier. The resolution of the third normal form violation is shown in the accompanying figure.
The Boyce-Codd normal form rule is that every determinant is a candidate key.
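The determinant rule lends itself to a mechanical check using attribute closure. The sketch below is a simplification: it tests the usual textbook condition that each determinant is a superkey of the relation, and it only considers dependencies whose attributes all lie inside the relation being checked. The Name attribute and the dependency SID → Name, BuildingName are assumptions added for the example; BuildingName → Fee is the dependency discussed above.

```python
def closure(attrs, fds):
    """Return every attribute reachable from attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(relation, fds):
    """List the dependencies inside relation whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if not (lhs | rhs) <= relation:
            continue  # dependency does not lie entirely inside this relation
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if not relation <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations

fds = [
    ({"SID"}, {"Name", "BuildingName"}),  # assumed for the example
    ({"BuildingName"}, {"Fee"}),          # the dependency from the text
]

student = {"SID", "Name", "BuildingName", "Fee"}
before = bcnf_violations(student, fds)

# After moving Fee to a new entity type identified by BuildingName:
student_split = {"SID", "Name", "BuildingName"}
building = {"BuildingName", "Fee"}
after = bcnf_violations(student_split, fds) + bcnf_violations(building, fds)
print(before, after)
```

For this example the same dependency, BuildingName → Fee, is what breaks both the third normal form and the determinant rule, and the split removes the violation in both senses.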
Even though Boyce-Codd normal form and third normal form generally produce the same result, Boyce-Codd normal form is a stronger definition than third normal form. Every table in Boyce-Codd normal form is by definition in third normal form. Boyce-Codd normal form considers two special cases not covered by third normal form: