Logical and Physical Data Models

Category: Data, Physical Activity
Last Updated: 20 Jun 2022

The Physical Data Model (PDM) describes how the information represented in the Logical Data Model is actually implemented, how the information-exchange requirements are implemented, and how the data entities and their relationships are maintained. There should be a mapping from a given Logical Data Model to the Physical Data Model if both models are used. The form of the Physical Data Model can vary greatly. For some purposes, an additional entity-relationship style diagram will be sufficient.

A Data Definition Language (DDL) may also be used. References to message format standards (which identify message types and options to be used) may suffice for message-oriented implementations. Descriptions of file formats may be used when file passing is the mode of information exchange, that is, when information from the LDM is delivered in the form of files. Interoperating systems may use a variety of techniques to exchange data, and thus have several distinct partitions in their Physical Data Model, with each partition using a different form.

The figure in the original document illustrates some options for expressing the Physical Data Model, and another table (also in the original document) lists the types of information to be captured. A physical data model (or database design) is a representation of a data design which takes into account the facilities and constraints of a given database management system. In the lifecycle of a project it typically derives from a logical data model, though it may be reverse-engineered from a given database implementation.


A complete physical data model will include all the database artifacts required to create relationships between tables or to achieve performance goals, such as indexes, constraint definitions, linking tables, partitioned tables or clusters. Analysts can usually use a physical data model to calculate storage estimates; it may include specific storage allocation details for a given database system. As of 2012 seven main databases dominate the commercial marketplace: Informix, Oracle, Postgres, SQL Server, Sybase, DB2 and MySQL.
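The storage estimates mentioned above can be approximated with simple arithmetic once column sizes and row counts are known. The following is a minimal sketch, assuming hypothetical column names, hypothetical average byte widths, and an `overhead_factor` parameter introduced here for illustration; real database systems add page, row, and index overhead that this ignores unless the factor is raised.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    avg_bytes: int  # assumed average stored size of one value, in bytes


def estimate_table_bytes(columns, expected_rows, overhead_factor=1.0):
    """Rough estimate: rows * sum of average column widths * overhead factor."""
    row_bytes = sum(column.avg_bytes for column in columns)
    return int(expected_rows * row_bytes * overhead_factor)


# Hypothetical customer table; the widths are illustrative guesses.
customer_columns = [
    Column("customer_id", 4),
    Column("name", 40),
    Column("email", 60),
]

estimate = estimate_table_bytes(customer_columns, expected_rows=1_000_000)
print(estimate)  # 104000000
```

An analyst would normally replace the guessed widths with the sizes documented for the chosen database system's data types.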

Other RDBMSs tend to be either legacy databases or systems used mainly within academia, such as universities or further education colleges. Physical data models for each implementation would differ significantly, not least because of the requirements of the underlying operating system. For example, in 2012 SQL Server ran only on Microsoft Windows (Linux support arrived later, with SQL Server 2017), while Oracle and MySQL could run on Solaris, Linux and other UNIX-based operating systems as well as on Windows.

This means that the disk requirements, security requirements and many other aspects of a physical data model will be influenced by the RDBMS that a database administrator (or an organization) chooses to use.

Overview

Logical data models represent the abstract structure of a domain of information. They are often diagrammatic in nature and are most typically used in business processes that seek to capture things of importance to an organization and how they relate to one another. Once validated and approved, the logical data model can become the basis of a physical data model and inform the design of a database.

Logical data models should be based on the structures identified in a preceding conceptual data model, since this describes the semantics of the information context, which the logical model should also reflect. Even so, since the logical data model anticipates implementation on a specific computing system, the content of the logical data model is adjusted to achieve certain efficiencies. The term 'Logical Data Model' is sometimes used as a synonym of 'Domain Model' or as an alternative to the domain model.

While the two concepts are closely related, and have overlapping goals, a domain model is more focused on capturing the concepts in the problem domain rather than the structure of the data associated with that domain.

History

The ANSI/SPARC three-level architecture "shows that a data model can be an external model (or view), a conceptual model, or a physical model. This is not the only way to look at data models, but it is a useful way, particularly when comparing models".[1] When ANSI first laid out the idea of a logical schema in 1975,[2] the choices were hierarchical and network.

The relational model – where data is described in terms of tables and columns – had just been recognized as a data organization theory but no software existed to support that approach. Since that time, an object-oriented approach to data modelling – where data is described in terms of classes, attributes, and associations – has also been introduced.

Logical data model topics

Reasons for building a logical data model:

  • Helps common understanding of business data elements and requirements
  • Provides foundation for designing a database
  • Facilitates avoidance of data redundancy and thus prevents data and business transaction inconsistency
  • Facilitates data re-use and sharing
  • Decreases development and maintenance time and cost
  • Confirms a logical process model and helps impact analysis

Modeling benefits:

  • Facilitates business process improvement
  • Focuses on requirements independent of technology
  • Facilitates data re-use and sharing
  • Increases return on investment
  • Centralizes metadata
  • Fosters seamless communication between applications
  • Focuses communication for data analysis and project team members
  • Establishes a consistent naming scheme

Logical & Physical Data Model

A logical data model is sometimes incorrectly called a physical data model, which is not what the ANSI people had in mind. The physical design of a database involves deep use of particular database management technology. For example, a table/column design could be implemented on a collection of computers, located in different parts of the world. That is the domain of the physical model. Logical and physical data models are very different in their objectives, goals and content. Features of a logical data model include:

  • Includes all entities and relationships among them. All attributes for each entity are specified.
  • The primary key for each entity is specified.
  • Foreign keys (keys identifying the relationship between different entities) are specified.
  • Normalization occurs at this level.

The steps for designing the logical data model are as follows:

  • Specify primary keys for all entities.
  • Find the relationships between different entities.
  • Find all attributes for each entity.
  • Resolve many-to-many relationships.
  • Normalization.
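The design steps above can be sketched in code before any database is chosen. The following is a minimal illustration, assuming hypothetical `Entity` and `Relationship` classes and made-up attribute names; it shows one way the "resolve many-to-many relationships" step might be carried out, by replacing each many-to-many relationship with a linking entity whose key combines the keys of the two related entities.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    attributes: list   # attribute names
    primary_key: list  # subset of attributes that identifies one instance


@dataclass
class Relationship:
    left: str   # entity name
    right: str  # entity name
    kind: str   # "1:N" or "M:N"


def resolve_many_to_many(entities, relationships):
    """Replace each M:N relationship with a linking entity and two 1:N relationships."""
    by_name = {entity.name: entity for entity in entities}
    out_entities = list(entities)
    out_relationships = []
    for rel in relationships:
        if rel.kind != "M:N":
            out_relationships.append(rel)
            continue
        left, right = by_name[rel.left], by_name[rel.right]
        key = left.primary_key + right.primary_key
        link = Entity(f"{left.name}{right.name}", list(key), list(key))
        out_entities.append(link)
        out_relationships.append(Relationship(left.name, link.name, "1:N"))
        out_relationships.append(Relationship(right.name, link.name, "1:N"))
    return out_entities, out_relationships


# Hypothetical entities for illustration only.
student = Entity("Student", ["student_number", "first_name"], ["student_number"])
seminar = Entity("Seminar", ["seminar_id", "title"], ["seminar_id"])

entities, relationships = resolve_many_to_many(
    [student, seminar], [Relationship("Student", "Seminar", "M:N")]
)
print([entity.name for entity in entities])  # ['Student', 'Seminar', 'StudentSeminar']
```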

The figure below is an example of a logical data model.

[Figure: Logical Data Model]

Comparing the logical data model shown above with the conceptual data model diagram, we see the main differences between the two:

  • In a logical data model, primary keys are present, whereas in a conceptual data model, no primary key is present.
  • In a logical data model, all attributes are specified within an entity. No attributes are specified in a conceptual data model.

Relationships between entities are specified using primary keys and foreign keys in a logical data model.

In a conceptual data model, the relationships are simply stated, not specified: we know that two entities are related, but we do not specify which attributes are used for the relationship.

[Figures: Logical Model Design; Physical Model Design]

Figure 5. A logical data model (Information Engineering notation).

You also need to identify the cardinality and optionality of a relationship (the UML combines the concepts of optionality and cardinality into the single concept of multiplicity). Cardinality represents the concept of “how many” whereas optionality represents the concept of “whether you must have something.” For example, it is not enough to know that customers place orders. How many orders can a customer place? None, one, or several? Furthermore, relationships are two-way streets: not only do customers place orders, but orders are placed by customers. This leads to questions like: how many customers can be involved in any given order, and is it possible to have an order with no customer involved? Figure 5 shows that customers place zero or more orders and that any given order is placed by one customer and one customer only. It also shows that a customer lives at one or more addresses and that any given address has zero or more customers living at it.

Although the UML distinguishes between different types of relationships (associations, inheritance, aggregation, composition, and dependency), data modelers often aren’t as concerned with this issue as object modelers are. Subtyping, one application of inheritance, is often found in data models, an example of which is the “is a” relationship between Item and its two “sub entities” Service and Product.
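The cardinality and optionality described for customers and orders (a customer places zero or more orders; an order is placed by exactly one customer) can be expressed as constraints once the model reaches a database. The following is a minimal sketch using Python's built-in `sqlite3` module; the table and column names are assumptions, not taken from Figure 5. A `NOT NULL` foreign key makes the customer mandatory for every order, while nothing forces a customer to have any orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    -- NOT NULL: an order must have a customer (optionality)
    -- single column: an order has only one customer (cardinality)
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
""")

# A customer with zero orders is allowed; then one order is placed.
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO customer_order (order_id, customer_id) VALUES (10, 1)")


def try_insert(sql):
    """Return 'ok' if the statement succeeds, 'rejected' on a constraint violation."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# An order with no customer, and an order for a customer that does not exist.
no_customer = try_insert(
    "INSERT INTO customer_order (order_id, customer_id) VALUES (11, NULL)"
)
unknown_customer = try_insert(
    "INSERT INTO customer_order (order_id, customer_id) VALUES (12, 99)"
)
print(no_customer, unknown_customer)  # rejected rejected
```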

Aggregation and composition are much less common and typically must be implied from the data model, as you see with the “part of” role that Line Item takes with Order. UML dependencies are typically a software construct and therefore wouldn’t appear on a data model, unless of course it was a very highly detailed physical model that showed how views, triggers, or stored procedures depended on the tables.

Logical Data Models (LDMs) represent data table (Entity Type) relationships.

Logical Data Model Notations

Entity Type

An entity type is a group of related data stored in an RDBMS (Relational Database Management System) table. An entity is an instance of an entity type, represented as a single row in a data table.

Relationships and Multiplicity

Relationships illustrate how two entity types are related. Cardinality specifies how many instances of an entity relate to one instance of another entity.

A physical data model represents how the model will be built in the database. A physical database model shows all table structures, including column names, column data types, column constraints, primary keys, foreign keys, and relationships between tables. Features of a physical data model include:

  • Specification of all tables and columns. Foreign keys are used to identify relationships between tables.
  • Denormalization may occur based on user requirements.
  • Physical considerations may cause the physical data model to be quite different from the logical data model.
  • The physical data model will differ between RDBMSs. For example, the data type for a column may be different in MySQL and SQL Server.
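To make the last point concrete, a logical attribute type can map to different column types depending on the target product. The table below is a small illustrative sketch: the logical type names and the `physical_type` helper are assumptions made here, and the chosen physical types are common options rather than the only valid ones.

```python
# Illustrative mapping from hypothetical logical types to product-specific column types.
TYPE_MAP = {
    "boolean":   {"mysql": "TINYINT(1)", "sqlserver": "BIT"},
    "timestamp": {"mysql": "DATETIME",   "sqlserver": "DATETIME2"},
    "long_text": {"mysql": "LONGTEXT",   "sqlserver": "NVARCHAR(MAX)"},
}


def physical_type(logical_type, dialect):
    """Look up the column type to use for a logical type in a given dialect."""
    return TYPE_MAP[logical_type][dialect]


print(physical_type("boolean", "mysql"))      # TINYINT(1)
print(physical_type("boolean", "sqlserver"))  # BIT
```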

Steps For Physical Data Model

  • Convert entities into tables.
  • Convert relationships into foreign keys.
  • Convert attributes into columns.
  • Modify the physical data model based on physical constraints / requirements.
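The first three steps above are mechanical enough to sketch as a small generator. The following example assumes a hypothetical `create_table_sql` helper and made-up table and column names; it turns an entity description into a `CREATE TABLE` statement and checks the result against an in-memory SQLite database. Identifiers are interpolated directly into the SQL text, which is acceptable for a trusted model definition but not for untrusted input.

```python
import sqlite3


def create_table_sql(table, columns, primary_key, foreign_keys=()):
    """Build a CREATE TABLE statement.

    columns: list of (column_name, sql_type) pairs, one per attribute
    primary_key: list of column names
    foreign_keys: iterable of (column, referenced_table, referenced_column)
    """
    parts = [f"{name} {sql_type}" for name, sql_type in columns]
    parts.append(f"PRIMARY KEY ({', '.join(primary_key)})")
    for column, ref_table, ref_column in foreign_keys:
        parts.append(f"FOREIGN KEY ({column}) REFERENCES {ref_table}({ref_column})")
    return f"CREATE TABLE {table} (\n    " + ",\n    ".join(parts) + "\n)"


# Entity -> table, attributes -> columns, relationship -> foreign key.
order_sql = create_table_sql(
    "customer_order",
    [("order_id", "INTEGER"), ("placed_on", "TEXT")],
    ["order_id"],
)
line_item_sql = create_table_sql(
    "line_item",
    [("line_item_id", "INTEGER"), ("order_id", "INTEGER"), ("quantity", "INTEGER")],
    ["line_item_id"],
    [("order_id", "customer_order", "order_id")],
)

conn = sqlite3.connect(":memory:")
conn.execute(order_sql)
conn.execute(line_item_sql)
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)  # ['customer_order', 'line_item']
```

The fourth step, adjusting for physical constraints, is where product-specific choices such as data types, indexes, and denormalization would be layered on top of the generated statements.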

Physical vs. logical

  • Entity names are now table names.
  • Attributes are now column names.
  • The data type for each column is specified. Data types can differ depending on the actual database being used.

Data modeling is the act of exploring data-oriented structures. Like other modeling artifacts, data models can be used for a variety of purposes, from high-level conceptual models to physical data models (PDMs). Physical data modeling is conceptually similar to design class modeling, the goal being to design the internal schema of a database, depicting the data tables, the data columns of those tables, and the relationships between the tables. Figure 1 presents a partial PDM for the university. You know that it isn’t complete because the Seminar table includes foreign keys to tables that aren’t shown, and many domain concepts such as course and professor are clearly not modeled. All but one of the boxes represent tables; the one exception is UniversityDB, which lists the stored procedures implemented within the database. Because the diagram is given the stereotype Physical Data Model, you know that the class boxes represent tables; without the diagram stereotype I would have needed to use the stereotype Table on each table.

Relationships between tables are modeled using standard UML notation; although not shown in the example, it would be reasonable to model composition and inheritance relationships between tables. Relationships are implemented via the use of keys (more on this below).

Figure 1. A partial PDM for the university.

When you are physical data modeling, the following tasks are performed in an iterative manner:

  • Identify tables. Tables are the database equivalent of classes; data is stored in physical tables. As you can see in Figure 1, the university has a Student table to store student data, a Course table to store course data, and so on. Figure 1 uses a UML-based notation (this is a publicly defined profile which anyone can provide input into). If you have a class model in place, a good start is to do a one-to-one mapping of your classes to data tables, an approach that works well in “greenfield” environments where you have the luxury of designing your database schema from scratch. Because this rarely happens in practice, you need to be prepared to be constrained by one or more legacy database schemas, which you will then need to map your classes to. In these situations it is unlikely that you will need to do much data modeling; you will simply need to learn to live with the existing data sources, but you will need to be able to read and understand existing models. In some cases you may need to perform legacy data analysis and model the existing schema before you can start working with it.
  • Normalize tables. Data normalization is a process in which data attributes within a data model are organized to increase the cohesion of tables and to reduce the coupling between tables. The fundamental goal is to ensure that data is stored in one and only one place. This is an important consideration for application developers because it is incredibly difficult to store objects in a relational database if a data attribute is stored in several places. The tables in Figure 1 are in third normal form (3NF).
  • Identify columns. A column is the database equivalent of an attribute, and each table will have one or more columns. For example, the Student table has attributes such as FirstName and StudentNumber. Unlike attributes in classes, which can either be primitive types or other objects, a column may only be a primitive type such as a char (a string), an int (integer), or a float.
  • Identify stored procedures. A stored procedure is conceptually similar to a global method implemented by the database. In Figure 1 you see that stored procedures such as averageMark() and studentsEnrolled() are modeled as operations of the class UniversityDB. These stored procedures implement code that works with data stored in the database; in this case they calculate the average mark of a student and count the number of students enrolled in a given seminar, respectively. Although some of these stored procedures clearly act on data contained in a single table, they are not modeled as part of the table (along the lines of methods being part of classes). Instead, because stored procedures are a part of the overall database and not a single table, they are modeled as part of a class with the name of the database.
  • Apply naming conventions. Your organization should have standards and guidelines applicable to data modeling, and if not you should lobby to have some put in place. As always, you should follow AM’s practice of Apply Modeling Standards.

  • Identify relationships. There are relationships between tables just like there are relationships between classes. The advice presented for relationships in UML class diagrams applies.
  • Apply data model patterns. Some data modelers will apply common data model patterns; David Hay’s (1996) book Data Model Patterns is the best reference on the subject. Data model patterns are conceptually closest to analysis patterns because they describe solutions to common domain issues. Hay’s book is a very good reference for anyone involved in analysis-level modeling, even when you’re taking an object approach instead of a data approach, because his patterns model business structures from a wide variety of business domains.
  • Assign keys. A key is one or more data attributes that uniquely identify a row in a table. A key that is two or more attributes is called a composite key. A primary key is the preferred key for an entity type whereas an alternate key (also known as a secondary key) is an alternative way to access rows within a table.
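The kinds of key described in the last task can be tried out directly. The following is a minimal sketch using Python's built-in `sqlite3` module: the `student` table has a single-column primary key and an alternate key, and the `course_section` table has a composite primary key. The `email` column and the whole `course_section` table are hypothetical and do not come from Figure 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,   -- primary key
    email          TEXT NOT NULL UNIQUE,  -- alternate (secondary) key
    first_name     TEXT NOT NULL
);
CREATE TABLE course_section (
    course_code    TEXT NOT NULL,
    section_number INTEGER NOT NULL,
    room           TEXT,
    PRIMARY KEY (course_code, section_number)  -- composite key
);
""")

conn.execute("INSERT INTO student VALUES (1, 'ada@example.edu', 'Ada')")
conn.execute("INSERT INTO course_section VALUES ('CS101', 1, 'A1')")
# Same course code with a different section number is a different key value.
conn.execute("INSERT INTO course_section VALUES ('CS101', 2, 'B2')")


def violates_key(sql):
    """Return True if the statement is rejected by a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_alternate = violates_key(
    "INSERT INTO student VALUES (2, 'ada@example.edu', 'Grace')"
)
duplicate_composite = violates_key(
    "INSERT INTO course_section VALUES ('CS101', 1, 'C3')"
)
print(duplicate_alternate, duplicate_composite)  # True True
```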

In a physical database a key would be formed of one or more table columns whose value(s) uniquely identify a row within a relational table. Primary keys are indicated using the PK stereotype and foreign keys via the FK stereotype. Although similar notation is used, it is interesting to note the differences between the PDM of Figure 1 and the UML class diagram on which it is based:

  • Keys. Whereas it is common practice not to model scaffolding properties on class models, it is common to model keys (the data equivalent of scaffolding).
  • Visibility. Visibility isn’t modeled for columns because they’re all public. However, because most databases support access control rights, you may want to model them using UML constraints, UML notes, or as business rules. Similarly, stored procedures are also public, so visibility isn’t modeled for them either.
  • No many-to-many associations. Relational databases are unable to natively support many-to-many associations, unlike objects, and as a result you need to resolve them via the addition of an associative table. The closest thing to an associative table in the PDM is WaitList, which was introduced to resolve the “on waiting list” many-to-many association depicted in the class diagram. A pure associative table comprises the primary key columns of the two tables between which it maintains the relationship, in this case StudentNumber from Student and SeminarOID from Seminar. Notice how in WaitList these columns have both a PK and an FK stereotype, because they make up the primary key of WaitList while at the same time being foreign keys to the other two tables. WaitList isn’t truly an associative table because it contains non-key columns, in this case the Added column, which is used to ensure that the first people on the waiting list are the ones given the opportunity to enroll if a seat becomes available. Had WaitList been a pure associative table, I would have applied the associative table stereotype to it.
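The WaitList design described above can be sketched with Python's built-in `sqlite3` module. The table and key column names (Student, Seminar, WaitList, StudentNumber, SeminarOID, Added) come from the text; the column types, the ISO 8601 text format for Added, the sample rows, and the `next_on_wait_list` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Student (
    StudentNumber INTEGER PRIMARY KEY,
    FirstName     TEXT NOT NULL
);
CREATE TABLE Seminar (
    SeminarOID INTEGER PRIMARY KEY
);
CREATE TABLE WaitList (
    -- Each key column is part of the primary key and also a foreign key.
    StudentNumber INTEGER NOT NULL REFERENCES Student(StudentNumber),
    SeminarOID    INTEGER NOT NULL REFERENCES Seminar(SeminarOID),
    Added         TEXT NOT NULL,  -- non-key column: when the student joined the list
    PRIMARY KEY (StudentNumber, SeminarOID)
);
""")

conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.execute("INSERT INTO Seminar VALUES (100)")
conn.executemany(
    "INSERT INTO WaitList VALUES (?, ?, ?)",
    [(2, 100, "2024-01-05T09:00:00"), (1, 100, "2024-01-03T14:30:00")],
)


def next_on_wait_list(seminar_oid):
    """Return the StudentNumber that has waited longest, or None if nobody is waiting."""
    row = conn.execute(
        "SELECT StudentNumber FROM WaitList "
        "WHERE SeminarOID = ? ORDER BY Added LIMIT 1",
        (seminar_oid,),
    ).fetchone()
    return row[0] if row else None


print(next_on_wait_list(100))  # 1
```

Storing Added as ISO 8601 text keeps the sketch portable, because such strings sort in chronological order; a production schema would more likely use the target database's native date and time type.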

Logical Versus Physical Database Modeling

After all business requirements have been gathered for a proposed database, they must be modeled. Models are created to visually represent the proposed database so that business requirements can easily be associated with database objects to ensure that all requirements have been completely and accurately gathered.

Different types of diagrams are typically produced to illustrate the business processes, rules, entities, and organizational units that have been identified. These diagrams often include entity relationship diagrams, process flow diagrams, and server model diagrams. An entity relationship diagram (ERD) represents the entities, or groups of information, and their relationships maintained for a business. Process flow diagrams represent business processes and the flow of data between different processes and entities that have been defined.

Server model diagrams represent a detailed picture of the database as being transformed from the business model into a relational database with tables, columns, and constraints. Basically, data modeling serves as a link between business needs and system requirements. Two types of data modeling are as follows:

  • Logical modeling
  • Physical modeling

If you are going to be working with databases, then it is important to understand the difference between logical and physical modeling, and how they relate to one another. Logical and physical modeling are described in more detail in the following subsections.

Logical Modeling

Logical modeling deals with gathering business requirements and converting those requirements into a model. The logical model revolves around the needs of the business, not the database, although the needs of the business are used to establish the needs of the database. Logical modeling involves gathering information about business processes, business entities (categories of data), and organizational units.

After this information is gathered, diagrams and reports are produced, including entity relationship diagrams, business process diagrams, and eventually process flow diagrams. The diagrams produced should show the processes and data that exist, as well as the relationships between business processes and data. Logical modeling should accurately render a visual representation of the activities and data relevant to a particular business.

Note: Logical modeling affects not only the direction of database design, but also indirectly affects the performance and administration of an implemented database. When time is invested performing logical modeling, more options become available for planning the design of the physical database.

The diagrams and documentation generated during logical modeling are used to determine whether the requirements of the business have been completely gathered. Management, developers, and end users alike review these diagrams and documentation to determine if more work is required before physical modeling commences. Typical deliverables of logical modeling include:

  • Entity relationship diagrams. An entity relationship diagram is also referred to as an analysis ERD. The point of the initial ERD is to provide the development team with a picture of the different categories of data for the business, as well as how these categories of data are related to one another.
  • Business process diagrams. The process model illustrates all the parent and child processes that are performed by individuals within a company. The process model gives the development team an idea of how data moves within the organization. Because process models illustrate the activities of individuals in the company, the process model can be used to determine how a database application interface is designed.
  • User feedback documentation

Physical Modeling

Physical modeling involves the actual design of a database according to the requirements that were established during logical modeling. Logical modeling mainly involves gathering the requirements of the business, with the latter part of logical modeling directed toward the goals and requirements of the database. Physical modeling deals with the conversion of the logical, or business model, into a relational database model. When physical modeling occurs, objects are being defined at the schema level. A schema is a group of related objects in a database. A database design effort is normally associated with one schema.

During physical modeling, objects such as tables and columns are created based on entities and attributes that were defined during logical modeling. Constraints are also defined, including primary keys, foreign keys, other unique keys, and check constraints. Views can be created from database tables to summarize data or to simply provide the user with another perspective of certain data. Other objects such as indexes and snapshots can also be defined during physical modeling. Physical modeling is when all the pieces come together to complete the process of defining a database for a business.
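Several of the objects named above can be seen together in a short sketch. The example below uses Python's built-in `sqlite3` module and a hypothetical `product` table (all names and sample values are assumptions) to define a primary key, a unique key, a check constraint, an index, and a summarizing view. Snapshots are product-specific and are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,              -- primary key
    sku        TEXT NOT NULL UNIQUE,             -- other unique key
    category   TEXT NOT NULL,
    price      REAL NOT NULL CHECK (price >= 0)  -- check constraint
);
-- Index to support lookups by category.
CREATE INDEX idx_product_category ON product(category);
-- View giving users a summarized perspective on the table.
CREATE VIEW category_summary AS
    SELECT category, COUNT(*) AS product_count
    FROM product
    GROUP BY category;
""")

conn.executemany(
    "INSERT INTO product (sku, category, price) VALUES (?, ?, ?)",
    [("A-1", "books", 12.5), ("A-2", "books", 8.0), ("B-1", "games", 30.0)],
)

summary = conn.execute(
    "SELECT category, product_count FROM category_summary ORDER BY category"
).fetchall()
print(summary)  # [('books', 2), ('games', 1)]

# The check constraint rejects a negative price.
try:
    conn.execute(
        "INSERT INTO product (sku, category, price) VALUES ('C-1', 'games', -1)"
    )
    negative_price_rejected = False
except sqlite3.IntegrityError:
    negative_price_rejected = True
print(negative_price_rejected)  # True
```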

Physical modeling is database software specific, meaning that the objects defined during physical modeling can vary depending on the relational database software being used. For example, most relational database systems vary in the way data types are represented and the way data is stored, although basic data types are conceptually the same among different implementations. Additionally, some database systems have objects that are not available in other database systems.

Implementation of the Physical Model

The implementation of the physical model is dependent on the hardware and software being used by the company.

The hardware can determine what type of software can be used because software is normally developed according to common hardware and operating system platforms. Some database software might only be available for Windows NT systems, whereas other software products such as Oracle are available on a wider range of operating system platforms, such as UNIX. The available hardware is also important during the implementation of the physical model because data is physically distributed onto one or more physical disk drives. Normally, the more physical drives available, the better the performance of the database after the implementation.

Some software products are now Java-based and can run on virtually any platform. Typically, the decisions to use particular hardware, operating system platforms, and database software are made in conjunction with one another.

A logical data model describes your model entities and how they relate to each other. A physical data model describes each entity in detail, including information about how you would implement the model using a particular (database) product. In a logical model describing a person in a family tree, each person node would have attributes such as name(s), date of birth, place of birth, etc.

The logical diagram would also show some kind of unique attribute or combination of attributes called a primary key that describes exactly one entry (a row in SQL) within this entity. The physical model for the person would contain implementation details. These details are things like data types, indexes, constraints, etc. The logical and physical model serve two different, but related purposes. A logical model is a way to draw your mental roadmap from a problem specification to an entity-based storage system.

The user (problem owner) must understand and approve the logical model. A physical model is the roadmap from the logical model to the hardware. The developer (software owner) must understand and use the physical model.

ERD

Consider a hospital: Patients are treated in a single ward by the doctors assigned to them. Usually each patient will be assigned a single doctor, but in rare cases they will have two. Healthcare assistants also attend to the patients; a number of these are associated with each ward. Initially the system will be concerned solely with drug treatment.

Each patient is required to take a variety of drugs a certain number of times per day and for varying lengths of time. The system must record details concerning patient treatment and staff payment. Some staff are paid part time and doctors and care assistants work varying amounts of overtime at varying rates (subject to grade). The system will also need to track what treatments are required for which patients and when and it should be capable of calculating the cost of treatment per week for each patient (though it is currently unclear to what use this information will be put).
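One possible starting point for the drug-treatment part of this scenario is sketched below with Python's built-in `sqlite3` module. It is not a complete answer to the exercise: every table and column name, the per-dose cost held in cents, the sample data, and the rule that a treatment lasting longer than seven days is costed for seven days are assumptions made here. Doctors, healthcare assistants, and staff pay are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE ward (
    ward_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    ward_id    INTEGER NOT NULL REFERENCES ward(ward_id)  -- treated in a single ward
);
CREATE TABLE drug (
    drug_id             INTEGER PRIMARY KEY,
    name                TEXT NOT NULL,
    cost_per_dose_cents INTEGER NOT NULL  -- assumed costing basis
);
CREATE TABLE prescription (
    patient_id    INTEGER NOT NULL REFERENCES patient(patient_id),
    drug_id       INTEGER NOT NULL REFERENCES drug(drug_id),
    doses_per_day INTEGER NOT NULL CHECK (doses_per_day > 0),
    duration_days INTEGER NOT NULL CHECK (duration_days > 0),
    PRIMARY KEY (patient_id, drug_id)
);
""")

conn.execute("INSERT INTO ward VALUES (1, 'Ward A')")
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)", [(1, "Patient One", 1), (2, "Patient Two", 1)]
)
conn.executemany(
    "INSERT INTO drug VALUES (?, ?, ?)", [(1, "Drug A", 50), (2, "Drug B", 200)]
)
conn.executemany(
    "INSERT INTO prescription VALUES (?, ?, ?, ?)",
    [(1, 1, 3, 10), (1, 2, 1, 2)],  # (patient, drug, doses per day, days)
)


def first_week_cost_cents(patient_id):
    """Cost of a patient's drugs over the first seven days of each prescription."""
    row = conn.execute(
        """
        SELECT COALESCE(
            SUM(p.doses_per_day * MIN(p.duration_days, 7) * d.cost_per_dose_cents), 0)
        FROM prescription AS p
        JOIN drug AS d ON d.drug_id = p.drug_id
        WHERE p.patient_id = ?
        """,
        (patient_id,),
    ).fetchone()
    return row[0]


print(first_week_cost_cents(1))  # 1450
```

For patient 1 the figure is 3 doses × 7 days × 50 cents plus 1 dose × 2 days × 200 cents. A fuller model would also need start dates to decide which calendar week a dose falls in.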

Cite this Page

Logical and Physical Data Models. (2017, Apr 06). Retrieved from https://phdessay.com/logical-and-physical-data-models/
