04aKeys

Database Management

Mick McQuaid

University of Texas at Austin

02 May 2026

Last time

  • We saw an entity-relationship diagram for the parts database including three entities: supplier, part, and quantity
  • Each entity had a primary key called sid, pid, or qid
  • The quantity entity also had two foreign keys, sid and pid
  • The process of entity-relationship diagramming allows us to identify the objects, events, and processes that will constitute tables, usually one table per entity
  • The primary key uniquely identifies a row in a table, making that row different from every other row in the table or relation
  • Foreign keys are valuable in connecting one table to another table, uniquely identifying a row in another table or relation
  • There is a process for identifying or creating a primary key

Some terms

  • A domain is the set of all possible values for a particular type of attribute, but may be used for more than one attribute, e.g., a person’s name is a domain that may be used for employee name or customer name.
  • A superkey is a set of one or more attributes that, when taken collectively, allows us to identify uniquely an entity or relation.
  • A candidate key is any superkey that can not be reduced to another superkey
  • A primary key is chosen arbitrarily from available candidate keys as the identifier for a tuple in the relation, also known as a row in the table.

Candidate and Primary Key Identification

  • Any unique tuple (also known as a row) is a superkey.
  • Suppose you have a list of people assigned to office numbers, 101–109.
  • Some of the people have identical names but only one person is assigned to each office. Furthermore, each office has exactly one phone with one phone number.
  • The Office relation has any combination of these attributes that includes phone number or office number or both as a superkey.
  • There are only two candidate keys, though: office number alone and phone number alone. Any other superkey may be reduced to one of these and still uniquely identify a row, satisfying the superkey property.
  • Either candidate key may be chosen as the primary key.

An example table

Office Number Person Phone
101 Smith 5-4072
102 O’Brien 5-3898
103 Julia 5-2124
104 Smith 5-9823

Notice that each row is unique but also each office number is unique and each phone is unique. We can identify either the office number or the phone as the primary key. Are these durable choices?

Primary keys

  • Problems exist with using numbers or text found in nature as a primary key
  • Things change—phone numbers change or a second phone is added to a room and sometimes people shuffle offices or double up in an office
  • It usually makes more sense to make up a primary key—in fact it is quite easy in most dbms packages to autoincrement an integer as the primary key each time a row is addd to a table
  • Sometimes we create a concatenated key, also known as a composite key, consisting of several columns together, especially in the case of associative entities, entities that link two other entities

Functional Dependencies

  • In any relation R, a set of attributes B is functionally dependent on another set of attributes A if, at each instant of time, each A value is associated with only one B value. Such a functional dependence is written {A} \(\rightarrow\) {B} and may be read A determines B. The left hand side is called the determinant.
  • Example: Given the above Office relation, A is Room 101 and B is Winston Smith. There is only one occupant per room so knowing the room number determines the occupant.
  • A functional dependency is considered trivial if the LHS is a superset of the RHS. If so, it has the form {A,B}\(\rightarrow\){B}. In other words, if A does not determine B we could still say that A and B together determine B but it would be trivial to say so.

Foreign keys

  • Think back to the flights table we saw in 02aTerms
  • The airports table contained a primary key column called faa, which is a three-letter code for an airport. For example, Austin’s faa code is AUS
  • The flights table contained two columns with airport codes: origin and dest. Each of these is a foreign key when it appears in the flights table
  • It can be used, for example, to get the full name of an airport to replace the three-letter code in a message
  • The foreign key is usually denoted in italics or oblique type. If these are not available, a substitute is to write (fk) after the name

END

Colophon

This slideshow was produced using quarto

Fonts are Roboto, Roboto Light, and Victor Mono Nerd Font