database: a collection of data organized in a structured way
relational database: a database that stores data in self-describing tables with relationships between them
relation: a table in a relational database
tuple: a row in a table; may also refer to an entire table
attribute: a characteristic or property of an entity (synonmous with field or column)
row: a horizontal entity in a table
column: a vertical entity in a table
table: a collection of rows and named columns
What is a relation (table)?
A table in a relational database
So, what is a table?
A collection of rows and named columns
A table looks like a spreadsheet but is constrained by more rules
The order of the rows must not matter—a database administrator may choose to reorder the rows and no program that accesses the database should be affected
The intersection of a row and a column is called a cell
No two columns in a table can have the same name (but it’s quite common for a column’s name to be repeated in different tables of the same database)
Example: a books table (relation)
What is a relational database?
A set of self-describing tables with relationships between them
Self-describing: For each table in the database there is a separate table or tables that describes that table’s structure
Example: the nycflights13 database, summarized visually
More advanced terms
schema: the structure of a database—tables, columns, relationships
data type: a characteristic of data, such as number, text, or date
cardinality: the number of rows in a table
domain: the set of values that an attribute can take
primary key: a unique identifier for each row in a table (can be a single column if that column has all unique values or a concatenated key of multiple columns if no single column has all unique values)
candidate key: a column or set of columns that can uniquely identify a row in a table (a primary key is a candidate key, perhaps the only one or one of several)
foreign key: a column that references the primary key of another table
Other advanced terms
query: a request for data or information from a database
SQL: a standard language for managing and manipulating relational databases
transaction: a sequence of operations performed as a single logical unit of work (transactions are the main purpose of most databases throughout history)
OLTP: online transaction processing (a database that is optimized for handling many small, fast transactions)
OLAP: online analytical processing (a database that is optimized for handling large, complex queries)
index: a data structure that improves the speed of data retrieval operations on a database table
Still other advanced terms
normalization: the process of organizing data in a database to reduce redundancy and improve data integrity
denormalization: the process of adding redundant data to a database to improve query performance
join: a database operation that combines rows from two or more tables based on a related column between them
view: a virtual table based on the result-set of an SQL statement
stored procedure: a set of SQL statements that are stored in the database and can be executed as a single unit
source of truth: a durable, reliable enterprise database upon which other databases can be constructed
Still other other advanced terms
trigger: a set of SQL statements that are automatically executed in response to certain events on a table or view in a database
backup: a copy of data that can be used to restore the original data in case of loss or corruption
recovery: the process of restoring data to a consistent state after a failure or error
ACID: a set of properties that guarantee database transactions are processed reliably (ACID stands for atomicity, consistency, isolation, and durability)
constraint: a rule that enforces data integrity in a database
anomaly: a situation in which data inconsistencies occur due to poor database design; can be one of three kinds: insertion, update, or deletion anomaly
END
Colophon
This slideshow was produced using quarto
Fonts are Roboto, Roboto Light, and Victor Mono Nerd Font