 
 
In the module on "Database Introduction," we gave an overview 
of relational databases. In the current module, we develop this data model more 
carefully and include its mathematical basis and notation. For clarity, we will 
repeat and expand some definitions. 
The mathematical structure of relational databases results 
in great flexibility than some other database models by allowing queries that 
the database designers did not originally envision. At the heart and even the 
name of relational databases is the concept of relation, which depends on the 
concepts of ordered tuples and Cartesian product. Some mathematics course use 
the concept of ordered pairs of numbers: When asked to graph y = x2 
over the set of real numbers, Â, we plot S = 
{(x, y) | y = x2, x  Â}, 
the set of all ordered pairs of real numbers where the second coordinate is the 
square of the first. The ordered pair (3, 9)
 Â}, 
the set of all ordered pairs of real numbers where the second coordinate is the 
square of the first. The ordered pair (3, 9)  S, 
because 32 = 9; but (9, 3) Ï S, because 
92  3. S is a subset of the Cartesian product  
´ Â, which is the set of all ordered pairs of elements where the first 
and second elements are from Â.
 S, 
because 32 = 9; but (9, 3) Ï S, because 
92  3. S is a subset of the Cartesian product  
´ Â, which is the set of all ordered pairs of elements where the first 
and second elements are from Â. 
The first and second sets of the cross product do not 
have to be the same. Suppose two students are picked at random as subjects for 
a science class to examine various characteristics. The set P = {SubjectA, 
SubjectB} holds the designations for the subjects. One characteristic to examine 
is whether a person is Rh+ or Rh-. We define the set Rh 
= {+, -}. An ordered pair consisting of the person's designation followed by + 
or - indicates the subject and his or her Rh type. A listing of all possible pairs 
is the Cartesian product of P and Rh:
 
  
  P´Rh = {(SubjectA, +), (SubjectA, -), 
  (SubjectB, +), (SubjectB, -)}.
 
We can take the Cartesian product of several sets. For 
example, another characteristic is the blood type of M, N, or MN. We let T 
= {M, N, MN}. The Cartesian product of P, Rh, and T, P´Rh´T, 
has (2)(2)(3) = 12 elements and is
   
    | P´Rh´T 
      = | {(SubjectA, +, M), (SubjectA, +, N), (SubjectA, +, MN), (SubjectA, -, M), (SubjectA, -, N), (SubjectA, -, MN),
 (SubjectB, +, M), (SubjectB, +, N), (SubjectB, +, MN),
 (SubjectB, -, M), (SubjectB, -, N), (SubjectB, -, MN)}.
 | 
 
   
    | Definition | Let A1, A2, A3,..., An 
      be sets. The Cartesian 
       or cross 
      product   of these sets is 
 
   That is, the Cartesian product of sets is the set 
      of all ordered n-tuples or tuples 
      where each coordinate comes from the corresponding set.
 | 
   
    |  | 
   
    | The Cartesian product and the Cartesian coordinate system are named after 
      René Descartes, who invented 
      analytical geometry in the 17th century. Descartes, of fragile 
      health, was fond of spending most of the morning in bed, resting and thinking. 
      This habit of a lifetime was disrupted when he was invited by the Queen 
      of Sweden to Stockholm. Her demands to work with him in the early morning 
      are said to have contributed to his premature death within a few months 
      from pneumonia. | 
In mathematics or relational databases, a relation 
is just a subset of the Cartesian product of sets; it has some, but not necessarily 
all, the elements in the Cartesian product. Thus, {(x, y) | y 
= x2, x  Â} 
is a relation on  ´ Â, and this set indicates how 
y is related to x. The set CurrentBioStudy = {(SubjectB, 
-, N), (SubjectA, +, M)}, giving specific relationship among the people, their 
Rh factor, and their blood types, is a subset of the Cartesian product of P, 
Rh, and T, P´Rh´T, 
and thus is a relation on P´Rh´T. 
This relation has degree 3 because the relation is a subset of the Cartesian 
product of 3 sets.
 Â} 
is a relation on  ´ Â, and this set indicates how 
y is related to x. The set CurrentBioStudy = {(SubjectB, 
-, N), (SubjectA, +, M)}, giving specific relationship among the people, their 
Rh factor, and their blood types, is a subset of the Cartesian product of P, 
Rh, and T, P´Rh´T, 
and thus is a relation on P´Rh´T. 
This relation has degree 3 because the relation is a subset of the Cartesian 
product of 3 sets. 
   
    | Definition | Let A1, A2, A3,..., An 
      be sets. A relation on  is a subset of  . The degree 
      of this relation is n. | 
We can picture a relation as a two-dimensional table, 
such as follows: 
 
   
    | CurrentBioStudy | 
   
    | subject | RhFactor | BloodType | 
   
    | SubjectB | + | N | 
   
    | SubjectA | + | M | 
A tuple, such as (SubjectB, -, N), corresponds to a row of the table. A column 
heading, such as RhFactor, is an attribute; and the data type or 
the set of all possible values of a possible values for an attribute, such as 
{+, -} for RhFactor, is the domain of the attribute. Moreover, the 
values in a domain must be atomic or indivisible with respect to the model. 
Thus, a cell in a table contains a single value, such as +, that we do not subdivide. 
In terms of an ER diagram, a tuple is an entity with three attributes, as the 
following illustrates: 
   
    | Definition | An attribute corresponds 
      to a column in a relation. A value is atomic 
      if it is indivisible. The domain of 
      an attribute is the set of all possible values of that attribute; each value 
      must be atomic. | 
A relation, such as CurrentBioStudy, 
is a particular instance of a relation schema, which we might call BioStudy_schema. 
For a relation or table, the relation schema describes the column headings for 
a table. The schema (such as BioStudy_schema) is a list of attributes (such 
as subject, RhFactor, and BloodType) with their corresponding 
domains (such as string, {-, +}, and {M, N, MN}). (xxx should types be 
char and string instead of an enum type?) We can denote the relation schema BioStudy_schema 
as the name with the attributes and their domains in parentheses, as follows: 
 
  
  BioStudy_schema = (subject:string, RhFactor:{+, -}, BloodType:{M, 
  N, MN}) 
 
For the tuple t = (SubjectB, +, N) in relation CurrentBioStudy, 
"SubjectB" is the value corresponding to the subject attribute; the value 
of the RhFactor attribute is +; and "N" is the instance of the BloodType 
attribute. 
   
    | Definition | A relation schema S 
      is a list of attributes, A1, A2, A3,..., 
      An, with their corresponding domains, D1, 
      D2, D3,..., Dn, written 
      as 
 S = (A1:D1, A2: 
        D2, A3: D3,É, An: 
        Dn)
 | 
 
   
    | Quick Review Questions | 
   
    | A psychology relational database stores data 
      from a number of experiments. One conditioning experiment with rats involves 
      two-lever boxes. The following is a list of possible events and their event 
      codes: 
 
 
         
          | Code | Event |   
          | 1 | Left press |   
          | 2 | Right press |   
          | 3 | Turn on all panel lights |   
          | 4 | Turn off all panel lights |   
          | 5 | Pellet delivery |   
          | 6 | Retract right lever |   
          | 7 | Insert right lever |   
          | 8 | Pulse tone on |   
          | 10 | End of session |  
 The computer monitors the these events along with 
      the times of occurrence Data from a session might be as follows:
 
 
  
        Each rat has its own integer identification number, 
      such as 23. 
           
            
		2     0.10
 		2     1.48
		5     1.48
		2     3.57
		1     4.72
		5     4.72
		.
 		.
		.
		2  1334.25
		2  1335.40
		1  1336.72
		1  1337.10
		5  1337.10
		10  1337.71
					 
 
 
         
          | Quick Review Question 1 
            Complete the relation... 
 PsyExp_schema = (EventNumber: a , time: b , SubjectID: 
            c )
 
 |   
          | Quick Review Question 2 Give 
            a schema, PsyEventCode_schema, for relations that store events 
            (event) along with their codes (code). Have exactly one blank before 
            and after = and after each comma. |  |  | 
 
  Properties of Relations 
The following is a list of properties of relations:
  - A value of an attribute is atomic.
- Attribute names in a relation are distinct.
- The order of tuples in a relation is irrelevant. Thus, because CurrentBioStudy 
    is a set, we have the same relation if (SubjectB, -, N) is the first row of 
    the table CurrentBioStudy and (SubjectA, +, M) is the second row, or 
    vice versa.
- Tuples are distinct. Thus, a table cannot have two identical rows.
In the relation schema BioStudy_schema, 
each human subject has a unique designation, or unique value of subject. 
We say that the attribute subject is a superkey. A superkey 
is an attribute or set of attributes that uniquely identifies a particular tuple 
in a relation. 
We could use {subject, RhFactor}, {subject, 
BloodType}, or even {subject, RhFactor, BloodType} as a superkey because 
the values of the attributes uniquely identifies a tuple. When several attributes 
make up the superkey, we call the superkey a composite key. As another 
example, suppose instead of using generic titles, such as SubjectA, we use the 
subjectÕs first, middle, and last names. Assuming no two people have the same 
name, the set of attributes, {FirstName, MiddleName, LastName}, forms a 
composite key. In this case, this superkey is a candidate key because no 
proper subset is a superkey. In the earlier version, {subject, RhFactor} 
is not a candidate key because the proper subset {subject} is a key. A 
candidate key has the properties of uniqueness and irreducibility. The key uniquely 
identifies a tuple, and no proper subset uniquely identifies a tuple. 
A relation can have more than one candidate key. For example, 
our relation could also have attributes for the subject as well as the name. The 
attribute subject and the set of attributes {FirstName, MiddleName, 
LastName} are both candidate keys. If we use subject to uniquely identify 
tuples, then subject becomes the primary key or key. In fact, 
we would be wise to pick SSN instead of {FirstName, MiddleName, LastName} 
because of the possibility that for some class two people could have the same 
name. In a relation schema, we underline the primary key, such as follows:
  
 
  
  (subject, FirstName, MiddleName, LastName, RhFactor, BloodType)
Because we use the primary key to identify tuples, clearly 
the primary key cannot have a value of null. We call this requirement the entity 
integrity constraint.
   
    | Definition | A superkey is an attribute 
      or set of attributes that uniquely identifies a particular tuple in a relation. 
      A candidate key is a superkey in which 
      no proper subset is a superkey. The primary key 
      or key is the candidate key that we 
      choose to uniquely identify tuples. The entity 
      integrity constraint requires that no primary key value be null. | 
 
   
    | Example 1 (Idea for Example 
      from Teacher's Manual to Accompany Laboratory Manual General Chemistry Principles 
      and Structure, Jo A. Beran and James E. Brady, John Wiley & Sons, Inc., 
      New York, 1978, p. 76) 
 The molecular weight of a non-volatile nonelectrolyte 
      can be determined by dissolving this solute in an organic solvent, such 
      as benzene, which has a freezing point of f.p. = 5.5oC and a 
      freezing point depression constant of Kf = 5.12oC kg/mole. (The freezing point depression constant for a solvent is how much the freezing 
      point of a solution is lowered when one kg of solvent is dissolved in one 
      mole of the solute.) We record temperature versus time for the solution 
      as it cools. The temperature when the graph levels off is the freezing point. 
      For example, suppose we dissolve a 13.5-gram sample of nonelectrolyte in 
      100 grams of benzene, and the solution freezes at 2.95oC. The 
      molecular weight of the nonelectrolyte is 234 g/mole, as the following shows:
 
 
   Suppose a database stores experimental data and 
      known molecular weights for a number of compounds. One relation, ChemKnowns, 
      stores the names of compounds (KnownCompound) and their floating 
      point molecular weights (MolecularWeight). Consequently, the relation 
      schema for ChemKnowns is
 
 ChemKnowns_schema = (KnownCompound:string, MolecularWeight:float)
 Because each compound has a unique name, the key 
      for the ChemKnowns relation is the attribute KnownCompound. 
      A partial listing of the relation is follows:
 
 ChemKnowns
 
         
          | KnownCompound | MolecularWeight |   
          | Anthracene | 178.23 |   
          | Benzene | 78.11 |   
          | Benzil | 210.23 |   
          | Benzophenone | 182.22 |   
          | Biphenyl | 154.21 |   
          | p-Dichlorobenzene | 147.00 |  Although the tuples appear with the key in alphabetical order, there is 
      no requirement that they be in such an order.
 
 Another relation, ChemSolvents, 
      stores the names of possible organic solvents (SolventName) along 
      with each freezing point (FreezingPoint) and freezing point depression 
      constant (Kf). The relation schema follows:
 
 ChemSolvents_schema = (SolventName: string, FreezingPoint: 
        float, Kf:float)
 Again, the name attribute, SolventName, is 
      the key for the relation. A partial list of this relation follows:
 
 ChemSolvents
 
         
          | SolventName | FreezingPoint | Kf |   
          | Benzene | 5.5 | 5.12 |   
          | Cyclohezane | 6.5 | 20.2 |   
          | p-Dichlorobenzene | 53 | 7.1 |   
          | Glacial acetic acid | 17 | 3.9 |  Although the solvent name "Glacial acetic acid" has three words, the value 
      is atomic. In the relation, we do not split the value into three parts.
 
 The experiments have 
      distinct identification numbers, such as 10. To determine the molecular 
      weight of an unknown solute, we record the mass of the solute in grams, 
      solvent name and its mass in kg, and the freezing point of the solution. 
      We have a relation, ChemExperiments, for this data with schema
 
 ChemExperiments_schema = (ExpNumber:integer, SoluteMass:float, 
        SolventName:string, SolventMass: float, FP: float)
 Notice that SolventName is the key in the 
      relation ChemSolvents. In this case, we call SolventName a 
      foreign key in the relation ChemExperiments. Even if the attribute 
      names had not been identical, they would match. For example, we could refer 
      to CompoundName in ChemSolvents and SolventName in 
      ChemExperiments, but SolventName would still be a foreign 
      key.
 
 By the referential integrity constraint, 
      any value (other than null) for a foreign key must appear as a candidate 
      key value in the other relation. A tuple in one relation refers to a tuple, 
      which must exist, in another relation. We indicate the referential integrity 
      constraint on the database schema by drawing an arrow from the foreign key 
      to the attribute in the relation that it references, such as follows:
 
 ChemKnowns_schema
 
         
          | KnownCompound | MolecularWeight |  ChemSolvents_schema
 
         
          | SolventName | FreezingPoint | Kf |  ChemExperiments_schema
 
         
          | ExpNumber | SoluteMass | SolventName | SolventMass | FP |  
 Later, we will use such foreign keys to make connections 
      among the data. We will return to this example as we develop additional 
      database concepts.
 | 
   
    | Definition | A foreign key is an attribute 
      or set of attributes in a relation that matches a candidate key in a relation. 
      The referential integrity constraint 
      states if relation B has a foreign key k from relation A, 
      then each value of k in B (other than null) must appear as 
      a value for the corresponding candidate key in some tuple of A. | 
 
  Selection Operation 
One of the most common operations on a relation is to 
select a subset of the tuples or rows. This selection operation, symbolized 
with the lowercase Greek letter s (s), employs a selection condition or 
criterion for determining which tuples to choose. The general notation for the 
operation is sselection condition(relation). 
For example, to retrieve the molecular weight of Benzene, we select the tuple 
from the ChemKnowns relation that has a KnownCompound 
value of "Benzene". We can write the operation mathematically as sKnownCompound 
= Benzene(ChemKnowns). The operation will return the tuple (Benzene, 
78.11). 
 
  Projection Operation 
Instead of picking a subset of the rows, we often need 
to select a subset of columns or attribute values. To do so, we use the projection 
operation, which we symbolize using the lowercase Greek letter p. 
For example, to obtain a list of the compounds whose molecular weight is in the chemistry database, we have the notation pKnownCompound(ChemKnowns). 
The operation returns a relation with only the KnownCompound attribute 
column, as follows: 
  pKnownCompound(ChemKnowns) 
   
    | KnownCompound | 
   
    | Anthracene | 
   
    | Benzene | 
   
    | Benzil | 
   
    | Benzophenone | 
   
    | Biphenyl | 
   
    | p-Dichlorobenzene | 
Suppose we want a table of the solvent names and their 
freezing points, such as follows:
   
    | SolventName | FreezingPoint | 
   
    | Benzene | 5.5 | 
   
    | Cyclohezane | 6.5 | 
   
    | p-Dichlorobenzene | 53 | 
   
    | Glacial acetic acid | 17 | 
In this case, we list both attributes in the subscript of p and have the relation 
name ChemSolvents in parentheses: pSolventName,FreezingPoint(ChemSolvents). 
If we choose to name this new table SolventFP, we can employ the following 
left-arrow notation: 
  
  SolventFP  ¬ pSolventName,FreezingPoint(ChemSolvents) 
 
  Union Operations 
We can perform set operations, such as union, intersection, 
difference, and Cartesian product, on relations. For the first three operations, 
we need relations with tuples that are of the same type, in which case the relations 
are union compatible. To be union compatible, the relations must have the 
same number of attributes and corresponding domains must be the same. However, 
corresponding attribute names do not have to be identical. 
   
    | Definition | Two relations R and S are union 
      compatible if they have the same degree n and corresponding 
      attributes have the same domain. | 
The relations ChemKnowns 
and ChemSolvents are not union compatible because 
the former has two attributes, while the later has three. However, the projections 
of these relations to their first attributes with domains of string are union 
compatible. We call these new relations, which follow, KnownNamesTable 
and SolventNamesTable.
  KnownNamesTable ¬ pKnownCompound(ChemKnowns) 
  
  SolventNamesTable ¬ pSolventName(ChemSolvents
   
    | KnownNamesTable | SolventNamesTable | 
   
    | KnownCompound | SolventName | 
  
   
    | Anthracene | Benzene | 
   
    | Benzene | Cyclohezane | 
   
    | Benzil | p-Dichlorobenzene | 
   
    | Benzophenone | Glacial acetic acid | 
   
    | Biphenyl |  | 
   
    | p-Dichlorobenzene |  | 
The union,  , of two 
union-compatible relations is a relation consisting of all the tuples from the 
relations with duplicates omitted. For example, to obtain the names of all compounds, 
knowns and solvents, in the (yyy link) chemical database, we perform the 
union of KnownNamesTable and SolventNamesTable, KnownNamesTable
, of two 
union-compatible relations is a relation consisting of all the tuples from the 
relations with duplicates omitted. For example, to obtain the names of all compounds, 
knowns and solvents, in the (yyy link) chemical database, we perform the 
union of KnownNamesTable and SolventNamesTable, KnownNamesTable 
 SolventNamesTable, which yields the following 
relation:
  SolventNamesTable, which yields the following 
relation: 
  KnownNamesTable  SolventNamesTable
  SolventNamesTable 
   
    | Anthracene | 
   
    | Benzene | 
   
    | Benzil | 
   
    | Benzophenone | 
   
    | Biphenyl | 
   
    | p-Dichlorobenzene | 
   
    | Cyclohezane | 
   
    | Glacial acetic acid | 
Notice that Benzene and p-Dichlorobenzene each appear in both tables but are present 
only once in the union because duplicate tuples do not appear. 
   
    | Definition | The union of two union-compatible 
      relations R and S, written R  S, is a relation consisting of all the tuples from the relations 
      with duplicates omitted. | 
 
  Intersection Operation 
Suppose we want a list of solvents for which we have the 
molecular weight information that appears in ChemKnowns. 
In this case, we perform the intersection of KnownNamesTable and SolventNamesTable,
KnownNamesTable  SolventNamesTable, which 
yields the following relation:
  SolventNamesTable, which 
yields the following relation: 
  KnownNamesTable  SolventNamesTable
  SolventNamesTable 
   
    | Benzene | 
   
    | p-Dichlorobenzene | 
   
    | Definition | The intersection 
      of two 
      union-compatible relations R and S, written R  S, is a relation consisting of only the tuples that appear in both 
      relations. | 
 
  Difference Operation 
Perhaps, we need to know the solvents for which the database 
does not store their molecular weights. In this case, we start with the SolventNamesTable 
and take out the solvent names that also appear in the KnownNamesTable. 
This set difference operation, KnownNamesTable - SolventNamesTable, returns 
the following relation that does not contain Benzene or p-Dichlorobenzene: 
  KnownNamesTable - SolventNamesTable 
   
    | Anthracene | 
   
    | Benzil | 
   
    | Benzophenone | 
   
    | Biphenyl | 
   
    | Definition | The difference of two 
      union-compatible relations R and S, written R - S, 
      is a relation consisting of the tuples that appear in R but not in S. | 
 
  Cartesian Product Operation 
The set operation of Cartesian 
product does not require that the operands be union compatible. For example, the 
relation ChemSolvents has three attributes 
(SolventName, FreezingPoint, and Kf), and the relation ChemKnowns 
has two (KnownCompound and MolecularWeight). The Cartesian product 
of ChemSolvents and ChemKnowns, ChemSolvents  ChemKnowns, has all five attributes. Every one of the 4 tuples in ChemSolvents 
is combined with every one of the 6 tuples of ChemKnowns to form a relation 
with 24 tuples. This new relation, CrossTable, follows in part:
 
 ChemKnowns, has all five attributes. Every one of the 4 tuples in ChemSolvents 
is combined with every one of the 6 tuples of ChemKnowns to form a relation 
with 24 tuples. This new relation, CrossTable, follows in part: 
  CrossTable <-- ChemSolvents  ChemKnowns
  ChemKnowns 
   
    | SolventName | FreezingPoint | Kf | KnownCompound | MolecularWeight | 
   
    | Benzene | 5.5 | 5.12 | Anthracene | 178.23 | 
   
    | Benzene | 5.5 | 5.12 | Benzene | 78.11 | 
   
    | Benzene | 5.5 | 5.12 | Benzil | 210.23 | 
   
    | Benzene | 5.5 | 5.12 | Benzophenone | 182.22 | 
   
    | Benzene | 5.5 | 5.12 | Biphenyl | 154.21 | 
   
    | Benzene | 5.5 | 5.12 | p-Dichlorobenzene | 147.00 | 
   
    | Cyclohezane | 6.5 | 20.2 | Anthracene | 178.23 | 
   
    | Cyclohezane | 6.5 | 20.2 | Benzene | 78.11 | 
   
    | . .
 .
 | . .
 .
 | . .
 .
 | . .
 .
 | . .
 .
 | 
   
    | Definition | The Cartesian product 
      of two relations R and S, written R  S, is a relation consisting of the tuples with attributes from R 
      and S, where each tuple of R is paired with every tuple of S. | 
However, we probably are only interested in the subset 
or selection of the tuples in the CrossTable 
relation where the SolventName and KnownCompound agree. We store 
these in a relation KnownSolventTable, as follows: 
 
  SelectCrossTable <-- sSolventName 
  = KnownCompound(CrossTable) 
   
    | SolventName | FreezingPoint | Kf | KnownCompound | MolecularWeight | 
   
    | Benzene | 5.5 | 5.12 | Benzene | 78.11 | 
   
    | p-Dichlorobenzene | 53 | 7.1 | p-Dichlorobenzene | 147.00 | 
To eliminate the duplicate columns, we perform a projection 
with all of the attributes except KnownCompound, as follows: 
  KnownSolventTable <-- pSolventName,
  FreezingPoint, Kf, MolecularWeight(SelectCrossTable) 
   
    | SolventName | FreezingPoint | Kf | MolecularWeight | 
   
    | Benzene | 5.5 | 5.12 | 78.11 | 
   
    | p-Dichlorobenzene | 53 | 7.1 | 147.00 | 
This process of performing the Cartesian product, selecting 
tuples that match in a particular attribute, and projecting to eliminate duplicate 
columns is so common that another relational database operation, join, accomplishes 
the result in one step. 
 
  Join Operation 
We just used the Cartesian product and selection to obtain 
a table SelectCrossTable of related tuples 
from two relations. Because of its utility, such a sequence of operations is combined 
into a single operation, join. The join of relations R and S, 
written R|X|conditionS, is the set of tuples 
from the Cartesian product of R and S that satisfy condition. The 
condition compares an attribute of R with an attribute of S using 
one of the relational operators (=, , <, ², >, ³). Thus, to generate the 
relation SelectCrossTable, we perform the join operation 
 
  SelectCrossTable <-- ChemSolvents|X|SolventName 
  = KnownCompoundChemKnowns 
The join where the condition is a test of equality is 
used so often that we give it a special name, equijoin. As with the Cartesian 
product followed by selection, the equijoin generates a table with two identical 
columns. We eliminated this duplication above by projecting the table on all attributes 
except the repeated one(s). Equijoin with elimination of duplication is so common 
that we give the operation a special name, natural join, and notation, 
*. Thus, to obtain KnownSolventTable from ChemSolvents and ChemKnowns, 
we can perform the natural join, as follows: 
 
  KnownSolventTable ¬ChemSolvents *SolventName 
  = KnownCompound ChemKnowns
   
    | Definition | The join of relations
      R and S, written R |X|condition 
      S, is the set of tuples from the Cartesian product of R and S 
      that satisfy condition. The condition compares an attribute of R 
      with an attribute of S using one of the relational operators (=, , <, ², >, ³). 
      An equijoin of relations is a join where condition 
      is a test of equality. A natural join 
      of R and S, written R *  RA = SA 
      S, is the projection of R |X| RA = SA S to 
      eliminate duplicate attributes. | 
 
   
    | Quick Review Question | 
   
    | Quick Review Quesion 15   
      From the psychology database, relation EventCode is the table of Quick 
      Review Question 1 and relation. 
 
 
         
          | session |   
          | EventNumber | time | SubjectID |   
          | 2 | 1.22 | 3 |   
          | 5 | 2.71 | 3 |   
          | 1 | 3.02 | 3 |   
          | 2 | 5.66 | 3 |   
          | 5 | 5.91 | 3 |   
          | 2 | 7.55 | 3 |   
          | 10 | 8.00 | 3 |  
 | 
 
  Exercises 
1. Consider the psychology relational database of QRQ 1 
and QRQ 15. Give the relational operation and the resulting 
relation for each of the following queries. 
 a. Retrieve the meaning of code 4. 
  b. For relation session, obtain a list of all event codes. 
  c. For session, list all events that occurred. 
  d. Give a table of all the tuples in session where the event occurred 
  before time 4.00. 
2. Consider the biology relational database with relation schema BioStudy_schema 
and relation CurrentBioStudy from the beginning 
of this module. Suppose schema details_schema is as follows: 
 
 details_schema = (DetailSubject:string,
  sex:{F, 
  M}) 
Relation CurrentDetails follows: 
   
    | DetailSubject | Sex | 
   
    | SubjectA | M | 
   
    | SubjectB | F | 
   
    | SubjectC | F | 
   
    | SubjectD | M | 
Give the relational operation and the resulting relation for each of the following 
queries. 
a. Generate a relation with the information from CurrentBioStudy 
along with the sex of the subject. 
b. Obtain a list of Rh- female subjects in the current study. 
Suppose PreviousBioStudy has schema BioStudy_schema and is the following: 
   
    | Subject | RhFactor | BloodType | 
   
    | SubjectD | + | MN | 
   
    | SubjectC | - | M | 
   
    | SubjectB | + | N | 
   
    | SubjectE | + | N | 
c. Take the union of PreviousBioStudy and CurrentBioStudy. 
d. Take the intersection of PreviousBioStudy and CurrentBioStudy. 
e. Take the difference PreviousBioStudy - CurrentBioStudy. 
f. Take the difference CurrentBioStudy - PreviousBioStudy. 
3. Consider the chemical relational database of the Example 
in this module. Give the relational operation and the resulting relation for each 
of the following queries. 
a. Obtain the molecular weight of Anthracene. 
b. Return a relation with the information of ChemExperiments along 
with the freezing point and Kf of the solvent. 
4. Suppose relations exp25 and exp3 have schemas PsyExp_schema 
in the psychology database of QRQ 1. An abbreviated listing 
of the relations follows: 
   
    |  | exp25 |  |  |  | exp3 |  | 
   
    | EventNumber | time | SubjectID |  | EventNumber | time | SubjectID | 
   
    | 2 | .10 | 25 |  | 2 | 1.22 | 3 | 
   
    | 2 | 1.48 | 25 |  | 5 | 2.71 | 3 | 
   
    | 5 | 1.48 | 25 |  | 1 | 3.02 | 3 | 
   
    | 2 | 3.57 | 25 |  | 2 | 5.66 | 3 | 
   
    | 1 | 4.72 | 25 |  | 5 | 5.91 | 3 | 
   
    | 5 | 4.72 | 25 |  | 2 | 7.55 | 3 | 
   
    | 2 | 6.81 | 25 |  | 10 | 8.00 | 3 | 
   
    | 1 | 7.63 | 25 |  |  |  |  | 
  
   
    | 10 | 8.00 | 25 |  |  |  |  | 
a. How many attributes are in exp25  exp3?
 
 exp3? 
b. How many tuples are in this union? 
c. Give the intersection. 
d. Give exp25 - exp3. 
e. Give exp3 - exp25. 
f. How many attributes are in the Cartesian product? 
g. How many tuples are in the Cartesian product? 
 
  Projects 
1. Draw a relational database schema for the physics 
database in the Entity 
Relation Module. 
2. Below is a description of a file that holds star data. (The description 
and data are from Dr. Daniel Welch, Wofford College.) Give a relational database 
schema for the data.
The first few data items are as follows: 
AND,3,f|S|K0,23:04:11,50:03,4.64,2000
AND,51,f|S|K3,01:37:59,48:37,3.57,2000
AND,58,f|S|A5,02:08:29,37:51,4.8,2000
AND,60,f|S|K4,02:13:13,44:14,4.83,2000
AND,65,f|S|K4,02:25:37,50:17,4.72,2000
AND,7,f|S|F0,23:12:33,49:25,4.52,2000
AND,8,f|D|M2,23:17:44,49:01,4.83,2000
AND,Alpheratz,f|D|B8,00:08:23,29:05,2.07,2000
AND,Chi,f|S|G8,01:39:21,44:24,4.98,2000
AND,Delta,f|D|K3,00:39:20,30:52,3.25,2000
A description of the data follows: 
 *
* Each major category begins with a line starting with # for easy searching.
*
* Each object occupies one line. Fields are separated with commas. Some fields
* are further subdivided into subfields with vertical bars (|). Lines
* beginning with anything other than a-z, A-Z or 0-9 are ignored and may be
* used for comments. Objects may be in any order.
*
* all date fields may be in either of two forms:
* 1) month/day/year, where day may contain a trailing decimal portion.
* examples: 1/1/1993 and 1/1.234/1993
* 2) a real-number, such as 1993.123.
*
* The first three fields are always Constellation's Name, Object's Name, and Type.
* Remaining fields depend on the first entry within the Type field.
*
* The first field abbreviates the constellation name, such as AND for Andromeda.
* The second field is the star name within the constellation.
*
* The Type field always starts with a single letter designating the form of
* the object's motion:
* f: fixed (no proper motion)
* e: heliocentric elliptical orbit
* h: heliocentric hyperbolic orbit
* p: heliocentric parabolic orbit
*
* if "Type" is fixed, an object class code may follow in the next subfield:
* C: Cluster, globular
* U: Cluster, with nebulosity
* O: Cluster, open
* G: Galaxy, spiral
* H: Galaxy, spherical
* A: Cluster of galaxies
* N: Nebula, bright
* F: Nebula, diffuse
* K: Nebula, dark
* P: Nebula, planetary
* Q: Quasar
* T: stellar object
* B: Star, binary
* D: Star, double
* M: Star, multiple
* S: Star
* V: Star, variable
*
* if "Class" is one of T, B, D, S or V, the spectral class and possibly the
* numerical subclass designation may follow in the next subfield:
* O, B, A, F, G, K, M, N, C, S
*
* Following Type, the remaining fields are defined as follows:
*
* elliptical format (e < 1):
* i = inclination, degrees
* O = longitude of ascending node, degrees
* o = argument of perihelion, degrees
* a = mean distance (aka semi-major axis), AU
* n = mean daily motion, degrees per day (computed from a**3/2 if omitted)
* e = eccentricity,
* M = mean anomaly (ie, degrees from perihelion),
* E = epoch date (ie, time of M),
* D = the equinox year (ie, time of i/O/o).
* g/k or H/G = magnitude model; select which by preceding the first field
* with either a "g" or an "H"; H/G is the default if neither is given.
* s = angular size at 1 AU, arc seconds, optional
*
* hyperbolic format (e > 1):
* T = epoch of perihelion
* i = inclination, degrees
* O = longitude of ascending node, degrees
* o = argument of perihelion, degrees
* e = eccentricity,
* q = perihelion distance, AU
* D = the equinox year (ie, time of i/O/o).
* g/k = magnitude model
* s = angular size at 1 AU, arc seconds, optional
*
* parabolic format (e == 1):
* T = epoch of perihelion
* i = inclination, degrees
* o = argument of perihelion, degrees
* q = perihelion distance, AU
* O = longitude of ascending node, degrees
* D = the equinox year (ie, time of i/O/o).
* g/k = magnitude model
* s = angular size at 1 AU, arc seconds, optional
*
* fixed format:
* ra, hours
* dec, degrees
* magnitude
* reference epoch
* s = angular size, arc minutes, optional
Copyright © 2002, Dr. Angela B. Shiflet
All rights reserved