To indicate which existing database we are we are using, we employ the use statement, such as follows:create database BioStudy;
use BioStudy;
# SQL to create a database for Physics
# laboratory experiments
Definition | An identifier is self-documenting if the name is descriptive of the item's role. |
Quick Review Question 1
Which are the following are not "legal" SQL identifiers?
|
create table CurrentBioStudy ( subject char(10) not null, RhFactor enum('+', '-'), BloodType enum('M', 'N', 'MN'), primary key(subject) ); Alternatively, we can place primary key on the same line as subject, as follows:
create table CurrentBioStudy ( subject char(10) not null primary key, RhFactor enum('+', '-'), BloodType enum('M', 'N', 'MN') ); The name for the new table (CurrentBioStudy) appears after create table. A list of the column names and data types is in parentheses with commas separating the names. The modifier not null indicates that null cannot be a value in the column subject. After the list of column names and types or on the line with subject's declaration, we indicate that this attribute is the primary key. The data type char(10) allows for character strings of exactly 10 characters, padded for blanks as necessary. RhFactor and BloodType are both of enumeration types, and their possible values appear in apostrophes.
Frequently, attributes have numeric types. In a chemical database, attribute FreezingPoint has floating point type float or real, while EventNumber might have integer type int. The modifier unsigned, as in unsigned int, indicates that the value cannot be negative, such as for a counter. Scientific data can be very large or small. For such floating point numbers, we use the type double.
The EventTime attribute for a psychology schema probably would be of type time and would have standard time values. DiscoveryDate would be an attribute of type date, the type for a standard date value.
Attribute subject above is of type char(10), a fixed length string of 10 characters. If the value has less than 10 characters, such as "SubjectA", the computer pads the string with blanks on the end to be exactly 10 characters. Values of greater length are truncated or chopped off to exactly 10 characters.
The type of a variable length string of at most 100 characters is text(100) or varchar(100). Thus, the value "SubjectA" occupies space for only 8 characters. For text(n) the maximum string length n is at most 65,535, while for varchar, n is at most 255. When we know the maximum length of a string, as in the case of subject, we should use char(n) because database searching is faster on fixed length as opposed to variable length strings. Thus, char is used more often than text.
When sorting on a column of type char, text, or varchar, the database ignores case and sorts alphabetically as in a dictionary. Thus, string "anthracene," starting with a lowercase letter, sorts before "Benzene," starting with a uppercase letter, even though uppercase letters occur before lowercase ones in the ASCII encoding scheme. Sorting the strings in C++ using regular ASCII string comparisons, the order would be reversed.
Table 1 summarizes some of the most frequently used data
types.
Table 1. Some SQL data types
Data Type | Meaning |
char(n) | Fixed-length string of n characters, padded with blanks as necessary; n ² 255 |
date | Date; output format YYYY-MM-DD; input format YY-MM-DD, YYYY-MM-DD, YYMMDD, or YYYYMMDD |
double | Double floating point; approximate range ±10308; smallest positive value about 10-308 |
enum | Enumeration; list of values in parentheses, each value in apostrophes |
float or real | Floating point; approximate range ±1038; smallest positive value about 10-38 |
int | Integer; approximate range ±2 billion |
money | Money |
text(n) | Variable-length string of at most n characters; n ² 65,535 |
time | Time; output format HH:MM:SS; input format HH:MM:SS, HHMMSS, HHMM, or HH |
varchar(n) | Variable-length string of at most n characters; n ² 255 |
year | Year; output format YYYY or YY; input format YYYY or YY; YY indicates years between 1970 and 2069 |
create table CurrentBioStudy (
To save space, we should remove tables (drop table) that we no longer use. To drop a table CurrentBioStudy, we use the following statement
drop table CurrentBioStudy;With this statement, all references to the table elsewhere, such as in constraints and foreign keys, are dropped.
|
Quick Review Question 2 Suppose
we want to write a statement to create a table LabSection that has
the following schema: Answer the following questions concerning this task: |
|
||
|
Within a database, we get a listing of the tables, as follows:show databases;
To display the relations in database PsyDB that we are not using, we issue the statementshow tables;
show tables from PsyDB;
The following statements display table LabSection's columns with their data types and the table's keys: show columns from LabSection;
show keys from LabSection;
|
Quick Review Question 4 Type
your answers on one line with one blank between items, using lowercase for
keywords. For each part, write a statement to display
|
alter table CurrentBioStudy change RhFactor RhFactor enum('+', '-') not null;
The last line includes a complete redeclaration of the column. RhFactor appears twice because we have the option of changing the column's name, say to factor, as follows:
To add the column temperature with type float to table PsyExp, we perform the add action, as follows:alter table CurrentBioStudy change RhFactor factor enum('+', '-') not null;
alter table PsyExp add temperature float;
To give the default value of 1 to column EventCode of table PsyExp, we alter the column with a set default, as follows:table PsyExp alter EventCode set default 1;
We can remove the column, temperature, with the drop action, as the following illustrates:
alter table PsyExp drop temperature;
Similarly, we can drop the default value for column EventCode of table PsyExp, as follows:
alter table PsyExp alter EventCode drop default;To rename table PsyExp as PsychologyExp, we write:
alter table PsyExp rename as PsychologyExp;
|
Quick Review Question 5 In the
following questions, use lowercase, except for names, and only one blank
between words. Complete each of the following statements.
|
insert into CurrentBioStudy(subject, RhFactor, BloodType values ('SubjectA','+', 'M');
Alternatively, using a set clause, we can place the values with the column names in any order, as follows:
insert into CurrentBioStudy set RhFactor = '+',BloodType = 'M', subject = 'SubjectA';
In MySQL, if we have a value for every column, we can omit the column names:
insert into CurrentBioStudy values ('SubjectB', '-', 'M');
For an apostrophe to be part of the constant, we employ \' to avoid terminating the string. For example, "Smith's lab" in the values clause is written as 'Smith\'s lab'. To store a backslash, we must write two backslashes, as in 'C:\\main' to represent "C:\main".
If we do not specify a value, the database stores null in that field unless the field has a default value. Also, we can use the constant null as a value, such as follows:
insert into CurrentBioStudy(subject, RhFactor, BloodType) values ('SubjectC', null,'M');
MySQL allows us to insert several tuples at once, as the following illustrates:
insert into CurrentBioStudy values ('SubjectD', null, 'M'), ('SubjectE', '+','MN'), ('SubjectF', '-','N');
|
Quick Review Question 6 Suppose
into LabSection from Quick Review Question
2 we want to insert a row that has the following data:
Phys201, 2004, F, A, 3 |
To load lab.dat from the current local directory, we can issue the following command using local:load data infile '/home/johnson/db/lab.dat' into table LabSection;
load data infile local lab.dat into table LabSection;
If we have a lot of data to add to a table, loading data from a file with the
load statement can be much faster than using the insert statement.
To use the auto_increment feature, we use null or 0 as the value for TeamNumber, as follows:create table team ( TeamNumber int not null primary key auto_increment, TeamGrade float );
If this insertion is the first one, the computer makes the TeamNumber value 1. On subsequent insertions, the value is one more than the largest TeamNumber.insert into team values (null, 86.5);
We use the where clause to indicate particular row(s). Although optional, the where clause occurs frequently in update statements.update CurrentBioStudy set RhFactor = '-', BloodType = 'MN' where subject = 'SubjectA';
Example 1 Suppose
table ChemExperiment of a chemical database has the following schema,
ChemExperiments_schema, from Example
1 of the "Relational Database" module: ChemExperiments_schema
The following statement creates the table: Suppose in Experiment 37, we dissolve a 13.5-gram sample of a nonelectrolyte in 100 grams of benzene, and the solution freezes at 2.95°C. To insert this information into the table, we type the following:create table ChemExperiment ( ExpNumber int not null primary key, SoluteMass float, SolventName char(30), SolventMass float, FP float ); If later we want to change every occurrence of "benzene" in the table to start with a capital letter, we issue an update command, as follows:insert into ChemExperiment values (37, 13.5, benzene, 100, 2.95); If the update is successful, MySQL issues a message that resembles the two lines below. One row was affected, matched, and changed.update ChemExperiment set SolventName = 'Benzene' where SolventName = 'benzene'; If the syntax of the update statement was correct, but the system found no rows that matched the where clause, a message like the following would appear:Query OK, 1 row affected (0.02 sec) Rows matched: 1 Changed: 1 Warnings: 0 If we know the row for which we are searching is in the database, we probably mistyped a data item, such as using Phys203 instead of Phys201.Query OK, 0 row affected (0.01 sec) Rows matched: 0 Changed: 0 Warnings: 0 If a statement has an SQL syntax error, MySQL issues an error message and gives the approximate location of the mistake. To update the quantity of benzene to be 100 grams in every row in which the amount is greater than 100, we use the relational operator greater than (>) and the logical operator && (or and), as follows: update ChemExperiment set SolventMass = 100 where SolventName = 'benzene' && SolventMass > 100; |
|
Quick Review Question 7 Suppose
we want to update the LabSection table row above, (Phys201, 2004,
F, A, 3), to have a DragExpWeek of 2. Type your answers on one line
with one blank between items, using lowercase for keywords.
|
delete from ChemExperiment where ExpNumber = 37;
If we wish to delete every row that has a solute mass less than 1 gram, we write
delete from ChemExperiment where SoluteMass < 1;
We must be very careful of such massive changes that could change a number of rows. It is advisable to make a backup of the database before issuing such commands. Later in this module, we discuss how to make backups. An even more dangerous delete statement does not have a where clause, resulting in removal of every row from the table.
# CAUTION: Removes every row from the table delete from ChemExperiment;
The requested field value (+) appears with the name of the field, such as follows:select RhFactor from CurrentBioStudy where subject = 'SubjectA';
To see the entire record, we use the wild card * to designate all columns, as follows:+----------+ | RhFactor | +----------+ | + | +----------+
The column names with values for SubjectA appear, as follows:select * from CurrentBioStudy where subject = 'SubjectA';
If we wish to see all the data in the table, we use * and omit the where clause:+----------+----------+-----------+ | subject | RhFactor | BloodType | +----------+----------+-----------+ | SubjectA | + | M | +----------+----------+-----------+
The output, which is unordered, might appear as follows:select * from CurrentBioStudy;
To alphabetize the results, we include the order by clause. For example, suppose we want to have a listing of the last names, first names, ids, and grades in the student table of Figure 4 of the "Database Development" module, such as follows:+----------+----------+-----------+ | subject | RhFactor | BloodType | +----------+----------+-----------+ | SubjectA | + | M | | SubjectB | + | N | | SubjectD | - | M | | SubjectE | + | | | SubjectC | - | | +----------+----------+-----------+
+-----------+-----------+-----------+--------------+ | StudentId | LastName | FirstName | StudentGrade | +-----------+-----------+-----------+--------------+ | 107 | Davis | George | A | | 102 | Bolder | Anne | A | | 103 | Donnovan | Patrick | C | | 104 | Davis | Kyle | D | | 105 | d'Antonio | Mary | B | | 106 | Iseman | Bart | F | | 101 | Davis | George | C | +-----------+-----------+-----------+--------------+
We want to alphabetize by the last name. However, if several students have the same last name, we want to alphabetize by their first names. If even they agree, we have the ids appear in sorted order. Although we are displaying all fields, because the output should appear in a different order, we specify the columns, as follows:
Execution of this select statement returns a table, such as the below. "George Davis" with student id 101 occurs before "George Davis" with id 107, and "Kyle Davis" appears after "George Davis".select LastName, FirstName, StudentId, StudentGrade from student order by LastName, FirstName, StudentId;
+-----------+-----------+-----------+--------------+ | LastName | FirstName | StudentId | StudentGrade | +-----------+-----------+-----------+--------------+ | Bolder | Anne | 102 | A | | d'Antonio | Mary | 105 | B | | Davis | George | 101 | A | | Davis | George | 107 | C | | Davis | Kyle | 104 | D | | Donnovan | Patrick | 103 | C | | Iseman | Bart | 106 | F | +-----------+-----------+-----------+--------------+
Example 2 In
the "Relational Database" module, we use the notation (yyy link) sKnownCompound
= Benzene(ChemKnowns) for selection of the tuple from the
(yyy link) ChemKnowns relation that has a KnownCompound
value of "Benzene". In SQL, we write
The select statement also handles the (yyy link) projection operation pKnownCompound(ChemKnowns) to obtain a list of the compounds in ChemKnowns, as follows:select * from ChemKnowns where KnownCompound = 'Benzene'; For a new table, SolventFP, of the solvent names and their freezing points, we use the notationselect KnownCompound from ChemKnowns; In SQL, we create SolventFP and insert the data into it by employing insert and select, as follows:SolventFP ¨ pSolventName,FreezingPoint(ChemSolvents) The number of columns and their types match between the SolventFP table and the projected ChemSolvents table.create table SolventFP ( SolventName char(30) not null primary key, FreezingPoint float ); insert into SolventFP (SolventName, FreezingPoint) select (SolventName, FreezingPoint) from ChemSolvents; |
|
Quick Review Question 8 Consider
table LabSection from Quick Review
Question 2. Type your answers on one line with one blank between
items, using lowercase for keywords.
|
ChemKnowns_schema
KnownCompound MolecularWeight
ChemSolvents_schema
SolventName FreezingPoint Kf
ChemExperiments_schemaSuppose relations ChemKnown, ChemSolvent, and ChemExperiment, respectively, have these schemas. To obtain a listing of solvents with freezing points from ChemSolvent and molecular weights from ChemKnown, we form the natural join, ChemSolvent *SolventName = KnownCompound ChemKnown. In SQL, we employ the select statement, listing both tables and the values that must agree, as follows:
ExpNumber SoluteMass SolventName SolventMass FP
For the data of Example 1 in the "Relational Database" module, only Benzene and p-Dichlorobenzene appear in both tables. Thus, the output is as follows:select SolventName, FreezingPoint, MolecularWeight from ChemSolvent, ChemKnown where SolventName = KnownCompound;
To obtain the set of tuples where each experiment has data from ChemExperiment matched with the data about its solvent from ChemSolvent, we again equijoin on the solvent name. However, the column name is SolventName in both tables. To avoid ambiguity, we employ the dot operator as in C++, designating the tables and column name, ChemExperiment.SolventName and ChemSolvent.SolventName. Thus, to obtain the solute mass, solution freezing point (FP), solvent mass, and solvent constant Kf for Experiment 35, we perform the join as follows:+-------------------+---------------+-----------------+ | SolventName | FreezingPoint | MolecularWeight | +-------------------+---------------+-----------------+ | Benzene | 5.50 | 78.11 | | p-Dichlorobenzene | 53.00 | 147.00 | +-------------------+---------------+-----------------+
select SoluteMass, FP, SolventMass, Kf from ChemExperiment, ChemSolvent where ExpNumber = 35 and ChemExperiment.SolventName = ChemSolvent.SolventName;
|
Quick Review Question 9 In Quick
Review Question 13 of the "Relational Database" module, we considered
relations session (ccc Charles, change time to EventTime
in that module) and EventCode of the following schemas, respectively:
PsyExp_schema = (EventNumber:integer, EventTime:float, SubjectID:integer) PsyEventCode_schema = (code:integer, event:string) EventNumber and code have the same domain. Type your answers on one line with one blank between items, using lowercase for keywords. |
Function | Meaning |
avg(s) | Average of values in s |
count(s) | Number of non-null values in s |
max(s) | Maximum value in s |
min(s) | Minimum value in s |
std(s) | Standard deviation of values in s |
sum(s) | Sum of values in s |
The database does not include null values for DragExpWeek in computation of the average. The results might be reported as follows:select avg(DragExpWeek) from LabSection where CatalogNumber = 'Phys201';
The aggregate function count counts only on the non-null values. To obtain a count of the number of tuples in the relation CurrentBioStudy, even those with null for some but not all fields, we have * as the argument of count, as follows:+------------------+ | avg(DragExpWeek) | +------------------+ | 3.7500 | +------------------+
For a count of the number of Rh+ subjects,select count(*) from CurrentBioStudy;
The modifier distinct appears before the column name to compute the function on only distinct values. Thus, to discover how many different solvents appear in the ChemExperiment relation, we write the following:select count(*) from CurrentBioStudy where RhFactor = '+';
select distinct count(SolventName) from ChemExperiment;
select BloodType, count(BloodType) from CurrentBioStudy group by BloodType;
Because of the amount of computation involved in executing
order or group, the clauses should be used sparingly. Their use
can significantly slow database access in a distributed environment. If many such
summaries are needed, retrieval of the data from the database with summary processing
by a computer program should be considered.
|
Quick Review Question 10 For
the LabSection table, complete
the statement to display each catalog number (CatalogNumber) and
the average week (DragExpWeek) in which the course performs the Coefficient
of Drag Experiment. Use lowercase and one blank between items. |
Function | Meaning |
abs(n) | Absolute value of n |
acos(n) | Arc cosine (inverse cosine of n) |
asin(n) | Arc sine (inverse sine of n) |
atan(n) | Arc tangent (inverse tangent of n) |
concat(s1, s2, ..., sn) | Concatenation of strings s1, s2, ..., sn |
conv(n, b1, b2) | Convert n from base b1 to base b2 |
ceiling(n) | Ceiling of n, smallest integer ≥ n |
cos(r) | Cosine of r, which is in radians |
cot(r) | Cotangent of r, which is in radians |
curdate() | Current date; number YYYYMMDD or string 'YYYY-MM-DD' |
curtime() | Current time; number HHMMSS or string 'HH:MM:SS' |
database() | Name of current database |
adddate(d, i) | Date d plus interval i |
subdate(d, i) | Date d minus interval i |
dayname(d) | Day of week for date d |
dayofmonth(d) | Day of month for date d |
degrees(r) | Degrees corresponding to radians r |
encrypt(s1, s2) | Password encrypt string s1 using salt s2 |
exp(p) | ep |
floor(n) | Floor of n, largest integer ≤ n |
format(n, d) | Format n with d decimal places |
hour(t) | Hour of time t |
if(b, v1, v2) | If b is true, v1, else v2 |
isnull(e) | 1 (true) if e is null, 0 (false) otherwise |
log(n) | Natural logarithm of n |
log10(n) | Common logarithm of n |
minute(t) | Minute of time t |
mod(n1, n2) | n1 % n2 |
month(d) | Number of month for date d |
monthname(d) | Name of month for date d |
now() | Current date and time |
password(s) | Password-encrypted string s |
pi() | Pi (π) |
pow(n, p) | np |
radians(d) | Radians corresponding to degrees d |
rand([seed]) | Random floating point number between 0 and 1; optional seed seed |
round(n [, d]) | n rounded to d decimal places or to an integer if no d |
second(t) | Seconds of time t |
sin(r) | Sine of r, which is in radians |
sqrt(n) | Square root of n |
tan(r) | Tangent of r, which is in radians |
truncate(n, d) | n truncated to d decimals |
user() | Current user |
week(d) | Week of year for date d |
year(d) | Year for date d |
select MushroomId, Pi() * Pow(radius, 2) from mushroom;
The area is computed individually for each value of radius.
An alternative to Pow(radius, 2) is to multiply, radius *
radius. As this example illustrates, we can use the arithmetic operators
+, -, *, /, and % as well as the SQL functions.
|
Quick Review Question 11 Complete
the select statement to display the estimated circumference (2πr)
of each mushroom cap. Use one blank between each operator and operand.
|
|
Quick Review Question 12 The
escape velocity of an object from a planet is ![]() |
We can also create an index at the same time we form a table, such as follows:create index BloodType_index on CurrentBioStudy (BloodType);
As the example illustrates, any indexed column must have the not null modifier to indicate that a value in such a field cannot be null.create table CurrentBioStudy ( subject char(10) not null primary key, RhFactor enum(+, -), BloodType enum(M, N, MN) not null, index BloodType_index (BloodType) );
MySQL generates an index on RhFactor and BloodType together and on RhFactor by itself.create index RhBloodType_index on CurrentBioStudy (RhFactor, BloodType);
Indices create a time-space trade-off. An index makes searches faster but consumes space. If we know that it is unlikely that we will perform further searching on an indexed attribute, we can drop the index to free the space and to eliminate the time involved in indexing new tuples.drop index RhBloodType_index;
|
Quick Review Question 13 The
primary key of LabSection from Quick
Review Question 2 is the collection of attributes CatalogNumber,
SecYear, semester, section. Check all indices that
are automatically created by declaring this collection a primary key: |
|
Quick Review Question 14 Suppose
we also frequently search on the DragExpWeek column of table LabSection.
Create an index for that column.
|
mysqldump -u root -p dbpass PsyDB > PsyDBBkup.sql
PsyDBBkup.sql, which begins as follows, contains SQL statements to recreate every table and its data:
# MySQL dump 5.13 # # Host: localhost Database: PsyDB #-------------------------------------------------------- # Server version 3.22.22 # # Table structure for table 'PsyEvent' # CREATE TABLE PsyEvent ( EventCode int(11) DEFAULT '0' NOT NULL, event char(30), PRIMARY KEY (EventCode) ); # # Dumping data for table 'PsyEvent' # INSERT INTO PsyEvent VALUES (1,'\'Left press\''); INSERT INTO PsyEvent VALUES (2,'\'Right press\''); INSERT INTO PsyEvent VALUES (3,'\'Turn on all panel lights\''); ... To backup the table PsyExp from that database, we give the database and table names after mysqldump, as follows:
To recreate the database or table, we can execute the commands in the file at the operating system prompt. If PsyDB exists, but the PsyExp no longer does, we can recreate the table and its data using redirection in UNIX, as follows:mysqldump -u root -p dbpass PsyDB PsyExp > PsyExp.sql
Because the backup files are in ASCII format, we can also search for data with which to perform a partial backup.mysql -u root -p dbpass PsyDB < PsyExp.sql
|
Quick Review Question 15 From
where do we use the mysqldump utility?
|
|
Quick Review Question 16
|
The directory PsyDir contains two files for each PsyDB table. One file having the name of the table with the extension .sql contains the SQL statement to create, but not to load, the table. Another file with the extension .txt contains the tab-deliminted data from the file. For example, the following file, PsyExp.sql, has the create statement for table PsyExp:mysqldump -u root -p dbpass -T PsyDir PsyDB
# MySQL dump 5.13 # # Host: localhost Database: PsyDB #-------------------------------------------------------- # Server version 3.22.22 # # Table structure for table 'PsyExp' # CREATE TABLE PsyExp ( EventTime float(10,2) DEFAULT '0.00' NOT NULL, EventCode int(11) DEFAULT '0' NOT NULL, ExpNumber int(11) DEFAULT '0' NOT NULL, PRIMARY KEY (EventTime) ); The file PsyExp.txt of tab-delimited data follows:
This file can be input data for another table, a different database, or a spreadsheet. In MySQL, we can use the load statement to insert the data into a table, as follows:1.22 2 3 2.71 5 3 3.02 1 3 5.66 2 3 5.91 5 3 7.55 2 3 8.00 10 3
The database administrator can also have the system generate an update log file by launching the mysql database server with the --log-update option.load data local infile 'PsyExp.txt' into table PsyExp;
1. Project 1 | 2. Project 2 |
3. Project 3 | 4. Project 4 |
Copyright © 2002, Dr. Angela B. Shiflet
All rights reserved