CGI Perl Programs & Forms
The following are links to CGI references:
- NCSA's "The Common Gateway Interface"
- "CGI: The Common Gateway Interface for Server-side Processing"
- Yahoo's "Common Gateway Interface (CGI)"
Introduction
A plain HTML document that produces a web page is static; it does not change in response to, say, a database query. A Common Gateway Interface (CGI) program, by contrast, is dynamic and executes in real time. A CGI program can take a request for information from a database, which the user submits with an HTML form, query the database, perform any required processing, and return the results as a web page or a plain text document. The user visits our web site, and, as necessary, the CGI program runs on our machine. The CGI program can interface with both our web page and our database as well as perform additional processing.
Because the program runs on our computer, we must be conscious of security. Usually, a CGI program must be in the /cgi-bin directory, and the webmaster controls this directory, allowing only authorized programs to reside in this area.
For a CGI program, we can use any programming language that produces an executable file. C, C++, FORTRAN, Perl, Python, UNIX shell scripts, Visual Basic, AppleScript, and Tcl are common choices. We use Perl because of its wide use in scientific databases, such as genomic databases.
This module discusses how a Perl CGI program interfaces with HTML forms, and the next module, "CGI Programs and Databases", covers how a Perl CGI program accesses MySQL databases.
Communication Methods
As Figure 1 in the module "Web Forms for Database Queries" shows, CGI is the interface for passing information between the web server and the CGI program, which is in Perl or another programming language. CGI accomplishes the communication through four methods:
- Environment variables
- Command line (This method is not used very much and will not be discussed.)
- Standard input
- Standard output
The web server presents the user with a form that he or she completes and submits. Using standard input and environment variables, the server sends the information from the form to the CGI program. The program transmits SQL requests to the database, and the database returns the appropriate information, which the program processes and communicates to the web server through standard output.
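The flow above can be sketched as a very small Perl program that reads two environment variables and writes a reply to standard output. This is only an illustrative sketch: the sub name build_response is hypothetical, and passing the environment as a hash reference is a convenience assumed here so that the sub can be tried outside a web server.

```perl
use strict;
use warnings;

# build_response (hypothetical name) builds the complete reply text from a
# hash of environment variables.
sub build_response {
    my ($env) = @_;
    my $method = defined $env->{REQUEST_METHOD} ? $env->{REQUEST_METHOD} : 'unknown';
    my $query  = defined $env->{QUERY_STRING}   ? $env->{QUERY_STRING}   : '';
    # Content-type line, blank line, then the body of the reply.
    return "content-type: text/plain\n\n"
         . "Request method: $method\n"
         . "Query string: $query\n";
}

# Under a web server, %ENV holds the CGI environment variables, and whatever
# the program prints to standard output goes back to the browser.
print build_response(\%ENV);
```

Run from a command line rather than a web server, the program reports the method as unknown because the server has not set the variables.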
Quick Review Question #1
Select the method(s) that CGI uses for communication.
CGI Output
For communication from a CGI program to the user, such as after accessing the database and performing any processing, the program writes its results, labeled with a MIME (Multipurpose Internet Mail Extensions) content type, to standard output, and the web server returns this output to the browser. On the web page "Atomic Weights and Isotopic Compositions of the Elements with Relative Atomic Masses," the user can specify the output to be an HTML Table, a Pre-formatted ASCII Table, or Linearized ASCII Output. The latter two choices indicate an ASCII file, or text file, that is not a web page. In this case, the first line of output contains the content-type descriptor indicating plain text, as follows:
content-type: text/plain
Regardless of the content type, the second output line must be blank, so we end the content-type line with two newline characters, "\n\n". Suppose in a Perl CGI program the variables $formula, $MolecularWt, $RegistryNum, and $ChemStruct have appropriate values from the database. The following code segment produces a text file displaying the data:
print "content-type: text/plain\n\n";
print "Benzene\n\n";
print " * Formula: $formula\n";
print " * Molecular Weight: $MolecularWt\n";
print " * CAS Registry Number: $RegistryNum\n";
print " * Chemical Structure: [$ChemStruct]\n";
The text file that the browser displays is similar to the following:
Benzene
* Formula: C6H6
* Molecular Weight: 78.11
* CAS Registry Number: 71-43-2
* Chemical Structure: [C6H6]
For a web page, the first line of output contains a MIME content-type descriptor indicating an HTML document, as follows:
content-type: text/html
In the Perl program, we are careful to display a blank line after this output by using two newline characters, "\n\n". Using the print function, we write HTML code to the standard output stream. Except for the content-type line, use of "\n" is optional but avoids having all of the HTML code appear on one line. For the web page to have a paragraph or line break, we write the tag <p> or <br> to the output, as on the third line from the bottom of the following segment:
print "content-type: text/html \n\n";
print "<html><head>\n";
print "<title>Benzene Search Results</title>\n";
print "</head><body>\n";
print "<h1>Benzene</h1>\n";
print "<ul>\n";
print "<li>Formula: $formula\n";
print "<li>Molecular Weight: $MolecularWt\n";
print "<li>CAS Registry Number: $RegistryNum\n";
print "<li>Chemical Structure: [$ChemStruct]\n";
print "</ul>\n";
print "<center>If you have comments or questions,<br>\n";
print "please contact us.</center>\n";
print "</body></html>\n";
The segment generates the following output:
content-type: text/html

<html><head>
<title>Benzene Search Results</title>
</head><body>
<h1>Benzene</h1>
<ul>
<li>Formula: C6H6
<li>Molecular Weight: 78.11
<li>CAS Registry Number: 71-43-2
<li>Chemical Structure: [C6H6]
</ul>
<center>If you have comments or questions,<br>
please contact us.</center>
</body></html>
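As an alternative to a long series of print statements, the same kind of page can be written with a Perl here-document, which keeps the HTML readable as a block. The following is a minimal sketch; the sub name results_page and its parameter list are hypothetical, and the arguments stand in for values that would come from the database.

```perl
use strict;
use warnings;

# results_page (hypothetical name) returns the content-type line, the blank
# line, and the HTML for one compound.
sub results_page {
    my ($name, $formula, $MolecularWt, $RegistryNum, $ChemStruct) = @_;
    # Everything up to the END_HTML line is treated as one double-quoted
    # string, so the variables are interpolated.
    return <<"END_HTML";
content-type: text/html

<html><head>
<title>$name Search Results</title>
</head><body>
<h1>$name</h1>
<ul>
<li>Formula: $formula
<li>Molecular Weight: $MolecularWt
<li>CAS Registry Number: $RegistryNum
<li>Chemical Structure: [$ChemStruct]
</ul>
<center>If you have comments or questions,<br>
please contact us.</center>
</body></html>
END_HTML
}

print results_page("Benzene", "C6H6", "78.11", "71-43-2", "C6H6");
```

The blank line after the content-type line is written literally inside the here-document, so no "\n\n" is needed.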
CGI Input
We employ forms to enter data or query a database via the web. A CGI program must obtain this data before accessing the database. For example, using the page "Atomic Weights and Isotopic Compositions of the Elements with Relative Atomic Masses," suppose we type "Li" for the atomic symbol of lithium, choose "HTML Table," and click "Get Data". (The page "Atomic Weights and Isotopic Compositions of the Elements with Relative Atomic Masses" is derived from one at http://physics.nist.gov/PhysRefData/Compositions/index.html.) With the form method get, the URL for the result is as follows:
http://physics.nist.gov/cgi-bin/Compositions/stand_alone.pl?ele=Li&ascii=html
After the question mark (?), the CGI query string contains a list of name-value pairs. For example, in the "Atomic Symbol or Number" text box we typed "Li", and the HTML code reveals that the name of this text box is ele. Thus, the URL string contains "ele=Li". Similarly, the string also contains "ascii=html" because we selected the button with value "html" from the ascii radio button group. An ampersand (&) separates these name-value pairs.
If the form method for submission were post instead of get, the question mark and CGI query string would not appear in the URL, thus simplifying the URL and avoiding URL length restrictions from the browser. With post, a second step occurs in which, invisible to the user, the server passes the query string to the CGI program through standard input.
A query string, such as "ele=Li&ascii=html", is encoded in standard URL format, in which a blank is replaced by a plus (+) and a non-alphanumeric special character, such as the slash (/), is replaced by a percent sign (%) and its ASCII code in hexadecimal (base 16) representation, such as %2F. Thus, the string "val = x + 85.2", with blanks around the equals sign and plus sign, is encoded as "val+%3D+x+%2B+85.2". In the encoded string, four pluses replace the blanks; 3D is the hexadecimal ASCII code for '='; and 2B is the code for '+'. The character-by-character encoding is as follows:
String | v | a | l | (blank) | = | (blank) | x | (blank) | + | (blank) | 8 | 5 | . | 2
Encoded string | v | a | l | + | %3D | + | x | + | %2B | + | 8 | 5 | . | 2
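To make the encoding rule concrete, the following Perl sketch encodes a string in the way the table shows. The sub name url_encode is hypothetical, and the set of characters left unencoded (letters, digits, hyphen, underscore, and period) is an assumption chosen to match the example; a browser performs this step for us when it submits a form.

```perl
use strict;
use warnings;

# url_encode (hypothetical name) encodes a string of single-byte ASCII text.
sub url_encode {
    my ($text) = @_;
    # Replace every character other than a letter, digit, '-', '_', '.', or
    # blank with a percent sign and its two-digit hexadecimal ASCII code.
    $text =~ s/([^A-Za-z0-9\-_. ])/sprintf("%%%02X", ord($1))/eg;
    # Replace each blank with a plus.
    $text =~ s/ /+/g;
    return $text;
}

print url_encode("val = x + 85.2"), "\n";   # prints val+%3D+x+%2B+85.2
```

The percent codes are produced before the blanks become pluses, so that the pluses standing for blanks are not themselves encoded as %2B.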
The CGI program must decode the encoded string before processing further. Fortunately, libraries exist for performing this task in several languages. However, for generality and to illustrate the process, we use Perl to decode the query string.
Process a Query String
Suppose the variable $posted_information stores a query string. The following lines of Perl decode the query string from its URL-encoded form back into ASCII:
$posted_information =~ s/\+/ /g;
$posted_information =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack("C", hex($1))/eg;
The first line of code replaces each + sign with a space. The second line looks for a percent sign followed by two hexadecimal digits. Any encoded characters that are found are decoded using the pack function. The "C" argument indicates that the value should be converted to the ASCII character with that code. The plus signs must be replaced first; otherwise, a literal plus sign that arrives encoded as %2B would be decoded to + and then wrongly turned into a space.
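As a check on the decoding step, the following sketch wraps the two substitutions in a sub and applies them to the encoded string from the previous section. The sub name url_decode is hypothetical. The sketch replaces pluses before decoding percent codes, so that a literal plus encoded as %2B survives.

```perl
use strict;
use warnings;

# url_decode (hypothetical name) reverses the URL encoding of a string.
sub url_decode {
    my ($text) = @_;
    # Turn pluses into blanks first, so that a plus produced by decoding %2B
    # in the next step is not mistaken for an encoded blank.
    $text =~ s/\+/ /g;
    # Replace each %XX with the character whose code is hexadecimal XX.
    $text =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack("C", hex($1))/eg;
    return $text;
}

print url_decode("val+%3D+x+%2B+85.2"), "\n";   # prints val = x + 85.2
```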
Now that the query string is decoded, we use the split function to divide key-value pairs. The following line breaks up each &-separated key-value pair and puts each component into an array, called @fields:
@fields = split(/&/, $posted_information);
The first argument, which is between forward slashes, is the pattern that separates the values in the string. For instance, an argument of /:/ would split the string at colons. The @fields variable can be indexed like an array in C++, where $fields[0] is the first value in the array, $fields[1] the second, and so forth. Note that when accessing the individual elements of an array, the variable name is preceded by a dollar sign ($).
Suppose that the variable $posted_information contains an email address and a telephone number submitted from a form, so that $posted_information looks as follows after decoding from hexadecimal:
$posted_information = 'email=name@yahoo.com&number=123-4567';
After calling the function split as above, $fields[0] contains the value "email=name@yahoo.com" and $fields[1] has "number=123-4567". By calling split again, we obtain the desired information.
($label, $email_address) = split(/=/, $fields[0]);
($label, $phone_number) = split(/=/, $fields[1]);
For this example, we separate values by an equals sign.
The function split returns an array, or list. Sometimes we know the number of fields split returns. For example, with email=name@yahoo.com, the function returns two fields. In this case, instead of using an arbitrary array name, we can give our own list, and Perl will assign values to each individual variable based on what split returns. Consequently, after execution of the above code, $email_address contains the value "name@yahoo.com" and $phone_number contains the value "123-4567".
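When a form has more than a few fields, it can be convenient to collect all of the name-value pairs in an associative array (hash) rather than in separate variables. The following is a minimal sketch under that assumption; the sub name parse_query is hypothetical. It also splits before decoding, which is a design choice of this sketch: a value that itself contains an encoded & or = is then not mistaken for a separator.

```perl
use strict;
use warnings;

# parse_query (hypothetical name) returns a hash that maps each field name
# to its decoded value.
sub parse_query {
    my ($query) = @_;
    my %form;
    foreach my $pair (split(/&/, $query)) {
        next unless length $pair;            # skip empty pairs such as "a&&b"
        # The limit of 2 keeps any further '=' as part of the value.
        my ($name, $value) = split(/=/, $pair, 2);
        $value = '' unless defined $value;
        # Decode the name and the value separately, after splitting.
        foreach ($name, $value) {
            s/\+/ /g;
            s/%([\dA-Fa-f][\dA-Fa-f])/pack("C", hex($1))/eg;
        }
        $form{$name} = $value;
    }
    return %form;
}

my %form = parse_query('email=name%40yahoo.com&number=123-4567');
print "$form{email}\n";    # prints name@yahoo.com
print "$form{number}\n";   # prints 123-4567
```

If a field name occurs more than once in the query string, this sketch keeps only the last value.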
Environment Variables
The server sets seventeen environment variables that a CGI program can access. When a CGI program is called, the environment variables are available to the program. For example, for the method get, the environment variable QUERY_STRING has as its value the query string, or the string after the question mark. The value of QUERY_STRING is "ele=Li&ascii=html" in the following example:
http://physics.nist.gov/cgi-bin/Compositions/stand_alone.pl?ele=Li&ascii=html
With the associative array %ENV, we can determine the character string value of any of the environment variables. For example, to obtain the query string when the method is get, we use $ENV with the index "QUERY_STRING", as follows in Perl:
$QueryString = $ENV{"QUERY_STRING"};
We can determine the query method using the index "REQUEST_METHOD", as follows:
$RequestMethod = $ENV{"REQUEST_METHOD"};
With a post request, the server sends the form's data to the CGI program's standard input but does not necessarily send an end-of-file marker, a special symbol indicating the end of the file. Thus, to process the correct amount of input, we use the value of the environment variable CONTENT_LENGTH, which contains the length of the query string. The following statement obtains the value of this environment variable, which is a character string:
$ContentLength = $ENV{"CONTENT_LENGTH"};
Knowing that the request method is post and the length of the query string, we can read the characters into a string variable $QueryString, as follows:
read(STDIN, $QueryString, $ContentLength);
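The pieces of this section can be combined so that one program handles both request methods. The following is a minimal sketch, not a complete solution: the sub name get_form_data is hypothetical, and it takes the environment hash and the input filehandle as parameters (an assumption made here) so that it can be tried outside a web server.

```perl
use strict;
use warnings;

# get_form_data (hypothetical name) returns the still-encoded query string
# for either request method.
sub get_form_data {
    my ($env, $in) = @_;
    my $method = uc(defined $env->{REQUEST_METHOD} ? $env->{REQUEST_METHOD} : 'GET');
    if ($method eq 'POST') {
        # Read exactly CONTENT_LENGTH characters; do not wait for end of file.
        my $length = $env->{CONTENT_LENGTH} || 0;
        my $data = '';
        read($in, $data, $length) if $length > 0;
        return $data;
    }
    # For get, the query string arrives in an environment variable.
    return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : '';
}

my $QueryString = get_form_data(\%ENV, \*STDIN);
```

The returned string still needs to be decoded and split into name-value pairs as described under "Process a Query String."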
Exercises
1. a. Write a Perl segment to generate a plain text output with a greeting to the user.
b. Repeat Part a, generating HTML code output.
Projects
1. Create a web page with a text box for the user's name and two radio buttons indicating plain text or HTML output. Develop a Perl script to generate a plain text output file or a web page, depending on the selected radio button. Each output file should contain a greeting to the user.
2. Create a web page that enables the user to type a binary or hexadecimal number in a text box and to have a Perl script return the corresponding decimal number. Have a pair of radio buttons to indicate whether the original number is in base 2 or 16.
3. Create a web page with a Perl script to perform a temperature conversion between the Fahrenheit and Celsius systems. Enable the user to type a number and to indicate the kind of conversion to perform. Have the answer also appear on a web page. The following formulas convert a temperature in Fahrenheit (F) to its equivalent in Celsius (C) and vice versa:
C = (5/9)(F - 32)
F = (9/5)C + 32
4. The following is a MySQL statement to create the table ecs_spectra, which you can also download here:
create table ecs_spectra (
spec varchar(10),
wavelength varchar(10),
Rel_int varchar(10),
Aki float,
Acc char(2),
Ei varchar(10),
Ej varchar(10),
Configurations varchar(15),
Terms char(6),
Ji char(3),
Jk char(3),
Gi integer,
Gk integer,
Type char(2),
Tp_refs varchar(10),
Line_refs varchar(10)
);
Table 1 of "Accessing Atomic Spectra Database Assignment" from Project 1 of "Accessing Databases with SQL" explains the meanings of the fields. The data type varchar(n) is the type of a variable length string of at most n characters, while char(n) is the type of a string of exactly n characters. The structure of the table ecs_spectra was derived by Dr. Orlando Karam from the NIST Atomic Spectra Database.
Create a web page with an HTML form to allow the user to specify values of certain fields and to indicate the desired data. Develop a Perl script to access the web page and generate a text document thanking the user and displaying the information from the request. (After the module on "CGI Programs and Databases," we can return the desired data to the user.)
5. The following is a MySQL statement to create the table ecs_sf_sites, which you can also download here:
create table ecs_sf_sites (
id char(12) primary key,
site_name varchar(255),
street_addr varchar(255),
city varchar(255),
state varchar(255),
zip varchar(255),
county varchar(255),
site_smsa varchar(255),
fed_facil char(1),
npl_stat char(1),
corp_link varchar(255),
rod_link varchar(255),
latitude float,
longitude float,
ownership varchar(255),
site_incident varchar(255)
);
The data type varchar(n) is the type of a variable length string of at most n characters, while char(n) is the type of a string of exactly n characters. The structure of the table ecs_sf_sites was derived by Dr. Orlando Karam from the EPA's Superfund (CERCLIS) Database.
Create a web page with an HTML form to allow the user to specify values of certain fields and to indicate the desired data. Develop a Perl script to access the web page and generate a text document thanking the user and displaying the information from the request. (After the module on "CGI Programs and Databases," we can return the desired data to the user.)
6. The following are MySQL statements to create the tables FixedType and ClassCodeRef, which you can also download here:
create table FixedType (
constellation char(4) not null,
ObjectName varchar(30) not null,
ClassCode enum('D', 'S', 'V'),
SpectralClass char(2),
hours int,
minutes int,
TimeSec int,
degrees int,
AngleSec int,
magnitude float,
primary key(constellation, ObjectName)
);
create table ClassCodeRef (
ClassCode enum('D', 'S', 'V') not null primary key,
ClassCodeName varchar(255)
);
Table 1 of "Accessing Star Database Assignment" from Project 3 of "Accessing Databases with SQL" explains the meanings of the fields. The data type varchar(n) is the type of a variable length string of at most n characters, while char(n) is the type of a string of exactly n characters. (The structure was derived by Dr. Orlando Karam from star.dat by Dr. Dan Welch (see Project 5 of "Introduction to Databases").)
Create a web page with an HTML form to allow the user to specify values of certain fields and to indicate the desired data. Develop a Perl script to access the web page and generate a text document thanking the user and displaying the information from the request. (After the module on "CGI Programs and Databases," we can return the desired data to the user.)