Section 4: Troubleshooting a CGI application

Since this subject is quite well covered by other documents, this FAQ has
relatively little to say. 

Eric Wienke has a page "Debugging CGI Scripts 101" at
http://www.liquidsilver.com/scripts/debug101.html

Tom Christiansen's "Idiot's guide to solving Perl/CGI problems" is a
slightly tongue-in-cheek list of common problems, and how to track
them down.  Much of what Tom covers is not specifically Perl, but
applies equally to CGI programming in other languages. 

Marc Hedlund's CGI FAQ and Thomas Boutell's WWW FAQ also
deal with this subject. 

See "Further Reading" below (if you don't already know where to find these
documents). 

4.1: Are there some interactive debugging tools and services available?

(1) Several CGI programming libraries offer powerful interactive
    debugging facilities.   These include:

	- for Perl, Lincoln Stein's CGI.pm
	(now part of the standard Perl distribution)

	- for Tcl, Don Libes' cgi.tcl
	http://expect.nist.gov/cgi.tcl

	- for C++, Nick Kew's CGI++
	http://www.webthing.com/cgiplusplus/

(2) Nathan Neulinger's cgiwrap is another package with debugging aids.
http://www.umr.edu/~cgiwrap/

(3) The "mod_cgi" Apache module (new with Apache 1.2) enables you to
capture script output and errors for diagnosis.

See also the next question.

[Table of Contents] [Index]

4.2: I'm having trouble with my headers. What can I do?

For simple cases, examining your response headers "by hand" may suffice:
(1) telnet to the host and port where the server is running - e.g.
        telnet www.myhost.com 80
(2) Enter an HTTP request.  The most useful for this purpose is usually HEAD, e.g.
        HEAD /index.html HTTP/1.0
        (optional HTTP headers)
        (followed by a blank line)
Now you'll get a full HTTP response header back.
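
The same check can be scripted.  What follows is a minimal Python sketch,
not part of any toolkit mentioned in this FAQ: it opens a plain TCP
connection, sends a HEAD request as in step (2), and returns the raw
response header.  The function names are invented for illustration.

```python
import socket


def build_head_request(host, path="/"):
    """Return the bytes of an HTTP/1.0 HEAD request for `path`."""
    lines = [
        f"HEAD {path} HTTP/1.0",
        f"Host: {host}",  # an optional header; virtual hosts usually need it
        "",               # the blank line that ends the request header
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_response_header(host, port=80, path="/", timeout=10.0):
    """Send a HEAD request and return the response header as text."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_head_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # with HTTP/1.0 the server closes the connection
                break
            chunks.append(data)
    raw = b"".join(chunks)
    # Keep only what precedes the first blank line of the response.
    header, _, _ = raw.partition(b"\r\n\r\n")
    return header.decode("iso-8859-1")
```

For example, fetch_response_header("www.myhost.com", 80, "/index.html")
performs the same exchange as the telnet session above (www.myhost.com
being a placeholder for your own server).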

For complex cases, such as sending a request with headers (as a browser
does) or POSTing a form, this author's free online diagnostic tool,
cg-eye, is included in the toolkits at
     http://www.htmlhelp.org/tools/
     http://www.webthing.com/valet/
	This combines an offline CGI "linter" with two online services:
	(a) Interactive mode permits you to formulate an HTTP request,
	which is then sent to your server.
	(b) Live mode submits your form, exactly as it gets it from your
	browser.
	In both cases, it will print a detailed report of the transaction,
	and optionally (if the CGI is producing an HTML page) validate it.

4.3: Why do I get Error 500 ("the script misbehaved", or "Internal Server Error")?

Your script must follow the CGI interface, which requires it to print:
(1) One or more Header lines.
(2) A blank line
(3) (optional, but strongly advised) a document body.

This error means it didn't.

The Header lines can include anything that's valid under HTTP, but must
normally include at least one of the three special CGI headers:
	Content-Type
	Location
	Status

Example (a very minimal HTML page via CGI)
Content-Type: text/html			<= Header
					<= Blank Line
<title>HelloWorld</title>Hello World	<= Document Body
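
As a sketch of how a script might produce exactly this response, here is
a minimal Python version.  Python is used only for illustration and the
function name is hypothetical; the same three parts apply in any language.

```python
import sys


def build_response():
    """Return a complete CGI response: header, blank line, body."""
    header = "Content-Type: text/html\n"             # (1) header line
    blank = "\n"                                     # (2) blank line
    body = "<title>HelloWorld</title>Hello World\n"  # (3) document body
    return header + blank + body


if __name__ == "__main__":
    sys.stdout.write(build_response())
    sys.stdout.flush()  # do not leave the header sitting in a buffer
```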

A common reason for a script to fail is that it crashed before printing
the header and blank line (or while these were still buffered).  Or it
may not have run at all: you _did_ try it from the command line as well
as check the file permissions and server configuration, didn't you?

Another possible reason is that it printed something else - like an
error message - in the Headers.   Check error logs, put a dummy header
right at the top (for debugging only), check the "Idiot's Guide",
and use the debug mode of your CGI library.
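
Part of this checklist can be automated: run the script from the command
line, capture what it prints, and test that against the rules above.  The
Python sketch below shows one way such a check could look.  It does not
describe the behaviour of any server or library mentioned here, and the
function name and messages are invented for illustration.

```python
import re

# The three special CGI headers named above
SPECIAL_HEADERS = {"content-type", "location", "status"}
# A header line starts with a field name followed by a colon
HEADER_LINE = re.compile(r"^([!#$%&'*+\-.^_`|~0-9A-Za-z]+):")


def check_cgi_output(raw):
    """Return a list of problems found in raw CGI output (bytes)."""
    if not raw:
        return ["no output: the script did not run, or crashed at once"]
    problems = []
    names = []
    for number, line_bytes in enumerate(raw.splitlines(), start=1):
        line = line_bytes.decode("iso-8859-1")
        if line == "":
            break  # the blank line: end of the header
        match = HEADER_LINE.match(line)
        if match:
            names.append(match.group(1).lower())
        else:
            problems.append(f"line {number} is not a header: {line!r}")
    else:
        # the loop ran off the end without meeting a blank line
        problems.append("no blank line after the header")
    if not names:
        problems.append("no header lines before the blank line")
    elif not SPECIAL_HEADERS.intersection(names):
        problems.append("none of Content-Type, Location or Status is present")
    return problems
```

An empty list means the output passed these checks.  Stray text such as
an error message printed ahead of the header is reported with its line
number.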

4.4: I tried to use (Content-Type|Location|whatever), but it appears in my Browser?

That means you put the line in the wrong place.  It must appear in the
CGI Header, not the document body.  See previous question.

It's also possible that you didn't print a header at all, or had a blank
line or other noise before or in the header, but that the HTTPD has
corrected this error for you (servers which correct your errors may give
rise to the "works on A not on B" phenomenon).   See previous question.

4.5: How can I run my CGI program 'live' in a debugger?

David S. Jackson offers the following tip:

> I have a very good trick for debugging CGIs written in C/C++ running on
> UNIX. You might want to add it to the debugging section of your CGI faq.
> 
> First, in your CGI code, at its start, add "sleep(30);". This will cause
> the CGI to do nothing for thirty seconds (you may need to adjust this
> time). Compile the CGI with debugging info ("-g" in gcc) and install the
> CGI as normal. Next, using your web browser, activate the CGI. It will of
> course just sit there doing nothing. While it is 'sleeping', find its PID
> (ps -a | grep <cgi name>). Load your debugger and attach to that PID
> ("attach <pid>" in gdb). You will also need to tell it where to find the
> symbol definitions ("symbol-file <cgi>" in gdb). Then set a breakpoint
> after the invocation of the sleep function and you are ready to debug. Do
> be aware that your browser will eventually time out if it doesn't receive
> anything.

(Does anyone know similar tricks for scripting languages?)
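
For a script, the first half of the same trick (pausing at startup so
that you have time to find the process) carries over directly.  The
Python sketch below is only an illustration: the environment variable
CGI_DEBUG_DELAY and the function name are invented here, and what you
then attach to the process depends on the tools available for your
language and platform.

```python
import os
import sys
import time


def pause_for_debugger(environ=None, sleep=time.sleep, log=sys.stderr):
    """Sleep at startup if CGI_DEBUG_DELAY (a hypothetical name) is set.

    Returns the number of seconds slept, or 0 if no pause was requested.
    """
    environ = os.environ if environ is None else environ
    try:
        delay = int(environ.get("CGI_DEBUG_DELAY", "0"))
    except ValueError:
        delay = 0
    if delay <= 0:
        return 0
    # stderr usually ends up in the server's error log
    print(f"pid {os.getpid()} waiting {delay}s for a debugger", file=log)
    log.flush()
    sleep(delay)
    return delay
```

Call pause_for_debugger() as the first statement of the script.  Logging
the PID saves the "ps | grep" step, and leaving the variable unset makes
the call a no-op in normal use.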

4.6: I'm using CGI with QUERY_STRING embedded in my HTML, but it gets corrupted?

The problem is the & character, which has two separate special meanings:
- In HTTP (and hence CGI) it is a separator in your QUERY_STRING
- In HTML it is an escape character

So when it appears in an HTML context, it should be encoded.  If you need
a link to myprog.cgi with QUERY_STRING "a=1&b=2" you should write
<a href="myprog.cgi?a=1&amp;b=2">my program</a>
which the browser's HTML parser will convert to what you wanted.
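
If the HTML is itself generated by a program, the two levels of encoding
can be applied in order: first build the URL, then escape it for HTML.
A minimal Python sketch, using only the standard library (the function
name make_link is an assumption for illustration):

```python
from html import escape, unescape
from urllib.parse import urlencode


def make_link(script, params, text):
    """Return an <a> element whose href carries `params` as a query string."""
    url = script + "?" + urlencode(params)  # HTTP level: a=1&b=2
    # HTML level: & becomes &amp; inside the attribute value
    return f'<a href="{escape(url, quote=True)}">{escape(text)}</a>'


link = make_link("myprog.cgi", [("a", "1"), ("b", "2")], "my program")
print(link)

# What an HTML parser recovers from the attribute value:
print(unescape("myprog.cgi?a=1&amp;b=2"))
```

The first line printed is the link shown above; the second is
myprog.cgi?a=1&b=2, the URL the browser will actually request.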

There are possible browser problems here, although they appear to be
limited to older browsers.  Some other approaches are:
- Use a different separator character in CGI programs when called in this
  manner.  Or even a completely different encoding.  This is safe, but may
  be much more work unless your CGI library supports setting a different
  separator character.
- Avoid parameter names that begin with the name of an HTML entity
  (a parameter called "copy", for instance, gives "&copy=" in the URL).
  This runs a possible risk if the set of entities changes in future,
  or when browsers introduce proprietary 'extensions'.
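
As an illustration of the first approach, the Python sketch below uses
";" as the separator, so that "a=1;b=2" can be embedded in HTML without
any escaping.  It assumes a Python version whose urllib.parse.parse_qsl
accepts the "separator" argument (3.9.2 or later); the two function names
are invented for this example.

```python
from urllib.parse import parse_qsl, quote_plus


def encode_query(params, separator=";"):
    """Build a query string from (name, value) pairs using `separator`."""
    return separator.join(
        f"{quote_plus(str(name))}={quote_plus(str(value))}"
        for name, value in params
    )


def decode_query(query, separator=";"):
    """Split a query string built by encode_query back into pairs."""
    return parse_qsl(query, keep_blank_values=True, separator=separator)
```

Because quote_plus percent-encodes both ";" and "&" inside names and
values, either character can still appear in the data itself.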
