A mini-tutorial: <URL:http://www.webthing.com/tutorials/login.html>
Copyright © Nick Kew, 1997. This document may be reproduced in its entirety, but may not be modified without explicit permission.
Abstract: For largely historical reasons, the concept of login is not built into the Web, and is poorly supported. Implementing a system supporting login is harder than it might at first appear. In CGI, it is a uniquely hard task, because (for security reasons) the authentication information is explicitly excluded. This tutorial gives an overview of login methods for secure [1] Web-based applications, followed by a more detailed description of how HTTP Authentication works. It then describes (with code extracts) a system using HTTP Authentication with CGI in a portable manner, to implement a complex system of dynamic protections for a Web-based FileServer. We conclude with a few exercises, designed to consolidate the reader's understanding of the subject.
HTTP is a stateless protocol, and login implies maintenance of state information, which must therefore be added on top of it. There are two main mechanisms [2,3] for this: HTTP Authentication and cookies.
In terms of HTTP, these are very similar: both work by passing an additional header that contains state information. However, they are less similar to work with, and each has its own advantages and drawbacks.
It is my considered opinion that for any serious Web application, this choice should be dictated by the impression you wish to present to the users of the system:
The chief advantage of cookies is flexibility: you are handling the whole process yourself and have more control. The ability to set a persistent cookie and save the user having to re-authenticate each time she logs in may be a major advantage in some cases, but is inherently insecure if, for example, more than one user might be sharing a browser.
A program may be written to work with either option. A construct I have used in a number of systems is represented by the Perl code:
sub getuser {
    my $defaultuser = shift ;
    $ENV{'REMOTE_USER'} || &cookie_user || $defaultuser || &authenticate ;
}
This little example summarises the difference in the approaches:
We'll return to a variant on this construct later.
In summary:
As with any programming task, there's more than one way to do it. You can customise a webserver - either directly or using an API if one is provided - or you can use CGI for a portable solution. This is of course purely an implementor's decision, and does not affect your users.
HTTP Authentication is built into HTTP Servers (and of course browsers). The underlying mechanism is:
From the above, we see that HTTP Authentication not merely supports, but is a simple login scheme. Setting it up is a server configuration issue, and is completely transparent to CGI, just as it is to a static document - be it HTML or any other media type. You will need to consult your server manuals for details of how to configure protection, but the key point to remember is that your CGI program will only ever run when the User has already been authenticated by the Server.
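To make the exchange concrete, here is a minimal sketch of the credentials a browser sends under the Basic scheme (the AuthType used later in this tutorial): the username and password are joined with a colon, base64-encoded, and sent in an Authorization request header. The function names base64_encode and basic_authorization_header are illustrative only, not part of any library mentioned here. Note that base64 is an encoding, not encryption.

```cpp
#include <string>

// Standard base64 alphabet.
static const char kBase64Alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode arbitrary bytes as base64, padding with '=' to a multiple of four.
std::string base64_encode(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    // Full three-byte groups become four output characters each.
    while (i + 2 < in.size()) {
        unsigned int n = (static_cast<unsigned char>(in[i]) << 16)
                       | (static_cast<unsigned char>(in[i + 1]) << 8)
                       |  static_cast<unsigned char>(in[i + 2]);
        out += kBase64Alphabet[(n >> 18) & 63];
        out += kBase64Alphabet[(n >> 12) & 63];
        out += kBase64Alphabet[(n >> 6) & 63];
        out += kBase64Alphabet[n & 63];
        i += 3;
    }
    std::size_t rest = in.size() - i;  // 0, 1 or 2 trailing bytes
    if (rest == 1) {
        unsigned int n = static_cast<unsigned char>(in[i]) << 16;
        out += kBase64Alphabet[(n >> 18) & 63];
        out += kBase64Alphabet[(n >> 12) & 63];
        out += "==";
    } else if (rest == 2) {
        unsigned int n = (static_cast<unsigned char>(in[i]) << 16)
                       | (static_cast<unsigned char>(in[i + 1]) << 8);
        out += kBase64Alphabet[(n >> 18) & 63];
        out += kBase64Alphabet[(n >> 12) & 63];
        out += kBase64Alphabet[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// Build the request header a browser would send for Basic credentials.
std::string basic_authorization_header(const std::string& user,
                                       const std::string& password) {
    return "Authorization: Basic " + base64_encode(user + ":" + password);
}
```

The server reverses the encoding, checks the username and password against its own user database, and only then lets the request through.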
If you are serving static documents, or indeed dynamic documents whose permissions can be determined in advance (e.g. certain specified user(s) or group(s) are always permitted), you generally can and should use the Server's configuration in preference to CGI.
The User's identity (the username entered in the browser dialogue box) is available to CGI in the REMOTE_USER environment variable. No other information is available to CGI[6], due to security risks (although 'extended' CGI-like tools may sometimes give you this information anyway).
Hence what HTTP Authentication gives you automatically is:
For many purposes, this alone is perfectly adequate. However, what it does not provide for includes:
The first two of these are in fact impossible [7], and must be worked around using either server configuration or CGI (as before, server configuration should be preferred if it can do what you need). The third can be accomplished with CGI, but has no non-programming alternatives.
HTTP has no provision to cancel a user's credentials, and there is no general[8] way to do so. The workaround is to overwrite the user's credentials with those of another valid user at your site. Create a valid but unprivileged user ID, and a Logout URL which is permitted only to this user. This URL is now a logout button. This of course still leaves you the human task of persuading your users to use it.
It is not possible[7] to provide open and authenticated access to a single URL. However, it is entirely possible to offer mixed access to a document or program, by mapping both a protected and an unprotected URL to the same document or program. In the case of a program, it may of course behave differently according to the value of REMOTE_USER (which is set only when access is authenticated).
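As a rough illustration, a program reachable through both URLs might branch on REMOTE_USER along the following lines. This is only a sketch: the function names remote_user and page_for, and the wording of the messages, are invented for the example.

```cpp
#include <cstdlib>
#include <string>

// Return the authenticated username, or an empty string when the request
// arrived through the unprotected URL (REMOTE_USER is then unset).
std::string remote_user() {
    const char* user = std::getenv("REMOTE_USER");
    return user ? std::string(user) : std::string();
}

// Choose the plain-text response body for either kind of access.
std::string page_for(const std::string& user) {
    if (user.empty())
        return "You are browsing as a guest.\n";
    return "Welcome back, " + user + ".\n";
}
```

A CGI program would print a Content-type header and a blank line, then page_for(remote_user()). If the body were HTML rather than plain text, the username would need escaping before being written out.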
I'm not even going to try and talk generalities about this. Instead I'll outline a working system.
The File Manager component of the Virtual Desktop at <URL:http://www.webthing.com/> has a fairly complex dynamic authentication requirement, requiring permissions to be computed from a database. The authentication function is required to determine:
Having cross-referenced these, it must either allow the attempted operation, or permit the user to re-authenticate (if access is denied, it may be available to the user under a different userid, so the user is immediately invited to re-login).
The first decision was to use HTTP Authentication, for the reasons already described. To do this, I protected the /desk/ URL under which the file manager resides, using a .htaccess[9] file:
AuthType Basic
AuthName WebThing
AuthDBMUserFile /my/path/to/passwdfile
AuthDBMGroupFile /my/path/to/passwdfile
require valid-user
When the server receives a request for a URL under the directory protected by this configuration file, it will:
Note that the server is permitting any valid user to access any desktop file: the more complex task of dynamic protections is handled by CGI. However, the Server has done the first crucial part of the work for us, by determining the identity of the user, and the rest is mere bookkeeping.
The core of the CGI authentication is the authenticate method of the CGI++ Library (<URL:http://www.webthing.com/cgiplusplus/>). Here it is in full:
void CGI::authenticate(
    const char* authtype,
    const char* realm,
    void callback (const int) = 0
) const {
    // Challenge syntax: the scheme, a space, then the realm as a quoted string.
    cout << "Status: 401 Authentication Required\n"
            "WWW-Authenticate: " << authtype
         << " realm=\"" << realm << "\"\n" ;
    if ( callback )
        callback(401) ;     // caller prints remaining headers and the error document
    else
        cout
            << "Content-type: text/plain\n"
               "\nPlease enter your username and password to access this document."
            ;
    exit(0) ;
}
The first two arguments to this are the same as the AuthType and AuthName directives from the Apache configuration file we saw earlier, and the first two lines output by CGI::authenticate() are equivalent (though not identical - this is not NPH-CGI) to the authentication challenge issued by the Server when no credentials were supplied. Specifically:
The rest of CGI::authenticate deals with printing a customised error document. Since CGI++ is a library, it has no knowledge of the application, and what kind of error document would be appropriate, so it permits the caller to supply a callback function for this. If no callback is supplied, CGI++ itself prints a minimalist 1-line errordoc.
Note that CGI error documents can only ever be seen by a logged in user attempting an unauthorised operation, since the CGI won't run in the first place until the user has authenticated with the Server.
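A callback for this purpose might look something like the sketch below. Judging from the library code above, the callback runs after the Status and WWW-Authenticate lines have been written, so it must itself supply the Content-type header and the blank line that ends the headers. The name errorfunc matches the one used in the outline that follows, but its body here, and the helper write_error_document, are assumptions made for illustration.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Write the remaining header, the blank line ending the header block,
// and a small HTML error document to the given stream.
void write_error_document(std::ostream& out, int status) {
    out << "Content-type: text/html\n"
           "\n"
           "<html><head><title>Access denied</title></head><body>\n"
           "<h1>Access denied (" << status << ")</h1>\n"
           "<p>You are not authorised for this operation. "
           "You may log in again under a different userid.</p>\n"
           "</body></html>\n";
}

// Callback with the signature CGI::authenticate accepts.
void errorfunc(const int status) {
    write_error_document(std::cout, status);
}
```

Separating the stream-based helper from the callback keeps the document easy to check without running it under a server.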
With these basic building blocks in place, the complex authentication task has been reduced to mere bookkeeping, of the kind familiar to every programmer. In pseudo-code outline:
if ( remote_user == owner )
    PASS ;      // I'm accessing my own data
else {
    // look up the level of access required to perform the required
    // operation on the specified data
    level = protection_of(required_op, specified_data) ;
    switch ( level ) {
        case Public:  PASS ; break ;    // anyone can do this
        case Private: FAIL ; break ;    // we've already dealt with the owner.
        default:
            // look up user's authorised level of access to owner's desktop
            if ( workgroups(owner).auth_level(remote_user) >= level )
                PASS ;
            else
                FAIL ;
    }
}
if ( PASS )
    do_what_i_asked ;   // authorized - do what was wanted
else
    // present the user with an authentication dialogue - permit re-login
    cgi.authenticate("Basic", "WebThing", errorfunc) ;
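The decision part of this outline can be sketched as compilable C++. This is an illustration under stated assumptions, not the File Manager's actual code: the Level values between Public and Private, the Grants map standing in for the workgroup database, and the function name is_permitted are all hypothetical.

```cpp
#include <map>
#include <string>

// Hypothetical access levels, ordered so that a larger value is more
// restrictive. Only Public and Private appear in the outline; the two
// intermediate levels are invented for the example.
enum Level { Public = 0, Workgroup = 1, Trusted = 2, Private = 3 };

// Hypothetical stand-in for the workgroup database: the level of access
// an owner has granted to each other user.
typedef std::map<std::string, Level> Grants;

// Decide whether remote_user may perform an operation that requires the
// given level on data belonging to owner.
bool is_permitted(const std::string& remote_user,
                  const std::string& owner,
                  Level required,
                  const Grants& owner_grants) {
    if (remote_user == owner)
        return true;                    // accessing my own data
    switch (required) {
        case Public:
            return true;                // anyone can do this
        case Private:
            return false;               // the owner was handled above
        default: {
            // Look up the user's authorised level on the owner's desktop.
            Grants::const_iterator it = owner_grants.find(remote_user);
            return it != owner_grants.end() && it->second >= required;
        }
    }
}
```

When is_permitted returns false, the caller would fall through to the authentication challenge, giving the user the chance to log in again under a different userid.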
Exercises 1-8 should use only your Server's configuration file(s) - CGI is not required. For the remainder, use your choice of CGI programming language.
For the following, set up a protected directory permitting any authenticated user. You will be using CGI to control access.
Guru exercise: find TWO reasons why HTTP authentication is more secure than an equivalently-encrypted cookie, and TWO reasons why the reverse is the case. Assume competent implementation in both cases (if you can improve on two, I'd like to know).