_Web-server_ Collection
=======================

Launchers
=========

The _Web_ _server_ collection provides two launchers.  The "web-server" launcher
starts the Web server.  The "web-server-monitor" launcher monitors a Web
server by periodically sending it requests.  Both launchers support the
-h and --help flags, which describe the other available options.

The command line 
  web-server [-p <port>] [-f <configuration-table-file>] [-a <ip-address>]
starts the server on <port> (port 80 by default) and uses the configuration
options from the specified configuration file or, by default, from the
"configuration-table" file of the web-server collection.  If <ip-address> is
provided, the server accepts connections only from <ip-address>.

The launcher web-server-text is the same as the web-server launcher, except
that it cannot load servlets written using DrScheme's graphical XML boxes.
In exchange, it uses less memory and also works with the mzscheme-only
distribution of PLT Scheme.

The command line
  web-server-monitor [-p <port>]
                     [-f <frequency>]
                     [-t <timeout>]
                     <alert-email> <host-name>
polls any Web server running on host <host-name> at port <port> (port 80 by
default) every <frequency> seconds (one hour by default).  If the server does
not respond to a HEAD HTTP request for the homepage within <timeout> seconds
(75 by default) or sends an error response, the monitor notifies
<alert-email> of the problem.

Another Start
=============

Requiring the library _web-server.ss_, via 
  (require (lib "web-server.ss" "web-server"))
provides the serve function, which starts the server with more configuration
options.

> serve : configuration [nat] [str or #f] -> (-> void)
  (define (serve configuration port ip-address) ...)
  
  The serve function starts the Web server, just like the launcher does,
  but the configuration argument supplies the server's settings.  The
  optional port argument overrides the port supplied by the configuration.
  The optional ip-address restricts accepted web requests to come only from
  that address.

  The result of invoking serve is a function of no arguments that shuts down
  the server.

  A later section describes the remaining semi-internal functions in the
  web-server.ss library.

Constructing configurations requires another library.
  (require (lib "configuration.ss" "web-server"))

> load-configuration : str -> configuration
  
  This function accepts a path to a configuration file and returns a
  configuration that serve accepts.  The configuration servlet creates
  configuration files.
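
As a rough sketch, the two libraries combine as follows.  The file name
"my-configuration-table" and the port 8080 are placeholders chosen for this
example, not defaults of the server.

  (require (lib "web-server.ss" "web-server")
           (lib "configuration.ss" "web-server"))

  ;; Start the server with the settings from a (hypothetical) configuration
  ;; file, overriding the port it specifies with 8080.
  (define shutdown-server
    (serve (load-configuration "my-configuration-table") 8080))

  ;; Later, invoke the function that serve returned to stop the server.
  (shutdown-server)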


Serving Content
===============

By default, the Web server serves files out of the directory
(build-path (collection-path "web-server") "default-web-root" "htdocs"),
unless the function serve was called with a virtual-hosts argument or
the configuration tool modified the location.

The Web server also generates HTTP responses dynamically, based on files in a
special directory.  By default, the special directory is named "servlets"
within the "default-web-root" of the "web-server" collection directory. 
Instead of serving the files in this directory verbatim, the server evaluates
the contained Scheme _servlet_ and serves the output.  A servlet is a unit/sig
that imports the servlet^ signature and exports nothing.  (Search in help-desk
for more information on unit/sig and on signatures.)  To construct a unit/sig
with the appropriate imports, the servlet must require the two modules
providing unit/sigs and the servlet^ signature:

(require (lib "unitsig.ss")
         (lib "servlet-sig.ss" "web-server"))
(unit/sig ()
  (import servlet^)
  ...insert servlet code here...)

The last value in the unit/sig must be a response to an HTTP request.
A Response is one of the following:

 - an X-expression representing HTML
   (Search for XML in help-desk.)

 - a (listof string) where
   - The first string is the mime type
     (often "text/html", but see RFC 2046 for other options).
   - The rest of the strings provide the document's content.

>  (make-response/full code message seconds mime extras body) where
   - code is a natural number indicating the HTTP response code
   - message is a string describing the code to a human
   - seconds is a natural number indicating the time the resource was created.
                Use (current-seconds) for dynamically created responses.
   - mime is a string indicating the response type.
   - extras is an environment containing extra headers for
               redirects, authentication, or cookies.
     An environment is a (listof (cons symbol string)).
   - body is a (listof string)

>  (make-response/incremental code message seconds mime extras gen) where
   - code, message, seconds, mime, and extras are all the same as for
     make-response/full
   - gen : (string -> void) -> void
   The function gen consumes an output function.  The output function
   consumes a string and sends it to the client.  For HTTP/1.1 clients,
   the server uses the chunked encoding, which reliably marks the end of the
   document.  HTTP/1.0 clients, however, cannot distinguish between the end
   of the document and a lost connection.  These facts have two
   implications.  First, it is more efficient to send fewer, larger strings.
   Second, this response should not be used for data that must arrive
   reliably, because an HTTP/1.0 client may silently receive a truncated
   document.

   Also see make-html-response/incremental in the servlet-helpers section below.
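
The following sketch writes the same small page in each of the four response
forms.  It assumes that make-response/full and make-response/incremental are
in scope where the definitions appear; the variable names, the extra header,
and the page content are made up for this example.

  ;; 1. An X-expression representing HTML.
  (define as-xexpr
    `(html (head (title "Hello"))
           (body (p "Hello, world"))))

  ;; 2. A list of strings: the mime type first, then the content.
  (define as-strings
    (list "text/html"
          "<html><head><title>Hello</title></head>"
          "<body><p>Hello, world</p></body></html>"))

  ;; 3. A full response with an explicit code, message, and extra header.
  ;;    The extras argument is an environment: (listof (cons symbol string)).
  (define as-full
    (make-response/full
     200 "OK" (current-seconds) "text/html"
     (list (cons 'Cache-Control "no-cache"))
     (list "<html><body><p>Hello, world</p></body></html>")))

  ;; 4. An incremental response: gen receives the output function and
  ;;    calls it once per piece of the document.
  (define as-incremental
    (make-response/incremental
     200 "OK" (current-seconds) "text/html" '()
     (lambda (output)
       (output "<html><body>")
       (output "<p>Hello, world</p>")
       (output "</body></html>"))))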
   

Evaluating (require (lib "servlet-sig.ss" "web-server")) loads
the servlet^ signature consisting of the following imports:
  - initial-request : request, where a request is
    (make-request method uri headers bindings host-ip client-ip), where
    - method : (Union 'get 'post)
    - uri : URL (see the net collection in help-desk for details)
    - headers : environment
                optional HTTP headers for this request
    - bindings : environment
                 name-value pairs from the submitted form or from the
                 query part of the URL
    - bindings/raw : either a string or an environment
                     For file uploads, bindings/raw is identical to bindings.
                     Otherwise, it contains the unparsed get or post data.
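
As an illustration, the servlet below greets the visitor using the "name"
field of the submitted form or query string.  It assumes that the request
structure comes with the usual accessor request-bindings; the field name and
the page text are invented for this sketch.

(require (lib "unitsig.ss")
         (lib "servlet-sig.ss" "web-server"))
(unit/sig ()
  (import servlet^)

  ;; bindings is an environment: (listof (cons symbol string))
  (define bindings (request-bindings initial-request))

  ;; assq returns the first pair whose key is 'name, or #f if there is none
  (define name-pair (assq 'name bindings))

  ;; The last value of the unit is the response, here an X-expression.
  `(html (body (p "Hello, "
                  ,(if name-pair (cdr name-pair) "stranger")))))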

The path part of the URL supplies the file path to the servlet relative to the
"servlets" directory.  However, paths may also contain extra path components
that servlets may use as additional input.  For example, all of the following
URLs refer to the same servlet:

  http://www.plt-scheme.org/servlets/my-servlet
  http://www.plt-scheme.org/servlets/my-servlet/extra
  http://www.plt-scheme.org/servlets/my-servlet/extra/directories

The above imports support handling a single input from a Web form. To ease the
development of more interactive servlets, the servlet^ signature also provides
the following functions:

> send/suspend : (str -> response) -> request

  The argument, a function that consumes a string, is given a URL that can be
  used in the document.  The argument function must produce a response
  corresponding to the document's body.  Requests to the given URL resume the
  computation at the point send/suspend was invoked.  Thus, the argument
  function normally produces an HTML form with the "action" attribute set to
  the provided URL.  The result of send/suspend represents the next request.
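
  A minimal sketch of a two-page interaction follows.  It relies on
  extract-binding/single from the servlet-helpers module described below and
  assumes the accessor request-bindings; the helper ask, the field name, and
  the questions are hypothetical.

    (require (lib "unitsig.ss")
             (lib "servlet-sig.ss" "web-server")
             (lib "servlet-helpers.ss" "web-server"))
    (unit/sig ()
      (import servlet^)

      ;; ask : str -> str
      ;; Sends a form whose action is the URL supplied by send/suspend and
      ;; returns the text the user typed once the form is submitted.
      (define (ask question)
        (let ([req (send/suspend
                    (lambda (k-url)
                      `(html
                        (body
                         (form ((action ,k-url) (method "post"))
                           (p ,question)
                           (input ((type "text") (name "answer")))
                           (input ((type "submit"))))))))])
          (extract-binding/single 'answer (request-bindings req))))

      ;; Each call to ask suspends the servlet until the next request.
      (let* ([who (ask "What is your name?")]
             [color (ask "What is your favorite color?")])
        `(html (body (p ,who " likes " ,color ".")))))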

> send/finish : response -> void

  This provides a convenient way to report an error or otherwise produce
  a final response.  Once called, all URLs generated by send/suspend
  become invalid and the servlet terminates.  Calling send/finish allows the
  system to reclaim resources consumed by the servlet.
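
  For example, a servlet might reject bad input with send/finish and carry
  on otherwise.  This sketch assumes the accessor request-bindings and uses
  extract-bindings from the servlet-helpers module described below; the
  field name "age" and the messages are made up.

    (require (lib "unitsig.ss")
             (lib "servlet-sig.ss" "web-server")
             (lib "servlet-helpers.ss" "web-server"))
    (unit/sig ()
      (import servlet^)

      (define bindings (request-bindings initial-request))

      ;; age is the number typed into the "age" field.  When the field is
      ;; missing, repeated, or not numeric, send/finish produces the final
      ;; page and the servlet terminates.
      (define age
        (let* ([vals (extract-bindings 'age bindings)]
               [n (and (= (length vals) 1)
                       (string->number (car vals)))])
          (if n
              n
              (send/finish
               `(html (body (p "Please supply one numeric age.")))))))

      `(html (body (p "Next year you will be "
                      ,(number->string (+ age 1))
                      "."))))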

> adjust-timeout! : nat -> void

  The server will shut down each instance of a servlet after an unspecified
  default amount of time since the last time the servlet handled a request.
  Calling adjust-timeout! allows the programmer to choose this number of
  seconds.  Larger numbers consume more resources while smaller numbers force
  the user to restart computations more often.
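
  For instance, a servlet that expects users to pause between pages could
  lengthen the timeout before suspending.  The one-hour figure and the page
  text are arbitrary choices for this sketch.

    (require (lib "unitsig.ss")
             (lib "servlet-sig.ss" "web-server"))
    (unit/sig ()
      (import servlet^)

      ;; Keep this instance alive for an hour after its last request.
      (adjust-timeout! (* 60 60))

      ;; Suspend until the user follows the link to the generated URL.
      (send/suspend
       (lambda (k-url)
         `(html (body (p "Take your time, then "
                         (a ((href ,k-url)) "continue")
                         ".")))))

      `(html (body (p "Welcome back."))))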

The servlet-helpers module, required with
  (require (lib "servlet-helpers.ss" "web-server"))
provides a few additional functions helpful for constructing servlets:

An environment is a (listof (cons symbol string)), as before.

> extract-binding/single : sym environment -> str
  This extracts a single value associated with sym in the form bindings.
  If multiple or zero values are associated with the name, it raises an
  exception.

> extract-bindings : sym environment -> (listof str)
  returns a list of values associated with the name sym.
  
> exists-binding? : sym environment -> bool
  returns #t if the name sym is bound in the environment and #f otherwise.
  This is useful for checkboxes.

> extract-user-pass : environment -> (U #f (cons str str))
  (define (extract-user-pass headers) ...)
  Servlets may easily implement password-based authentication by extracting
  password information from the HTTP headers.  The return value is either a
  pair consisting of the username and password from the headers or #f if no
  password was provided.  This only extracts the provided username and
  password.  The servlet must perform any desired checking.

> redirect-to : str [redirection-status] -> response
  constructs a response that redirects to the given URL (str).
  The optional argument specifies which kind of redirection to perform:
    - permanently (Browsers should send future requests directly to this url.) 
    - temporarily (Browsers should send future requests to the original url.)
    - see-other (The redirection is not replacing the current url.)
  See the HTTP/1.1 specification for details on each kind of redirection.
  Permanently is the default redirection type.

> make-html-response/incremental
  : ((string -> void) -> void) -> response/incremental

  This fills in default values for make-response/incremental appropriate
  for HTML.
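
As a sketch of how these helpers fit into a servlet, the example below asks
for HTTP basic authentication.  It assumes the accessor request-headers and
that make-response/full is in scope; the realm, the user name, and the
password are placeholders, and a real servlet would check credentials
against its own store.

(require (lib "unitsig.ss")
         (lib "servlet-sig.ss" "web-server")
         (lib "servlet-helpers.ss" "web-server"))
(unit/sig ()
  (import servlet^)

  ;; credentials : (U #f (cons str str))
  ;; The pair holds the user name and the password, if any were sent.
  (define credentials
    (extract-user-pass (request-headers initial-request)))

  (if (and credentials
           (string=? (car credentials) "guest")
           (string=? (cdr credentials) "secret"))
      ;; Known user: serve the protected page.
      `(html (body (p "Welcome, " ,(car credentials) ".")))
      ;; Otherwise answer 401 so that the browser prompts for a password.
      (make-response/full
       401 "Unauthorized" (current-seconds) "text/html"
       (list (cons 'WWW-Authenticate "Basic realm=\"Members\""))
       (list "<html><body><p>Authentication required.</p></body></html>"))))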

For small example servlets, look in the "examples" directory in
the "servlets" directory in the "default-web-root" of the web-server
collection.

Special URLs
============

The Web server caches passwords and servlets for performance reasons.
Requesting the URL
  http://my-host/conf/refresh-passwords
reloads the password file.  After updating a servlet, loading the URL
  http://my-host/conf/refresh-servlets
causes the server to reload each servlet on the next invocation.
Reloading discards any per-servlet state (but not per-servlet-instance state)
computed before the unit invocation.

TeachPack
=========

Choose "Add TeachPack..." from DrScheme's "Language" menu and select the
  plt/teachpack/servlet.ss
teachpack.  This provides functions for writing servlets including
send/suspend and send/finish.  All the extra servlet helper functions for
extracting information from Web requests and building Web responses also
become available through the teachpack.

Choosing "Create Servlet..." from DrScheme's "Scheme" menu saves the
program in the definitions window as a servlet.  This adds all the necessary
code to produce a unit suitable for the Web server's "servlets" directory.
The servlet can use any selected teachpacks and language-specific
constructs just as the program in the definitions window can.


File locations
==============

By default, the configuration tool organizes files containing the Web server's
configurations, documents, and servlets in one directory tree per virtual host.
The organization of files follows:

web-directory
  configuration-table
  default-web-root
    conf
      servlet-error.html
      forbidden.html
      servlet-refresh.html
      passwords-refresh.html
      not-found.html
      protocol-error.html
    htdocs
      Defaults
        index.html
        documentation
          ...
    log
    passwords
    servlets
      configure.ss
  my-other-host
    conf ...
    htdocs ...
    log
    passwords
    servlets
      configure.ss
  still-another-host ...

Files may be relocated or shared between hosts by editing the boring details
for that host using the configuration tool.

Semi-Internal Functions
=======================

The following functions expose more of the Web server for use by the
development environment.  They are not intended for general use.
They may change at any time or disappear entirely.

> server-loop : custodian tcp-listener config initial-timeout -> void
  where custodian is the parent custodian for servlets.
        tcp-listener is where requests arrive.
        config encapsulates most of the state of the server.
        initial-timeout is the number of seconds before timing out connections.

> make-config : host-table script-table instance-table access-table -> config
  where
    config = (make-config host-table script-table instance-table access-table)
    host-table = str -> host
                 maps host names to hosts
    script-table = (hash-table-of sym script)
                   maps servlet names to servlet units
    script = (unit servlet^ -> response)
             represents a servlet that is invoked on each request
    instance-table = (hash-table-of sym servlet-instance)
                     maps the path part of a URL to the running servlet
    access-table = (hash-table-of sym (str sym str -> (U #f str)))
                   maps host names to functions that accept a protection
                   domain, a user name, and a password and return either
                   #f if access is permitted or a string prompting for a
                   particular password (e.g. "Course Grades").
    servlet-instance =
      (make-servlet-instance
        nat channel (hash-table-of sym (continuation request)))
      The natural number counts the continuations suspended by this servlet.  
      The channel communicates HTTP requests from the connection thread to
      the suspended servlet.  The hash-table maps parameter parts of URLs
      to suspended continuations.

> add-new-instance : sym instance-table -> void
  This creates a new servlet-instance and installs it in the instance-table
  under the name specified by sym.

> gen-send/suspend : url sym instance-table (response -> void) (-> void)
                     (channel -> void)
                  -> (str -> response)
                  -> request
  (define (gen-send/suspend uri invoke-id instances output-page
                            update-time! update-channel!)
    ...)

  This produces a function like send/suspend : ((str -> response) -> request),
  customized for a particular instance of a servlet.  The uri must refer to
  the servlet, and instances must map invoke-id to that servlet.  The
  output-page function is called to send responses to the Web browser
  (remotely via HTTP in the normal server, locally via some other means
  in the development environment).  gen-send/suspend calls update-time!
  (to reset timeouts) upon each Web request.  update-channel! receives the
  channel used to send responses to the connection thread.
