[svn] New option --random-wait.
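
A minimal sketch of the timing behaviour documented for the new option, for illustration only. It is not Wget's C implementation. The function name random_wait_interval is hypothetical, and the uniform distribution is an assumption: the documentation added below only states that the delay varies between 0 and 2 * wait seconds.

```python
import random


def random_wait_interval(wait, rng=random):
    """Return a delay in seconds drawn from [0, 2 * wait].

    `wait` stands for the value given with -w / --wait.
    Assumption: the delay is uniformly distributed over that range.
    """
    if wait < 0:
        raise ValueError("wait must be non-negative")
    return rng.uniform(0.0, 2.0 * wait)


if __name__ == "__main__":
    # Seeded generator so repeated runs print the same delays.
    rng = random.Random(2001)
    delays = [random_wait_interval(5, rng) for _ in range(5)]
    print(["%.2f" % d for d in delays])
```

Under the uniform assumption the average delay stays close to the configured wait, so the overall request rate is roughly unchanged while the gaps between individual requests become irregular.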

diff --git a/doc/wget.texi b/doc/wget.texi

index 68a74cbfa1686e3e451d90a3eea41485c63c56ca..9b964fde258c3ff92ec4a19d433aed23b0f77ccd 100644
--- a/doc/wget.texi
+++ b/doc/wget.texi
@@ -15,8 +15,8 @@
  @end iftex
  
  @c This should really be auto-generated!
-@set VERSION 1.5.3+dev
-@set UPDATED Feb 2000
+@set VERSION 1.8-dev
+@set UPDATED November 2001
  
  @dircategory Net Utilities
  @dircategory World Wide Web
@@ -28,7 +28,9 @@
  This file documents the the GNU Wget utility for downloading network
  data.
  
-Copyright (C) 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
+@c man begin COPYRIGHT
+Copyright @copyright{} 1996, 1997, 1998, 2000, 2001 Free Software
+Foundation, Inc.
  
  Permission is granted to make and distribute verbatim copies of
  this manual provided the copyright notice and this permission notice
@@ -42,10 +44,12 @@ notice identical to this one except for the removal of this paragraph
  @end ignore
  Permission is granted to copy, distribute and/or modify this document
  under the terms of the GNU Free Documentation License, Version 1.1 or
-any later version published by the Free Software Foundation; with no
-Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
-Texts.  A copy of the license is included in the section entitled ``GNU
-Free Documentation License''.
+any later version published by the Free Software Foundation; with the
+Invariant Sections being ``GNU General Public License'' and ``GNU Free
+Documentation License'', with no Front-Cover Texts, and with no
+Back-Cover Texts.  A copy of the license is included in the section
+entitled ``GNU Free Documentation License''.
+@c man end
  @end ifinfo
  
  @titlepage
@@ -54,16 +58,27 @@ Free Documentation License''.
  @subtitle Updated for Wget @value{VERSION}, @value{UPDATED}
  @author by Hrvoje Nik@v{s}i@'{c} and the developers
  
+@ignore
+@c man begin AUTHOR
+Originally written by Hrvoje Niksic <hniksic@arsdigita.com>.
+@c man end
+@c man begin SEEALSO
+GNU Info entry for @file{wget}.
+@c man end
+@end ignore
+
  @page
  @vskip 0pt plus 1filll
-Copyright @copyright{} 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
+Copyright @copyright{} 1996, 1997, 1998, 2000, 2001 Free Software
+Foundation, Inc.
  
  Permission is granted to copy, distribute and/or modify this document
  under the terms of the GNU Free Documentation License, Version 1.1 or
-any later version published by the Free Software Foundation; with no
-Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
-Texts.  A copy of the license is included in the section entitled ``GNU
-Free Documentation License''.
+any later version published by the Free Software Foundation; with the
+Invariant Sections being ``GNU General Public License'' and ``GNU Free
+Documentation License'', with no Front-Cover Texts, and with no
+Back-Cover Texts.  A copy of the license is included in the section
+entitled ``GNU Free Documentation License''.
  @end titlepage
  
  @ifinfo
@@ -73,7 +88,8 @@ Free Documentation License''.
  This manual documents version @value{VERSION} of GNU Wget, the freely
  available utility for network download.
  
-Copyright @copyright{} 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
+Copyright @copyright{} 1996, 1997, 1998, 2000, 2001 Free Software
+Foundation, Inc.
  
  @menu
  * Overview::            Features of Wget.
@@ -95,6 +111,7 @@ Copyright @copyright{} 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
  @cindex overview
  @cindex features
  
+@c man begin DESCRIPTION
  GNU Wget is a freely available network utility to retrieve files from
  the World Wide Web, using @sc{http} (Hyper Text Transfer Protocol) and
  @sc{ftp} (File Transfer Protocol), the two most widely used Internet
@@ -108,8 +125,10 @@ while the user is not logged on.  This allows you to start a retrieval
  and disconnect from the system, letting Wget finish the work.  By
  contrast, most of the Web browsers require constant user's presence,
  which can be a great hindrance when transferring a lot of data.
+@c man end
  
  @sp 1
+@c man begin DESCRIPTION
  @item
  Wget is capable of descending recursively through the structure of
  @sc{html} documents and @sc{ftp} directory trees, making a local copy of
@@ -117,8 +136,10 @@ the directory hierarchy similar to the one on the remote server.  This
  feature can be used to mirror archives and home pages, or traverse the
  web in search of data, like a @sc{www} robot (@pxref{Robots}).  In that
  spirit, Wget understands the @code{norobots} convention.
+@c man end
  
  @sp 1
+@c man begin DESCRIPTION
  @item
  File name wildcard matching and recursive mirroring of directories are
  available when retrieving via @sc{ftp}.  Wget can read the time-stamp
@@ -127,49 +148,74 @@ locally.  Thus Wget can see if the remote file has changed since last
  retrieval, and automatically retrieve the new version if it has.  This
  makes Wget suitable for mirroring of @sc{ftp} sites, as well as home
  pages.
+@c man end
  
  @sp 1
+@c man begin DESCRIPTION
  @item
  Wget works exceedingly well on slow or unstable connections,
  retrying the document until it is fully retrieved, or until a
  user-specified retry count is surpassed.  It will try to resume the
  download from the point of interruption, using @code{REST} with @sc{ftp}
  and @code{Range} with @sc{http} servers that support them.
+@c man end
  
  @sp 1
+@c man begin DESCRIPTION
  @item
  By default, Wget supports proxy servers, which can lighten the network
  load, speed up retrieval and provide access behind firewalls.  However,
  if you are behind a firewall that requires that you use a socks style
-gateway, you can get the socks library and build wget with support for
+gateway, you can get the socks library and build Wget with support for
  socks.  Wget also supports the passive @sc{ftp} downloading as an
  option.
+@c man end
  
  @sp 1
+@c man begin DESCRIPTION
  @item
  Builtin features offer mechanisms to tune which links you wish to follow
  (@pxref{Following Links}).
+@c man end
  
  @sp 1
+@c man begin DESCRIPTION
  @item
  The retrieval is conveniently traced with printing dots, each dot
  representing a fixed amount of data received (1KB by default).  These
  representations can be customized to your preferences.
+@c man end
  
  @sp 1
+@c man begin DESCRIPTION
  @item
  Most of the features are fully configurable, either through command line
  options, or via the initialization file @file{.wgetrc} (@pxref{Startup
  File}).  Wget allows you to define @dfn{global} startup files
  (@file{/usr/local/etc/wgetrc} by default) for site settings.
+@c man end
+
+@ignore
+@c man begin FILES
+@table @samp
+@item /usr/local/etc/wgetrc
+Default location of the @dfn{global} startup file.
+
+@item .wgetrc
+User startup file.
+@end table
+@c man end
+@end ignore
  
  @sp 1
+@c man begin DESCRIPTION
  @item
  Finally, GNU Wget is free software.  This means that everyone may use
  it, redistribute it and/or modify it under the terms of the GNU General
  Public License, as published by the Free Software Foundation
  (@pxref{Copying}).
  @end itemize
+@c man end
  
  @node Invoking, Recursive Retrieval, Overview, Top
  @chapter Invoking
@@ -181,7 +227,9 @@ Public License, as published by the Free Software Foundation
  By default, Wget is very simple to invoke.  The basic syntax is:
  
  @example
+@c man begin SYNOPSIS
  wget [@var{option}]@dots{} [@var{URL}]@dots{}
+@c man end
  @end example
  
  Wget will simply download all the @sc{url}s specified on the command
@@ -282,7 +330,7 @@ with your favorite browser, like @code{Lynx} or @code{Netscape}.
  Since Wget uses GNU getopts to process its arguments, every option has a
  short form and a long form.  Long options are more convenient to
  remember, but take time to type.  You may freely mix different option
-styles, or specify options after the command-line arguments. Thus you
+styles, or specify options after the command-line arguments.  Thus you
  may write:
  
  @example
@@ -325,6 +373,8 @@ and @file{/~somebody}.  You can also clear the lists in @file{.wgetrc}
  wget -X '' -X /~nobody,/~somebody
  @end example
  
+@c man begin OPTIONS
+
  @node Basic Startup Options, Logging and Input File Options, Option Syntax, Invoking
  @section Basic Startup Options
  
@@ -464,29 +514,30 @@ automatically sets the number of tries to 1.
  @cindex no-clobber
  @item -nc
  @itemx --no-clobber
-If a file is downloaded more than once in the same directory, wget's
+If a file is downloaded more than once in the same directory, Wget's
  behavior depends on a few options, including @samp{-nc}.  In certain
-cases, the local file will be "clobbered", or overwritten, upon repeated
-download.  In other cases it will be preserved.
+cases, the local file will be @dfn{clobbered}, or overwritten, upon
+repeated download.  In other cases it will be preserved.
  
-When running wget without @samp{-N}, @samp{-nc}, or @samp{-r},
+When running Wget without @samp{-N}, @samp{-nc}, or @samp{-r},
  downloading the same file in the same directory will result in the
-original copy of @samp{@var{file}} being preserved and the second copy
-being named @samp{@var{file}.1}.  If that file is downloaded yet again,
-the third copy will be named @samp{@var{file}.2}, and so on.  When
-@samp{-nc} is specified, this behavior is suppressed, and wget will
+original copy of @var{file} being preserved and the second copy being
+named @samp{@var{file}.1}.  If that file is downloaded yet again, the
+third copy will be named @samp{@var{file}.2}, and so on.  When
+@samp{-nc} is specified, this behavior is suppressed, and Wget will
  refuse to download newer copies of @samp{@var{file}}.  Therefore,
-"no-clobber" is actually a misnomer in this mode -- it's not clobbering
-that's prevented (as the numeric suffixes were already preventing
-clobbering), but rather the multiple version saving that's prevented.
+``@code{no-clobber}'' is actually a misnomer in this mode---it's not
+clobbering that's prevented (as the numeric suffixes were already
+preventing clobbering), but rather the multiple version saving that's
+prevented.
  
-When running wget with @samp{-r}, but without @samp{-N} or @samp{-nc},
+When running Wget with @samp{-r}, but without @samp{-N} or @samp{-nc},
  re-downloading a file will result in the new copy simply overwriting the
  old.  Adding @samp{-nc} will prevent this behavior, instead causing the
  original version to be preserved and any newer copies on the server to
  be ignored.
  
-When running wget with @samp{-N}, with or without @samp{-r}, the
+When running Wget with @samp{-N}, with or without @samp{-r}, the
  decision as to whether or not to download a newer copy of a file depends
  on the local and remote timestamp and size of the file
  (@pxref{Time-Stamping}).  @samp{-nc} may not be specified at the same
@@ -497,55 +548,94 @@ Note that when @samp{-nc} is specified, files with the suffixes
  and parsed as if they had been retrieved from the Web.
  
  @cindex continue retrieval
+@cindex incomplete downloads
+@cindex resume download
  @item -c
  @itemx --continue
-Continue getting an existing file.  This is useful when you want to
-finish up the download started by another program, or a previous
-instance of Wget.  Thus you can write:
+Continue getting a partially-downloaded file.  This is useful when you
+want to finish up a download started by a previous instance of Wget, or
+by another program.  For instance:
  
  @example
  wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z
  @end example
  
-If there is a file name @file{ls-lR.Z} in the current directory, Wget
+If there is a file named @file{ls-lR.Z} in the current directory, Wget
  will assume that it is the first portion of the remote file, and will
-require the server to continue the retrieval from an offset equal to the
+ask the server to continue the retrieval from an offset equal to the
  length of the local file.
  
-Note that you need not specify this option if all you want is Wget to
-continue retrieving where it left off when the connection is lost---Wget
-does this by default.  You need this option only when you want to
-continue retrieval of a file already halfway retrieved, saved by another
-@sc{ftp} client, or left by Wget being killed.
-
-Without @samp{-c}, the previous example would just begin to download the
-remote file to @file{ls-lR.Z.1}.  The @samp{-c} option is also
-applicable for @sc{http} servers that support the @code{Range} header.
-
-Note that if you use @samp{-c} on a file that's already downloaded
-completely, @samp{@var{file}} will not be changed, nor will a second
-@samp{@var{file}.1} copy be created.
-
+Note that you don't need to specify this option if you just want the
+current invocation of Wget to retry downloading a file should the
+connection be lost midway through.  This is the default behavior.
+@samp{-c} only affects resumption of downloads started @emph{prior} to
+this invocation of Wget, and whose local files are still sitting around.
+
+Without @samp{-c}, the previous example would just download the remote
+file to @file{ls-lR.Z.1}, leaving the truncated @file{ls-lR.Z} file
+alone.
+
+Beginning with Wget 1.7, if you use @samp{-c} on a non-empty file, and
+it turns out that the server does not support continued downloading,
+Wget will refuse to start the download from scratch, which would
+effectively ruin existing contents.  If you really want the download to
+start from scratch, remove the file.
+
+Also beginning with Wget 1.7, if you use @samp{-c} on a file that is the
+same size as the one on the server, Wget will refuse to download the
+file and print an explanatory message.  The same happens when the file
+is smaller on the server than locally (presumably because it was changed
+on the server since your last download attempt)---because ``continuing''
+is not meaningful, no download occurs.
+
+On the other side of the coin, while using @samp{-c}, any file that's
+bigger on the server than locally will be considered an incomplete
+download and only @code{(length(remote) - length(local))} bytes will be
+downloaded and tacked onto the end of the local file.  This behavior can
+be desirable in certain cases---for instance, you can use @samp{wget -c}
+to download just the new portion that's been appended to a data
+collection or log file.
+
+However, if the file is bigger on the server because it's been
+@emph{changed}, as opposed to just @emph{appended} to, you'll end up
+with a garbled file.  Wget has no way of verifying that the local file
+is really a valid prefix of the remote file.  You need to be especially
+careful of this when using @samp{-c} in conjunction with @samp{-r},
+since every file will be considered as an ``incomplete download'' candidate.
+
+Another instance where you'll get a garbled file if you try to use
+@samp{-c} is if you have a lame @sc{http} proxy that inserts a
+``transfer interrupted'' string into the local file.  In the future a
+``rollback'' option may be added to deal with this case.
+
+Note that @samp{-c} only works with @sc{ftp} servers and with @sc{http}
+servers that support the @code{Range} header.
+
+@cindex progress indicator
  @cindex dot style
-@cindex retrieval tracing style
-@item --dot-style=@var{style}
-Set the retrieval style to @var{style}.  Wget traces the retrieval of
-each document by printing dots on the screen, each dot representing a
-fixed amount of retrieved data.  Any number of dots may be separated in
-a @dfn{cluster}, to make counting easier.  This option allows you to
-choose one of the pre-defined styles, determining the number of bytes
-represented by a dot, the number of dots in a cluster, and the number of
-dots on the line.
-
-With the @code{default} style each dot represents 1K, there are ten dots
-in a cluster and 50 dots in a line.  The @code{binary} style has a more
-``computer''-like orientation---8K dots, 16-dots clusters and 48 dots
-per line (which makes for 384K lines).  The @code{mega} style is
-suitable for downloading very large files---each dot represents 64K
-retrieved, there are eight dots in a cluster, and 48 dots on each line
-(so each line contains 3M).  The @code{micro} style is exactly the
-reverse; it is suitable for downloading small files, with 128-byte dots,
-8 dots per cluster, and 48 dots (6K) per line.
+@item --progress=@var{type}
+Select the type of the progress indicator you wish to use.  Legal
+indicators are ``dot'' and ``bar''.
+
+The ``dot'' indicator is used by default.  It traces the retrieval by
+printing dots on the screen, each dot representing a fixed amount of
+downloaded data.
+
+When using the dotted retrieval, you may also set the @dfn{style} by
+specifying the type as @samp{dot:@var{style}}.  Different styles assign
+different meaning to one dot.  With the @code{default} style each dot
+represents 1K, there are ten dots in a cluster and 50 dots in a line.
+The @code{binary} style has a more ``computer''-like orientation---8K
+dots, 16-dots clusters and 48 dots per line (which makes for 384K
+lines).  The @code{mega} style is suitable for downloading very large
+files---each dot represents 64K retrieved, there are eight dots in a
+cluster, and 48 dots on each line (so each line contains 3M).
+
+Specifying @samp{--progress=bar} will draw an ASCII progress bar
+(also known as a ``thermometer'' display) to indicate retrieval.  If the
+output is not a TTY, this option will be ignored, and Wget will revert
+to the dot indicator.  If you want to force the bar indicator, use
+@samp{--progress=bar:force}.
  
  @item -N
  @itemx --timestamping
@@ -602,7 +692,7 @@ reasonably expect the network error to be fixed before the retry.
  @item --waitretry=@var{seconds}
  If you don't want Wget to wait between @emph{every} retrieval, but only
  between retries of failed downloads, you can use this option.  Wget will
-use "linear backoff", waiting 1 second after the first failure on a
+use @dfn{linear backoff}, waiting 1 second after the first failure on a
  given file, then waiting 2 seconds after the second failure on that
  file, up to the maximum number of @var{seconds} you specify.  Therefore,
  a value of 10 will actually make Wget wait up to (1 + 2 + ... + 10) = 55
@@ -611,10 +701,30 @@ seconds per file.
  Note that this option is turned on by default in the global
  @file{wgetrc} file.
  
+@cindex wait, random
+@cindex random wait
+@item --random-wait
+Some web sites may perform log analysis to identify retrieval programs
+such as Wget by looking for statistically significant similarities in
+the time between requests.  This option causes the time between requests
+to vary between 0 and 2 * @var{wait} seconds, where @var{wait} was
+specified using the @samp{-w} or @samp{--wait} options, in order to mask
+Wget's presence from such analysis.
+
+A recent article in a publication devoted to development on a popular
+consumer platform provided code to perform this analysis on the fly.
+Its author suggested blocking at the class C address level to ensure
+automated retrieval programs were blocked despite changing DHCP-supplied
+addresses.
+
+The @samp{--random-wait} option was inspired by this ill-advised
+recommendation to block many unrelated users from a web site due to the
+actions of one.
+
  @cindex proxy
  @item -Y on/off
  @itemx --proxy=on/off
-Turn proxy support on or off. The proxy is on by default if the
+Turn proxy support on or off.  The proxy is on by default if the
  appropriate environmental variable is defined.
  
  @cindex quota
@@ -641,10 +751,10 @@ Setting quota to 0 or to @samp{inf} unlimits the download quota.
  @table @samp
  @item -nd
  @itemx --no-directories
-Do not create a hierarchy of directories when retrieving
-recursively. With this option turned on, all files will get saved to the
-current directory, without clobbering (if a name shows up more than
-once, the filenames will get extensions @samp{.n}).
+Do not create a hierarchy of directories when retrieving recursively.
+With this option turned on, all files will get saved to the current
+directory, without clobbering (if a name shows up more than once, the
+filenames will get extensions @samp{.n}).
  
  @item -x
  @itemx --force-directories
@@ -710,8 +820,8 @@ current directory).
  @item -E
  @itemx --html-extension
  If a file of type @samp{text/html} is downloaded and the URL does not
-end with the regexp "\.[Hh][Tt][Mm][Ll]?", this option will cause the
-suffix @samp{.html} to be appended to the local filename.  This is
+end with the regexp @samp{\.[Hh][Tt][Mm][Ll]?}, this option will cause
+the suffix @samp{.html} to be appended to the local filename.  This is
  useful, for instance, when you're mirroring a remote site that uses
  @samp{.asp} pages, but you want the mirrored pages to be viewable on
  your stock Apache server.  Another good use for this is when you're
@@ -720,7 +830,7 @@ downloading the output of CGIs.  A URL like
  @file{article.cgi?25.html}.
  
  Note that filenames changed in this way will be re-downloaded every time
-you re-mirror a site, because wget can't tell that the local
+you re-mirror a site, because Wget can't tell that the local
  @file{@var{X}.html} file corresponds to remote URL @samp{@var{X}} (since
  it doesn't yet know that the URL produces output of type
  @samp{text/html}.  To prevent this re-downloading, you must use
@@ -753,6 +863,30 @@ and flushing out-of-date documents on proxy servers.
  
  Caching is allowed by default.
  
+@cindex cookies
+@item --cookies=on/off
+When set to off, disable the use of cookies.  Cookies are a mechanism
+for maintaining server-side state.  The server sends the client a cookie
+using the @code{Set-Cookie} header, and the client responds with the
+same cookie upon further requests.  Since cookies allow the server
+owners to keep track of visitors and for sites to exchange this
+information, some consider them a breach of privacy.  The default is to
+use cookies; however, @emph{storing} cookies is not on by default.
+
+@cindex loading cookies
+@cindex cookies, loading
+@item --load-cookies @var{file}
+Load cookies from @var{file} before the first HTTP retrieval.  The
+format of @var{file} is the one used by Netscape and Mozilla, at least
+in their Unix versions.
+
+@cindex saving cookies
+@cindex cookies, saving
+@item --save-cookies @var{file}
+Save cookies to @var{file} at the end of the session.  Cookies whose
+expiry time is not specified, or those that have already expired, are
+not saved.
+
  @cindex Content-Length, ignore
  @cindex ignore length
  @item --ignore-length
@@ -834,24 +968,31 @@ discouraged, unless you really know what you are doing.
  @section FTP Options
  
  @table @samp
-@cindex symbolic links, retrieving
-@item --retr-symlinks
-Usually, when retrieving @sc{ftp} directories recursively and a symbolic
-link is encountered, the linked-to file is not downloaded.  Instead, a
-matching symbolic link is created on the local filesystem.  The
-pointed-to file will not be downloaded unless this recursive retrieval
-would have encountered it separately and downloaded it anyway.
-
-When @samp{--retr-symlinks} is specified, however, symbolic links are
-traversed and the pointed-to files are retrieved.  At this time, this
-option does not cause wget to traverse symlinks to directories and
-recurse through them, but in the future it should be enhanced to do
-this.
-
-Note that when retrieving a file (not a directory) because it was
-specified on the commandline, rather than because it was recursed to,
-this option has no effect.  Symbolic links are always traversed in this
-case.
+@cindex .listing files, removing
+@item -nr
+@itemx --dont-remove-listing
+Don't remove the temporary @file{.listing} files generated by @sc{ftp}
+retrievals.  Normally, these files contain the raw directory listings
+received from @sc{ftp} servers.  Not removing them can be useful for
+debugging purposes, or when you want to be able to easily check on the
+contents of remote server directories (e.g. to verify that a mirror
+you're running is complete).
+
+Note that even though Wget writes to a known filename for this file,
+this is not a security hole in the scenario of a user making
+@file{.listing} a symbolic link to @file{/etc/passwd} or something and
+asking @code{root} to run Wget in his or her directory.  Depending on
+the options used, either Wget will refuse to write to @file{.listing},
+making the globbing/recursion/time-stamping operation fail, or the
+symbolic link will be deleted and replaced with the actual
+@file{.listing} file, or the listing will be written to a
+@file{.listing.@var{number}} file.
+
+Even though this situation isn't a problem, @code{root} should never
+run Wget in a non-trusted user's directory.  A user could do something
+as simple as linking @file{index.html} to @file{/etc/passwd} and asking
+@code{root} to run Wget with @samp{-N} or @samp{-r} so the file will be
+overwritten.
  
  @cindex globbing, toggle
  @item -g on/off
@@ -879,6 +1020,25 @@ servers (and the ones emulating Unix @code{ls} output).
  Use the @dfn{passive} @sc{ftp} retrieval scheme, in which the client
  initiates the data connection.  This is sometimes required for @sc{ftp}
  to work behind firewalls.
+
+@cindex symbolic links, retrieving
+@item --retr-symlinks
+Usually, when retrieving @sc{ftp} directories recursively and a symbolic
+link is encountered, the linked-to file is not downloaded.  Instead, a
+matching symbolic link is created on the local filesystem.  The
+pointed-to file will not be downloaded unless this recursive retrieval
+would have encountered it separately and downloaded it anyway.
+
+When @samp{--retr-symlinks} is specified, however, symbolic links are
+traversed and the pointed-to files are retrieved.  At this time, this
+option does not cause Wget to traverse symlinks to directories and
+recurse through them, but in the future it should be enhanced to do
+this.
+
+Note that when retrieving a file (not a directory) because it was
+specified on the commandline, rather than because it was recursed to,
+this option has no effect.  Symbolic links are always traversed in this
+case.
  @end table
  
  @node Recursive Retrieval Options, Recursive Accept/Reject Options, FTP Options, Invoking
@@ -920,13 +1080,44 @@ created in the first place.
  @cindex link conversion
  @item -k
  @itemx --convert-links
-Convert the non-relative links to relative ones locally.  Only the
-references to the documents actually downloaded will be converted; the
-rest will be left unchanged.
+After the download is complete, convert the links in the document to
+make them suitable for local viewing.  This affects not only the visible
+hyperlinks, but any part of the document that links to external content,
+such as embedded images, links to style sheets, hyperlinks to non-HTML
+content, etc.
+
+Each link will be changed in one of two ways:
+
+@itemize @bullet
+@item
+The links to files that have been downloaded by Wget will be changed to
+refer to the file they point to as a relative link.
+
+Example: if the downloaded file @file{/foo/doc.html} links to
+@file{/bar/img.gif}, also downloaded, then the link in @file{doc.html}
+will be modified to point to @samp{../bar/img.gif}.  This kind of
+transformation works reliably for arbitrary combinations of directories.
+
+@item
+The links to files that have not been downloaded by Wget will be changed
+to include host name and absolute path of the location they point to.
+
+Example: if the downloaded file @file{/foo/doc.html} links to
+@file{/bar/img.gif} (or to @file{../bar/img.gif}), then the link in
+@file{doc.html} will be modified to point to
+@file{http://@var{hostname}/bar/img.gif}.
+@end itemize
+
+Because of this, local browsing works reliably: if a linked file was
+downloaded, the link will refer to its local name; if it was not
+downloaded, the link will refer to its full Internet address rather than
+presenting a broken link.  The fact that the former links are converted
+to relative links ensures that you can move the downloaded hierarchy to
+another directory.
  
  Note that only at the end of the download can Wget know which links have
-been downloaded.  Because of that, much of the work done by @samp{-k}
-will be performed at the end of the downloads.
+been downloaded.  Because of that, the work done by @samp{-k} will be
+performed at the end of all the downloads.
  
  @cindex backing up converted files
  @item -K
@@ -942,31 +1133,24 @@ and time-stamping, sets infinite recursion depth and keeps @sc{ftp}
  directory listings.  It is currently equivalent to
  @samp{-r -N -l inf -nr}.
  
-@item -nr
-@itemx --dont-remove-listing
-Don't remove the temporary @file{.listing} files generated by @sc{ftp}
-retrievals.  Normally, these files contain the raw directory listings
-received from @sc{ftp} servers.  Not removing them can be useful to
-access the full remote file list when running a mirror, or for debugging
-purposes.
-
  @cindex page requisites
  @cindex required images, downloading
  @item -p
  @itemx --page-requisites
-This option causes wget to download all the files that are necessary to
+This option causes Wget to download all the files that are necessary to
  properly display a given HTML page.  This includes such things as
  inlined images, sounds, and referenced stylesheets.
  
  Ordinarily, when downloading a single HTML page, any requisite documents
  that may be needed to display it properly are not downloaded.  Using
-@samp{-r} together with @samp{-l} can help, but since wget does not
+@samp{-r} together with @samp{-l} can help, but since Wget does not
  ordinarily distinguish between external and inlined documents, one is
-generally left with "leaf documents" that are missing their requisites.
+generally left with ``leaf documents'' that are missing their
+requisites.
  
  For instance, say document @file{1.html} contains an @code{<IMG>} tag
  referencing @file{1.gif} and an @code{<A>} tag pointing to external
-document @file{2.html}.  Say that @file{2.html} is the same but that its
+document @file{2.html}.  Say that @file{2.html} is similar but that its
  image is @file{2.gif} and it links to @file{3.html}.  Say this
  continues up to some arbitrarily high number.
  
@@ -978,7 +1162,7 @@ wget -r -l 2 http://@var{site}/1.html
  
  then @file{1.html}, @file{1.gif}, @file{2.html}, @file{2.gif}, and
  @file{3.html} will be downloaded.  As you can see, @file{3.html} is
-without its requisite @file{3.gif} because wget is simply counting the
+without its requisite @file{3.gif} because Wget is simply counting the
  number of hops (up to 2) away from @file{1.html} in order to determine
  where to stop the recursion.  However, with this command:
  
@@ -1001,16 +1185,17 @@ wget -r -l 0 -p http://@var{site}/1.html
  @end example
  
  would download just @file{1.html} and @file{1.gif}, but unfortunately
-this is not the case, because @samp{-l 0} is equivalent to @samp{-l inf}
--- that is, infinite recursion.  To download a single HTML page (or a
-handful of them, all specified on the commandline or in a @samp{-i} @sc{url}
-input file) and its requisites, simply leave off @samp{-p} and @samp{-l}:
+this is not the case, because @samp{-l 0} is equivalent to
+@samp{-l inf}---that is, infinite recursion.  To download a single HTML
+page (or a handful of them, all specified on the commandline or in a
+@samp{-i} @sc{url} input file) and its (or their) requisites, simply leave off
+@samp{-r} and @samp{-l}:
  
  @example
  wget -p http://@var{site}/1.html
  @end example
  
-Note that wget will behave as if @samp{-r} had been specified, but only
+Note that Wget will behave as if @samp{-r} had been specified, but only
  that single page and its requisites will be downloaded.  Links from that
  page to external documents will not be followed.  Actually, to download
  a single page and all its requisites (even if they exist on separate
@@ -1021,7 +1206,18 @@ likes to use a few options in addition to @samp{-p}:
  wget -E -H -k -K -nh -p http://@var{site}/@var{document}
  @end example
  
-To finish off this topic, it's worth knowing that wget's idea of an
+In one case you'll need to add a couple more options.  If @var{document}
+is a @code{<FRAMESET>} page, the ``one more hop'' that @samp{-p} gives you
+won't be enough---you'll get the @code{<FRAME>} pages that are
+referenced, but you won't get @emph{their} requisites.  Therefore, in
+this case you'll need to add @samp{-r -l1} to the commandline.  The
+@samp{-r -l1} will recurse from the @code{<FRAMESET>} page to the
+@code{<FRAME>} pages, and the @samp{-p} will get their requisites.  If
+you're already using a recursion level of 1 or more, you'll need to up
+it by one.  In the future, @samp{-p} may be made smarter so that it'll
+do ``two more hops'' in the case of a @code{<FRAMESET>} page.
+
+To finish off this topic, it's worth knowing that Wget's idea of an
  external document link is any URL specified in an @code{<A>} tag, an
  @code{<AREA>} tag, or a @code{<LINK>} tag other than @code{<LINK
  REL="stylesheet">}.
@@ -1075,7 +1271,7 @@ wget -Ga,area -H -k -K -nh -r http://@var{site}/@var{document}
  
  However, the author of this option came across a page with tags like
  @code{<LINK REL="home" HREF="/">} and came to the realization that
-@samp{-G} was not enough.  One can't just tell wget to ignore
+@samp{-G} was not enough.  One can't just tell Wget to ignore
  @code{<LINK>}, because then stylesheets will not be downloaded.  Now the
  best bet for downloading a single page and its requisites is the
  dedicated @samp{--page-requisites} option.
@@ -1116,6 +1312,8 @@ This is a useful option, since it guarantees that only the files
  @xref{Directory-Based Limits}, for more details.
  @end table
  
+@c man end
+
  @node Recursive Retrieval, Following Links, Invoking, Top
  @chapter Recursive Retrieval
  @cindex recursion
@@ -1492,8 +1690,8 @@ recently than the local file.
  @end enumerate
  
  To implement this, the program needs to be aware of the time of last
-modification of both remote and local files.  Such information are
-called the @dfn{time-stamps}.
+modification of both local and remote files.  We call this information the
+@dfn{time-stamp} of a file.
  
  The time-stamping in GNU Wget is turned on using @samp{--timestamping}
  (@samp{-N}) option, or through @code{timestamping = on} directive in
@@ -1526,7 +1724,7 @@ wget -S http://www.gnu.ai.mit.edu/
  A simple @code{ls -l} shows that the time stamp on the local file equals
  the state of the @code{Last-Modified} header, as returned by the server.
  As you can see, the time-stamping info is preserved locally, even
-without @samp{-N}.
+without @samp{-N} (at least for @sc{http}).
  
  Several days later, you would like Wget to check if the remote file has
  changed, and download it if it has.
@@ -1536,31 +1734,37 @@ wget -N http://www.gnu.ai.mit.edu/
  @end example
  
  Wget will ask the server for the last-modified date.  If the local file
-is newer, the remote file will not be re-fetched.  However, if the remote
-file is more recent, Wget will proceed fetching it normally.
+has the same timestamp as the server, or a newer one, the remote file
+will not be re-fetched.  However, if the remote file is more recent,
+Wget will proceed to fetch it.
  
  The same goes for @sc{ftp}.  For example:
  
  @example
-wget ftp://ftp.ifi.uio.no/pub/emacs/gnus/*
+wget "ftp://ftp.ifi.uio.no/pub/emacs/gnus/*"
  @end example
  
-@code{ls} will show that the timestamps are set according to the state
-on the remote server.  Reissuing the command with @samp{-N} will make
-Wget re-fetch @emph{only} the files that have been modified.
+(The quotes around that URL are to prevent the shell from trying to
+interpret the @samp{*}.)
  
-In both @sc{http} and @sc{ftp} retrieval Wget will time-stamp the local
-file correctly (with or without @samp{-N}) if it gets the stamps,
-i.e. gets the directory listing for @sc{ftp} or the @code{Last-Modified}
-header for @sc{http}.
+After download, a local directory listing will show that the timestamps
+match those on the remote server.  Reissuing the command with @samp{-N}
+will make Wget re-fetch @emph{only} the files that have been modified
+since the last download.
  
-If you wished to mirror the GNU archive every week, you would use the
-following command every week:
+If you wished to mirror the GNU archive every week, you would use a
+command like the following, weekly:
  
  @example
-wget --timestamping -r ftp://prep.ai.mit.edu/pub/gnu/
+wget --timestamping -r ftp://ftp.gnu.org/pub/gnu/
  @end example
  
+Note that time-stamping will only work for files for which the server
+gives a timestamp.  For @sc{http}, this depends on getting a
+@code{Last-Modified} header.  For @sc{ftp}, this depends on getting a
+directory listing with dates in a format that Wget can parse
+(@pxref{FTP Time-Stamping Internals}).
+
  @node HTTP Time-Stamping Internals, FTP Time-Stamping Internals, Time-Stamping Usage, Time-Stamping
  @section HTTP Time-Stamping Internals
  @cindex http time-stamping
@@ -1598,13 +1802,17 @@ Arguably, @sc{http} time-stamping should be implemented using the
  @cindex ftp time-stamping
  
  In theory, @sc{ftp} time-stamping works much the same as @sc{http}, only
-@sc{ftp} has no headers---time-stamps must be received from the
-directory listings.
-
-For each directory files must be retrieved from, Wget will use the
-@code{LIST} command to get the listing.  It will try to analyze the
-listing, assuming that it is a Unix @code{ls -l} listing, and extract
-the time-stamps.  The rest is exactly the same as for @sc{http}.
+@sc{ftp} has no headers---time-stamps must be ferreted out of directory
+listings.
+
+If an @sc{ftp} download is recursive or uses globbing, Wget will use the
+@sc{ftp} @code{LIST} command to get a file listing for the directory
+containing the desired file(s).  It will try to analyze the listing,
+treating it like Unix @code{ls -l} output, extracting the time-stamps.
+The rest is exactly the same as for @sc{http}.  Note that when
+retrieving individual files from an @sc{ftp} server without using
+globbing or recursion, listing files will not be downloaded (and thus
+files will not be time-stamped) unless @samp{-N} is specified.
  
  Assumption that every directory listing is a Unix-style listing may
  sound extremely constraining, but in practice it is not, as many
@@ -1701,10 +1909,10 @@ reject =
  The complete set of commands is listed below.  Legal values are listed
  after the @samp{=}.  Simple Boolean values can be set or unset using
  @samp{on} and @samp{off} or @samp{1} and @samp{0}.  A fancier kind of
-Boolean allowed in some cases is the "lockable" Boolean, which may be
-set to @samp{on}, @samp{off}, @samp{always}, or @samp{never}.  If an
+Boolean allowed in some cases is the @dfn{lockable Boolean}, which may
+be set to @samp{on}, @samp{off}, @samp{always}, or @samp{never}.  If an
  option is set to @samp{always} or @samp{never}, that value will be
-locked in for the duration of the wget invocation -- commandline options
+locked in for the duration of the Wget invocation---commandline options
  will not override.
  
  Some commands take pseudo-arbitrary values.  @var{address} values can be
@@ -1723,24 +1931,24 @@ Same as @samp{-A}/@samp{-R} (@pxref{Types of Files}).
  Enable/disable host-prefixed file names.  @samp{-nH} disables it.
  
  @item continue = on/off
-Enable/disable continuation of the retrieval -- the same as @samp{-c}
-(which enables it).
+If set to on, force continuation of preexistent partially retrieved
+files.  See @samp{-c} before setting it.
  
  @item background = on/off
-Enable/disable going to background -- the same as @samp{-b} (which enables
-it).
+Enable/disable going to background---the same as @samp{-b} (which
+enables it).
  
  @item backup_converted = on/off
-Enable/disable saving pre-converted files with the suffix @samp{.orig}
--- the same as @samp{-K} (which enables it).
+Enable/disable saving pre-converted files with the suffix
+@samp{.orig}---the same as @samp{-K} (which enables it).
  
  @c @item backups = @var{number}
  @c #### Document me!
  @c
  @item base = @var{string}
  Consider relative @sc{url}s in @sc{url} input files forced to be
-interpreted as @sc{html} as being relative to @var{string} -- the same
-as @samp{-B}.
+interpreted as @sc{html} as being relative to @var{string}---the same as
+@samp{-B}.
  
  @item bind_address = @var{address}
  Bind to @var{address}, like the @samp{--bind-address} option.
@@ -1751,6 +1959,15 @@ When set to off, disallow server-caching.  See the @samp{-C} option.
  @item convert links = on/off
  Convert non-relative links locally.  The same as @samp{-k}.
  
+@item cookies = on/off
+When set to off, disallow cookies.  See the @samp{--cookies} option.
+
+@item load_cookies = @var{file}
+Load cookies from @var{file}.  See @samp{--load-cookies}.
+
+@item save_cookies = @var{file}
+Save cookies to @var{file}.  See @samp{--save-cookies}.
+
  @item cut_dirs = @var{n}
  Ignore @var{n} remote directory components.
  
@@ -1758,13 +1975,13 @@ Ignore @var{n} remote directory components.
  Debug mode, same as @samp{-d}.
  
  @item delete_after = on/off
-Delete after download -- the same as @samp{--delete-after}.
+Delete after download---the same as @samp{--delete-after}.
  
  @item dir_prefix = @var{string}
-Top of directory tree -- the same as @samp{-P}.
+Top of directory tree---the same as @samp{-P}.
  
  @item dirstruct = on/off
-Turning dirstruct on or off -- the same as @samp{-x} or @samp{-nd},
+Turning dirstruct on or off---the same as @samp{-x} or @samp{-nd},
  respectively.
  
  @item domains = @var{string}
@@ -1785,33 +2002,31 @@ the retrieval (50 by default).
  @item dot_spacing = @var{n}
  Specify the number of dots in a single cluster (10 by default).
  
-@item dot_style = @var{string}
-Specify the dot retrieval @dfn{style}, as with @samp{--dot-style}.
-
  @item exclude_directories = @var{string}
  Specify a comma-separated list of directories you wish to exclude from
-download -- the same as @samp{-X} (@pxref{Directory-Based Limits}).
+download---the same as @samp{-X} (@pxref{Directory-Based Limits}).
  
  @item exclude_domains = @var{string}
  Same as @samp{--exclude-domains} (@pxref{Domain Acceptance}).
  
  @item follow_ftp = on/off
-Follow @sc{ftp} links from @sc{html} documents -- the same as @samp{-f}.
+Follow @sc{ftp} links from @sc{html} documents---the same as
+@samp{--follow-ftp}.
  
  @item follow_tags = @var{string}
  Only follow certain HTML tags when doing a recursive retrieval, just like
-@samp{--follow-tags}. 
+@samp{--follow-tags}.
  
  @item force_html = on/off
  If set to on, force the input filename to be regarded as an @sc{html}
-document -- the same as @samp{-F}.
+document---the same as @samp{-F}.
  
  @item ftp_proxy = @var{string}
  Use @var{string} as @sc{ftp} proxy, instead of the one specified in
  environment.
  
  @item glob = on/off
-Turn globbing on/off -- the same as @samp{-g}.
+Turn globbing on/off---the same as @samp{-g}.
  
  @item header = @var{string}
  Define an additional header, like @samp{--header}.
@@ -1836,23 +2051,23 @@ When set to on, ignore @code{Content-Length} header; the same as
  
  @item ignore_tags = @var{string}
  Ignore certain HTML tags when doing a recursive retrieval, just like
-@samp{-G} / @samp{--ignore-tags}. 
+@samp{-G} / @samp{--ignore-tags}.
  
  @item include_directories = @var{string}
  Specify a comma-separated list of directories you wish to follow when
-downloading -- the same as @samp{-I}.
+downloading---the same as @samp{-I}.
  
  @item input = @var{string}
  Read the @sc{url}s from @var{string}, like @samp{-i}.
  
  @item kill_longer = on/off
-Consider data longer than specified in content-length header
-as invalid (and retry getting it). The default behaviour is to save
-as much data as there is, provided there is more than or equal
-to the value in @code{Content-Length}.
+Consider data longer than specified in content-length header as invalid
+(and retry getting it).  The default behaviour is to save as much data
+as there is, provided there is more than or equal to the value in
+@code{Content-Length}.
  
  @item logfile = @var{string}
-Set logfile -- the same as @samp{-o}.
+Set logfile---the same as @samp{-o}.
  
  @item login = @var{string}
  Your user name on the remote machine, for @sc{ftp}.  Defaults to
@@ -1876,14 +2091,14 @@ Use @var{string} as the comma-separated list of domains to avoid in
  proxy loading, instead of the one specified in environment.
  
  @item output_document = @var{string}
-Set the output filename -- the same as @samp{-O}.
+Set the output filename---the same as @samp{-O}.
  
  @item page_requisites = on/off
  Download all ancillary documents necessary for a single HTML page to
-display properly -- the same as @samp{-p}.
+display properly---the same as @samp{-p}.
  
  @item passive_ftp = on/off/always/never
-Set passive @sc{ftp} -- the same as @samp{--passive-ftp}.  Some scripts
+Set passive @sc{ftp}---the same as @samp{--passive-ftp}.  Some scripts
  and @samp{.pm} (Perl module) files download files using @samp{wget
  --passive-ftp}.  If your firewall does not allow this, you can set
  @samp{passive_ftp = never} to override the commandline.
@@ -1892,6 +2107,10 @@ and @samp{.pm} (Perl module) files download files using @samp{wget
  Set your @sc{ftp} password to @var{password}.  Without this setting, the
  password defaults to @samp{username@@hostname.domainname}.
  
+@item progress = @var{string}
+Set the type of the progress indicator.  Legal types are ``dot'' and
+``bar''.
+
  @item proxy_user = @var{string}
  Set proxy authentication user name to @var{string}, like @samp{--proxy-user}.
  
@@ -1901,27 +2120,28 @@ Set proxy authentication password to @var{string}, like @samp{--proxy-passwd}.
  @item referer = @var{string}
  Set HTTP @samp{Referer:} header just like @samp{--referer}.  (Note it
  was the folks who wrote the @sc{http} spec who got the spelling of
-"referrer" wrong.)
+``referrer'' wrong.)
  
  @item quiet = on/off
-Quiet mode -- the same as @samp{-q}.
+Quiet mode---the same as @samp{-q}.
  
  @item quota = @var{quota}
  Specify the download quota, which is useful to put in the global
-@file{wgetrc}. When download quota is specified, Wget will stop retrieving
-after the download sum has become greater than quota.  The quota can be
-specified in bytes (default), kbytes @samp{k} appended) or mbytes
-(@samp{m} appended).  Thus @samp{quota = 5m} will set the quota to 5
-mbytes. Note that the user's startup file overrides system settings.
+@file{wgetrc}.  When download quota is specified, Wget will stop
+retrieving after the download sum has become greater than quota.  The
+quota can be specified in bytes (default), kbytes (@samp{k} appended) or
+mbytes (@samp{m} appended).  Thus @samp{quota = 5m} will set the quota
+to 5 mbytes.  Note that the user's startup file overrides system
+settings.
  
  @item reclevel = @var{n}
-Recursion level -- the same as @samp{-l}.
+Recursion level---the same as @samp{-l}.
  
  @item recursive = on/off
-Recursive on/off -- the same as @samp{-r}.
+Recursive on/off---the same as @samp{-r}.
  
  @item relative_only = on/off
-Follow only relative links -- the same as @samp{-L} (@pxref{Relative
+Follow only relative links---the same as @samp{-L} (@pxref{Relative
  Links}).
  
  @item remove_listing = on/off
@@ -1938,7 +2158,7 @@ what you are doing before changing the default (which is @samp{on}).
  
  @item server_response = on/off
  Choose whether or not to print the @sc{http} and @sc{ftp} server
-responses -- the same as @samp{-S}.
+responses---the same as @samp{-S}.
  
  @item simple_host_check = on/off
  Same as @samp{-nh} (@pxref{Host Checking}).
@@ -1947,27 +2167,31 @@ Same as @samp{-nh} (@pxref{Host Checking}).
  Same as @samp{-H}.
  
  @item timeout = @var{n}
-Set timeout value -- the same as @samp{-T}.
+Set timeout value---the same as @samp{-T}.
  
  @item timestamping = on/off
-Turn timestamping on/off. The same as @samp{-N} (@pxref{Time-Stamping}).
+Turn timestamping on/off.  The same as @samp{-N} (@pxref{Time-Stamping}).
  
  @item tries = @var{n}
-Set number of retries per @sc{url} -- the same as @samp{-t}.
+Set number of retries per @sc{url}---the same as @samp{-t}.
  
  @item use_proxy = on/off
-Turn proxy support on/off. The same as @samp{-Y}.
+Turn proxy support on/off.  The same as @samp{-Y}.
  
  @item verbose = on/off
-Turn verbose on/off -- the same as @samp{-v}/@samp{-nv}.
+Turn verbose on/off---the same as @samp{-v}/@samp{-nv}.
  
  @item wait = @var{n}
-Wait @var{n} seconds between retrievals -- the same as @samp{-w}.
+Wait @var{n} seconds between retrievals---the same as @samp{-w}.
  
  @item waitretry = @var{n}
-Wait up to @var{n} seconds between retries of failed retrievals only --
-the same as @samp{--waitretry}.  Note that this is turned on by default
-in the global @file{wgetrc}.
+Wait up to @var{n} seconds between retries of failed retrievals
+only---the same as @samp{--waitretry}.  Note that this is turned on by
+default in the global @file{wgetrc}.
+
+@item randomwait = on/off
+Turn random between-request wait times on or off.  The same as
+@samp{--random-wait}.
  @end table
  
  @node Sample Wgetrc,  , Wgetrc Commands, Startup File
@@ -2297,7 +2521,7 @@ This variable should contain the @sc{url} of the proxy for @sc{http}
  connections.
  
  @item ftp_proxy
-This variable should contain the @sc{url} of the proxy for @sc{http}
+This variable should contain the @sc{url} of the proxy for @sc{ftp}
  connections.  It is quite common that @sc{http_proxy} and @sc{ftp_proxy}
  are set to the same @sc{url}.
  
@@ -2334,8 +2558,9 @@ authentication schemes exist.  For proxy authorization only the
  
  You may specify your username and password either through the proxy
  @sc{url} or through the command-line options.  Assuming that the
-company's proxy is located at @samp{proxy.srce.hr} at port 8001, a proxy
-@sc{url} location containing authorization data might look like this:
+company's proxy is located at @samp{proxy.company.com} at port 8001, a
+proxy @sc{url} location containing authorization data might look like
+this:
  
  @example
  http://hniksic:mypassword@@proxy.company.com:8001/
@@ -2360,28 +2585,29 @@ Wget @value{VERSION} can be found at
  @cindex mailing list
  @cindex list
  
-Wget has its own mailing list at @email{wget@@sunsite.auc.dk}, thanks
+Wget has its own mailing list at @email{wget@@sunsite.dk}, thanks
  to Karsten Thygesen.  The mailing list is for discussion of Wget
  features and web, reporting Wget bugs (those that you think may be of
  interest to the public) and mailing announcements.  You are welcome to
  subscribe.  The more people on the list, the better!
  
-To subscribe, send mail to @email{wget-subscribe@@sunsite.auc.dk}.
+To subscribe, send mail to @email{wget-subscribe@@sunsite.dk} with
  the magic word @samp{subscribe} in the subject line.  Unsubscribe by
-mailing to @email{wget-unsubscribe@@sunsite.auc.dk}.
+mailing to @email{wget-unsubscribe@@sunsite.dk}.
  
  The mailing list is archived at @url{http://fly.srk.fer.hr/archive/wget}.
-
+An alternative archive is available at
+@url{http://www.mail-archive.com/wget%40sunsite.auc.dk/}.
+
  @node Reporting Bugs, Portability, Mailing List, Various
  @section Reporting Bugs
  @cindex bugs
  @cindex reporting bugs
  @cindex bug reports
  
+@c man begin BUGS
  You are welcome to send bug reports about GNU Wget to
-@email{bug-wget@@gnu.org}.  The bugs that you think are of the
-interest to the public (i.e. more people should be informed about them)
-can be Cc-ed to the mailing list at @email{wget@@sunsite.auc.dk}.
+@email{bug-wget@@gnu.org}.
  
  Before actually submitting a bug report, please try to follow a few
  simple guidelines.
@@ -2419,6 +2645,7 @@ wget` core} and type @code{where} to get the backtrace.
  @item
  Find where the bug is, fix it and send me the patches. :-)
  @end enumerate
+@c man end
  
  @node Portability, Signals, Reporting Bugs, Various
  @section Portability
@@ -2446,7 +2673,7 @@ features available on Unix, but it should work as a substitute for
  people stuck with Windows.  Note that the Windows port is
  @strong{neither tested nor maintained} by me---all questions and
  problems should be reported to Wget mailing list at
-@email{wget@@sunsite.auc.dk} where the maintainers will look at them.
+@email{wget@@sunsite.dk} where the maintainers will look at them.
  
  @node Signals,  , Portability, Various
  @section Signals
@@ -2464,9 +2691,8 @@ $ wget http://www.ifi.uio.no/~larsi/gnus.tar.gz &
  $ kill -HUP %%     # Redirect the output to wget-log
  @end example
  
-Other than that, Wget will not try to interfere with signals in any
-way. @kbd{C-c}, @code{kill -TERM} and @code{kill -KILL} should kill it
-alike.
+Other than that, Wget will not try to interfere with signals in any way.
+@kbd{C-c}, @code{kill -TERM} and @code{kill -KILL} should kill it alike.
  
  @node Appendices, Copying, Various, Top
  @chapter Appendices
@@ -2485,10 +2711,26 @@ This chapter contains some references I consider useful.
  @cindex robots.txt
  @cindex server maintenance
  
-Since Wget is able to traverse the web, it counts as one of the Web
-@dfn{robots}.  Thus Wget understands @dfn{Robots Exclusion Standard}
-(@sc{res})---contents of @file{/robots.txt}, used by server
-administrators to shield parts of their systems from wanderings of Wget.
+It is extremely easy to make Wget wander aimlessly around a web site,
+sucking up all the available data in the process.  @samp{wget -r @var{site}},
+and you're set.  Great?  Not for the server admin.
+
+While Wget is retrieving static pages, there's not much of a problem.
+But for Wget, there is no real difference between the smallest static
+page and the hardest, most demanding CGI or dynamic page.  For instance,
+a site I know has a section handled by an, uh, bitchin' CGI script that
+converts all the Info files to HTML.  The script can and does bring the
+machine to its knees without providing anything useful to the
+downloader.
+
+For such and similar cases various robot exclusion schemes have been
+devised as a means for the server administrators and document authors to
+protect chosen portions of their sites from the wandering of robots.
+
+The more popular mechanism is the @dfn{Robots Exclusion Standard}
+written by Martijn Koster et al. in 1994.  It is specified by placing a
+file named @file{/robots.txt} in the server root, which the robots are
+supposed to download and parse.  Wget supports this specification.
  
  Norobots support is turned on only when retrieving recursively, and
  @emph{never} for the first page.  Thus, you may issue:
@@ -2500,8 +2742,7 @@ wget -r http://fly.srk.fer.hr/
  First the index of fly.srk.fer.hr will be downloaded.  If Wget finds
  anything worth downloading on the same host, only @emph{then} will it
  load the robots, and decide whether or not to load the links after all.
-@file{/robots.txt} is loaded only once per host.  Wget does not support
-the robots @code{META} tag.
+@file{/robots.txt} is loaded only once per host.
  
  Note that the exlusion standard discussed here has undergone some
  revisions.  However, but Wget supports only the first version of
@@ -2517,6 +2758,20 @@ but we plan to add them.
  
  This manual no longer includes the text of the old standard.
  
+The second, lesser-known mechanism enables the author of an individual
+document to specify whether they want the links from the file to be
+followed by a robot.  This is achieved using the @code{META} tag, like
+this:
+
+@example
+<meta name="robots" content="nofollow">
+@end example
+
+This is explained in some detail at
+@url{http://info.webcrawler.com/mak/projects/robots/meta-user.html}.
+Wget supports this method of robot exclusion in addition to the usual
+@file{/robots.txt} exclusion.
+
  @node Security Considerations, Contributors, Robots, Appendices
  @section Security Considerations
  @cindex security
@@ -2633,6 +2888,7 @@ Martin Baehr,
  Dieter Baron,
  Roger Beeman and the Gurus at Cisco,
  Dan Berger,
+Paul Bludov,
  Mark Boyns,
  John Burden,
  Wanderlei Cavassin,
@@ -2663,6 +2919,7 @@ Aleksandar Erkalovi@'{c},
  Aleksandar Erkalovic,
  @end ifinfo
  Andy Eskilsson,
+Christian Fraenkel,
  Masashi Fujita,
  Howard Gayle,
  Marcel Gerrits,
@@ -2675,6 +2932,7 @@ HIROSE Masaaki,
  Gregor Hoffleit,
  Erik Magnus Hulthen,
  Richard Huveneers,
+Jonas Jensen,
  Simon Josefsson,
  @iftex
  Mario Juri@'{c},
@@ -2682,6 +2940,12 @@ Mario Juri@'{c},
  @ifinfo
  Mario Juric,
  @end ifinfo
+@iftex
+Hack Kampbj@o rn,
+@end iftex
+@ifinfo
+Hack Kampbjorn,
+@end ifinfo
  Const Kaplinsky,
  @iftex
  Goran Kezunovi@'{c},
@@ -2690,6 +2954,7 @@ Goran Kezunovi@'{c},
  Goran Kezunovic,
  @end ifinfo
  Robert Kleine,
+KOJIMA Haime,
  Fila Kolodny,
  Alexander Kourakos,
  Martin Kraemer,
@@ -2703,17 +2968,35 @@ Simos KSenitellis,
  @end ifinfo
  Hrvoje Lacko,
  Daniel S. Lewart,
+@iftex
+Nicol@'{a}s Lichtmeier,
+@end iftex
+@ifinfo
+Nicolas Lichtmeier,
+@end ifinfo
  Dave Love,
  Alexander V. Lukyanov,
  Jordan Mendelson,
  Lin Zhe Min,
+Tim Mooney,
  Simon Munton,
  Charlie Negyesi,
  R. K. Owen,
  Andrew Pollock,
  Steve Pothier,
+@iftex
+Jan P@v{r}ikryl,
+@end iftex
+@ifinfo
  Jan Prikryl,
+@end ifinfo
  Marin Purgar,
+@iftex
+Csaba R@'{a}duly,
+@end iftex
+@ifinfo
+Csaba Raduly,
+@end ifinfo
  Keith Refson,
  Tyler Riddle,
  Tobias Ringstrom,
@@ -2732,8 +3015,10 @@ Toomas Soome,
  Tage Stabell-Kulo,
  Sven Sternberger,
  Markus Strasser,
+John Summerfield,
  Szakacsits Szabolcs,
  Mike Thomas,
+Philipp Thomas,
  Russell Vincent,
  Charles G Waldman,
  Douglas E. Wegscheid,
@@ -2754,33 +3039,43 @@ subscribers of the Wget mailing list.
  @cindex copying
  @cindex GPL
  @cindex GFDL
-
-Wget is @dfn{free software}, where ``free'' refers to liberty, not
-price.  The exact legal distribution terms follow below, but in short,
-it means that you have the right (freedom) to run and change and copy
-Wget, and even---if you want---charge money for any of those things.
-The sole restriction is that you have to grant your recipients the same
-rights.
-
-This method of licensing software is also known as @dfn{open-source},
-because it requires that the recipients always receive a program's
-source code along with the program.
-
-More specifically:
+@cindex free software
+
+GNU Wget is licensed under the GNU GPL, which makes it @dfn{free
+software}.
+
+Please note that ``free'' in ``free software'' refers to liberty, not
+price.  As some GNU project advocates like to point out, think of ``free
+speech'' rather than ``free beer''.  The exact and legally binding
+distribution terms are spelled out below; in short, you have the right
+(freedom) to run and change Wget and distribute it to other people, and
+even---if you want---charge money for doing either.  The important
+restriction is that you have to grant your recipients the same rights
+and impose the same restrictions.
+
+This method of licensing software is also known as @dfn{open source}
+because, among other things, it makes sure that all recipients will
+receive the source code along with the program, and be able to improve
+it.  The GNU project prefers the term ``free software'' for reasons
+outlined at
+@url{http://www.gnu.org/philosophy/free-software-for-freedom.html}.
+
+The exact license terms are defined by this paragraph and the GNU
+General Public License it refers to:
  
  @quotation
-This program is free software; you can redistribute it and/or modify it
+GNU Wget is free software; you can redistribute it and/or modify it
  under the terms of the GNU General Public License as published by the
  Free Software Foundation; either version 2 of the License, or (at your
  option) any later version.
  
-This program is distributed in the hope that it will be useful, but
-WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-General Public License for more details.
+GNU Wget is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
  
-You should have received a copy of the GNU General Public License
-along with this program; if not, write to the Free Software
+A copy of the GNU General Public License is included as part of this
+manual; if you did not receive it, write to the Free Software
  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  @end quotation
  
@@ -2789,10 +3084,11 @@ In addition to this, this manual is free in the same sense:
  @quotation
  Permission is granted to copy, distribute and/or modify this document
  under the terms of the GNU Free Documentation License, Version 1.1 or
-any later version published by the Free Software Foundation; with no
-Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
-Texts.  A copy of the license is included in the section entitled ``GNU
-Free Documentation License''.
+any later version published by the Free Software Foundation; with the
+Invariant Sections being ``GNU General Public License'' and ``GNU Free
+Documentation License'', with no Front-Cover Texts, and with no
+Back-Cover Texts.  A copy of the license is included in the section
+entitled ``GNU Free Documentation License''.
  @end quotation
  
  @c #### Maybe we should wrap these licenses in ifinfo?  Stallman says