@sc{html} files on your local disk, by adding @code{<base
href="@var{url}">} to @sc{html}, or using the @samp{--base} command-line
option.
+
+@cindex base for relative links in input file
+@item -B @var{URL}
+@itemx --base=@var{URL}
+When used in conjunction with @samp{-F}, prepends @var{URL} to relative
+links in the file specified by @samp{-i}.
@end table
@node Download Options, Directory Options, Logging and Input File Options, Invoking
remote file to @file{ls-lR.Z.1}. The @samp{-c} option is also
applicable for @sc{http} servers that support the @code{Range} header.
+Note that if you use @samp{-c} on a file that's already downloaded
+completely, @samp{@var{file}} will not be changed, nor will a second
+@samp{@var{file}.1} copy be created.
+
@cindex dot style
@cindex retrieval tracing style
@item --dot-style=@var{style}
@section HTTP Options
@table @samp
+@cindex .html extension
+@item -E
+@itemx --html-extension
+If a file of type @samp{text/html} is downloaded and the URL does not
+end with the regexp @samp{\.[Hh][Tt][Mm][Ll]?}, this option will cause the
+suffix @samp{.html} to be appended to the local filename. This is
+useful, for instance, when you're mirroring a remote site that
+uses @samp{.asp} pages, but you want the mirrored pages to be viewable
+on your stock Apache server. Another good use for this is when you're
+downloading the output of CGIs. A URL like
+@samp{http://site.com/article.cgi?25} will be saved as
+@file{article.cgi?25.html}.
+
+Note that filenames changed in this way will be re-downloaded every time
+you re-mirror a site, because wget can't tell that the local
+@file{@var{X}.html} file corresponds to remote URL @samp{@var{X}} (since
+it doesn't yet know that the URL produces output of type
+@samp{text/html}). To prevent this re-downloading, you must use
+@samp{-k} and @samp{-K} so that the original version of the file will be
+saved as @file{@var{X}.orig} (@xref{Recursive Retrieval Options}).
+
@cindex http user
@cindex http password
@cindex authentication
@section FTP Options
@table @samp
-@cindex retrieve symbolic links
+@cindex symbolic links, retrieving
@item --retr-symlinks
-Retrieve symbolic links on @sc{ftp} sites as if they were plain files,
-i.e. don't just create links locally.
+Usually, when a symbolic link is encountered while retrieving @sc{ftp}
+directories recursively, the linked-to file is not downloaded. Instead, a
+matching symbolic link is created on the local filesystem. The
+pointed-to file will not be downloaded unless this recursive retrieval
+would have encountered it separately and downloaded it anyway.
+
+When @samp{--retr-symlinks} is specified, however, symbolic links are
+traversed and the pointed-to files are retrieved. At this time, this
+option does not cause wget to traverse symlinks to directories and
+recurse through them, but in the future it should be enhanced to do
+this.
+
+Note that when retrieving a file (not a directory) because it was
+specified on the commandline, rather than because it was recursed to,
+this option has no effect. Symbolic links are always traversed in this
+case.
@cindex globbing, toggle
@item -g on/off
received from @sc{ftp} servers. Not removing them can be useful to
access the full remote file list when running a mirror, or for debugging
purposes.
+
+@cindex page requisites
+@cindex required images, downloading
+@item -p
+@itemx --page-requisites
+This option causes wget to download all the files that are necessary to
+properly display a given HTML page. This includes such things as
+inlined images, sounds, and referenced stylesheets.
+
+Ordinarily, when downloading a single HTML page, any requisite documents
+that may be needed to display it properly are not downloaded. Using
+@samp{-r} together with @samp{-l} can help, but since wget does not
+ordinarily distinguish between external and inlined documents, one is
+generally left with "leaf documents" that are missing their requisites.
+
+For instance, say document @file{1.html} contains an @code{<IMG>} tag
+referencing @file{1.gif} and an @code{<A>} tag pointing to external
+document @file{2.html}. Say that @file{2.html} is the same but that its
+image is @file{2.gif} and it links to @file{3.html}. Say this
+continues up to some arbitrarily high number.
+
+If one executes the command:
+
+@example
+wget -r -l 2 http://@var{site}/1.html
+@end example
+
+then @file{1.html}, @file{1.gif}, @file{2.html}, @file{2.gif}, and
+@file{3.html} will be downloaded. As you can see, @file{3.html} is
+without its requisite @file{3.gif} because wget is simply counting the
+number of hops (up to 2) away from @file{1.html} in order to determine
+where to stop the recursion. However, with this command:
+
+@example
+wget -r -l 2 -p http://@var{site}/1.html
+@end example
+
+all the above files @emph{and} @file{3.html}'s requisite @file{3.gif}
+will be downloaded. Similarly,
+
+@example
+wget -r -l 1 -p http://@var{site}/1.html
+@end example
+
+will cause @file{1.html}, @file{1.gif}, @file{2.html}, and @file{2.gif}
+to be downloaded. One might think that:
+
+@example
+wget -r -l 0 -p http://@var{site}/1.html
+@end example
+
+would download just @file{1.html} and @file{1.gif}, but unfortunately
+this is not the case, because @samp{-l 0} is equivalent to @samp{-l inf}
+-- that is, infinite recursion. To download a single HTML page (or a
+handful of them, all specified on the commandline or in a @samp{-i} @sc{url}
+input file) and its requisites, simply leave off @samp{-r} and @samp{-l}:
+
+@example
+wget -p http://@var{site}/1.html
+@end example
+
+Note that wget will behave as if @samp{-r} had been specified, but only
+that single page and its requisites will be downloaded. Links from that
+page to external documents will not be followed. Actually, to download
+a single page and all its requisites (even if they exist on separate
+websites), and make sure the lot displays properly locally, this author
+likes to use a few options in addition to @samp{-p}:
+
+@example
+wget -H -k -K -nh -p http://@var{site}/@var{document}
+@end example
+
+To finish off this topic, it's worth knowing that wget's idea of an
+external document link is any URL specified in an @code{<A>} tag, an
+@code{<AREA>} tag, or a @code{<LINK>} tag other than @code{<LINK
+REL="stylesheet">}.
@end table
@node Recursive Accept/Reject Options, , Recursive Retrieval Options, Invoking
@itemx --ignore-tags=@var{list}
This is the opposite of the @samp{--follow-tags} option. To skip
certain HTML tags when recursively looking for documents to download,
-specify them in a comma-separated @var{list}. The author of this option
-likes to use the following command to download a single HTML page and
-all files (e.g. images, sounds, and stylesheets) necessary to display it
-properly:
+specify them in a comma-separated @var{list}.
+
+In the past, the @samp{-G} option was the best bet for downloading a
+single page and its requisites, using a commandline like:
@example
wget -Ga,area -H -k -K -nh -r http://@var{site}/@var{document}
@end example
+However, the author of this option came across a page with tags like
+@code{<LINK REL="home" HREF="/">} and came to the realization that
+@samp{-G} was not enough. One can't just tell wget to ignore
+@code{<LINK>}, because then stylesheets will not be downloaded. Now the
+best bet for downloading a single page and its requisites is the
+dedicated @samp{--page-requisites} option.
+
@item -H
@itemx --span-hosts
Enable spanning across hosts when doing recursive retrieving (@xref{All
Enable/disable host-prefixed file names. @samp{-nH} disables it.
@item continue = on/off
-Enable/disable continuation of the retrieval, the same as @samp{-c}
+Enable/disable continuation of the retrieval -- the same as @samp{-c}
(which enables it).
@item background = on/off
-Enable/disable going to background, the same as @samp{-b} (which enables
+Enable/disable going to background -- the same as @samp{-b} (which enables
it).
@item backup_converted = on/off
@c @item backups = @var{number}
@c #### Document me!
+@c
@item base = @var{string}
-Set base for relative @sc{url}s, the same as @samp{-B}.
+Treat relative @sc{url}s in input files that are forced to be interpreted
+as @sc{html} as relative to @var{string} -- the same as @samp{-B}.
@item cache = on/off
When set to off, disallow server-caching. See the @samp{-C} option.
Debug mode, same as @samp{-d}.
@item delete_after = on/off
-Delete after download, the same as @samp{--delete-after}.
+Delete after download -- the same as @samp{--delete-after}.
@item dir_prefix = @var{string}
-Top of directory tree, the same as @samp{-P}.
+Top of directory tree -- the same as @samp{-P}.
@item dirstruct = on/off
-Turning dirstruct on or off, the same as @samp{-x} or @samp{-nd},
+Turning dirstruct on or off -- the same as @samp{-x} or @samp{-nd},
respectively.
@item domains = @var{string}
@item exclude_directories = @var{string}
Specify a comma-separated list of directories you wish to exclude from
-download, the same as @samp{-X} (@xref{Directory-Based Limits}).
+download -- the same as @samp{-X} (@xref{Directory-Based Limits}).
@item exclude_domains = @var{string}
Same as @samp{--exclude-domains} (@xref{Domain Acceptance}).
@item follow_ftp = on/off
-Follow @sc{ftp} links from @sc{html} documents, the same as @samp{-f}.
+Follow @sc{ftp} links from @sc{html} documents -- the same as @samp{-f}.
@item follow_tags = @var{string}
Only follow certain HTML tags when doing a recursive retrieval, just like
@item force_html = on/off
If set to on, force the input filename to be regarded as an @sc{html}
-document, the same as @samp{-F}.
+document -- the same as @samp{-F}.
@item ftp_proxy = @var{string}
Use @var{string} as @sc{ftp} proxy, instead of the one specified in
environment.
@item glob = on/off
-Turn globbing on/off, the same as @samp{-g}.
+Turn globbing on/off -- the same as @samp{-g}.
@item header = @var{string}
Define an additional header, like @samp{--header}.
+@item html_extension = on/off
+Add a @samp{.html} extension to @samp{text/html} files without it, like
+@samp{-E}.
+
@item http_passwd = @var{string}
Set @sc{http} password.
@item include_directories = @var{string}
Specify a comma-separated list of directories you wish to follow when
-downloading, the same as @samp{-I}.
+downloading -- the same as @samp{-I}.
@item input = @var{string}
Read the @sc{url}s from @var{string}, like @samp{-i}.
to the value in @code{Content-Length}.
@item logfile = @var{string}
-Set logfile, the same as @samp{-o}.
+Set logfile -- the same as @samp{-o}.
@item login = @var{string}
Your user name on the remote machine, for @sc{ftp}. Defaults to
proxy loading, instead of the one specified in environment.
@item output_document = @var{string}
-Set the output filename, the same as @samp{-O}.
+Set the output filename -- the same as @samp{-O}.
+
+@item page_requisites = on/off
+Download all ancillary documents necessary for a single HTML page to
+display properly -- the same as @samp{-p}.
@item passive_ftp = on/off
-Set passive @sc{ftp}, the same as @samp{--passive-ftp}.
+Set passive @sc{ftp} -- the same as @samp{--passive-ftp}.
@item passwd = @var{string}
Set your @sc{ftp} password to @var{password}. Without this setting, the
@samp{--proxy-passwd}.
@item quiet = on/off
-Quiet mode, the same as @samp{-q}.
+Quiet mode -- the same as @samp{-q}.
@item quota = @var{quota}
Specify the download quota, which is useful to put in the global
mbytes. Note that the user's startup file overrides system settings.
@item reclevel = @var{n}
-Recursion level, the same as @samp{-l}.
+Recursion level -- the same as @samp{-l}.
@item recursive = on/off
-Recursive on/off, the same as @samp{-r}.
+Recursive on/off -- the same as @samp{-r}.
@item relative_only = on/off
-Follow only relative links, the same as @samp{-L} (@xref{Relative
+Follow only relative links -- the same as @samp{-L} (@xref{Relative
Links}).
@item remove_listing = on/off
@item server_response = on/off
Choose whether or not to print the @sc{http} and @sc{ftp} server
-responses, the same as @samp{-S}.
+responses -- the same as @samp{-S}.
@item simple_host_check = on/off
Same as @samp{-nh} (@xref{Host Checking}).
Same as @samp{-H}.
@item timeout = @var{n}
-Set timeout value, the same as @samp{-T}.
+Set timeout value -- the same as @samp{-T}.
@item timestamping = on/off
Turn timestamping on/off. The same as @samp{-N} (@xref{Time-Stamping}).
@item tries = @var{n}
-Set number of retries per @sc{url}, the same as @samp{-t}.
+Set number of retries per @sc{url} -- the same as @samp{-t}.
@item use_proxy = on/off
Turn proxy support on/off. The same as @samp{-Y}.
@item verbose = on/off
-Turn verbose on/off, the same as @samp{-v}/@samp{-nv}.
+Turn verbose on/off -- the same as @samp{-v}/@samp{-nv}.
@item wait = @var{n}
-Wait @var{n} seconds between retrievals, the same as @samp{-w}.
+Wait @var{n} seconds between retrievals -- the same as @samp{-w}.
@item waitretry = @var{n}
Wait up to @var{n} seconds between retries of failed retrievals only --