@c %**start of header
@setfilename wget.info
@include version.texi
-@set UPDATED Jan 2008
+@set UPDATED Mar 2008
@settitle GNU Wget @value{VERSION} Manual
@c Disable the monstrous rectangles beside overfull hbox-es.
@finalout
@code{Content-Disposition} headers to describe what the name of a
downloaded file should be.
+@cindex authentication
+@item --auth-no-challenge
+
+If this option is given, Wget will send Basic HTTP authentication
+information (plaintext username and password) for all requests, just
+like Wget 1.10.2 and prior did by default.
+
+Use of this option is not recommended; it is intended only to support
+a few obscure servers that never send HTTP authentication challenges
+but accept unsolicited authentication information, for example in
+addition to form-based authentication.
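+
+As a rough sketch of such a case, the following invocation sends the
+credentials with the very first request; the host, path, user name,
+and password shown are placeholders only:
+
+@example
+wget --auth-no-challenge --http-user=alice --http-password=secret \
+     http://example.com/members/report.pdf
+@end example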
+
@end table
@node HTTPS (SSL/TLS) Options
expansion by the shell.
@end table
+@noindent
The @samp{-A} and @samp{-R} options may be combined to achieve even
better fine-tuning of which files to retrieve. E.g. @samp{wget -A
"*zelazny*" -R .ps} will download all the files having @samp{zelazny} as
a part of their name, but @emph{not} the PostScript files.
Note that these two options do not affect the downloading of @sc{html}
-files; Wget must load all the @sc{html}s to know where to go at
-all---recursive retrieval would make no sense otherwise.
+files (as determined by a @samp{.htm} or @samp{.html} filename
+suffix). This behavior may not be desirable for all users, and may be
+changed in future versions of Wget.
+
+Note, too, that query strings (strings at the end of a URL beginning
+with a question mark, @samp{?}) are not included as part of the
+filename for accept/reject rules, even though these will actually
+contribute to the name chosen for the local file. It is expected that
+a future version of Wget will provide an option to allow matching
+against query strings.
+
+Finally, it's worth noting that the accept/reject lists are matched
+@emph{twice} against downloaded files: once against the URL's filename
+portion, to determine if the file should be downloaded in the first
+place; then, after it has been accepted and successfully downloaded,
+the local file's name is also checked against the accept/reject lists
+to see if it should be removed. The rationale was that, since
+@samp{.htm} and @samp{.html} files are always downloaded regardless of
+accept/reject rules, they should be removed @emph{after} being
+downloaded and scanned for links, if they did match the accept/reject
+lists. However, this can lead to unexpected results, since the local
+filenames can differ from the original URL filenames in the following
+ways, all of which can change whether an accept/reject rule matches:
+
+@itemize @bullet
+@item
+If the local file already exists and @samp{--no-directories} was
+specified, a numeric suffix will be appended to the original name.
+@item
+If @samp{--html-extension} was specified, the local filename will have
+@samp{.html} appended to it. If Wget is invoked with @samp{-E -A.php},
+a filename such as @samp{index.php} will be accepted, but upon
+download will be named @samp{index.php.html}, which no longer matches,
+and so the file will be deleted.
+@item
+Query strings do not contribute to URL matching, but are included in
+local filenames, and so @emph{do} contribute to filename matching.
+@end itemize
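+
+@noindent
+The @samp{--html-extension} case might play out as in the following
+sketch, where @samp{example.com} is only a placeholder host:
+
+@example
+wget -r -E -A.php http://example.com/
+@end example
+
+@noindent
+Here a URL ending in @samp{index.php} passes the first check, is saved
+as @samp{index.php.html}, and is then removed by the second check.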
+
+@noindent
+This behavior, too, is considered less-than-desirable, and may change
+in a future version of Wget.
@node Directory-Based Limits
@section Directory-Based Limits
Essentially, @samp{--no-parent} is similar to
@samp{-I/~luzer/my-archive}, only it handles redirections in a more
intelligent fashion.
+
+@strong{Note} that, for HTTP (and HTTPS), the trailing slash is very
+important to @samp{--no-parent}. HTTP has no concept of a ``directory''---Wget
+relies on you to indicate what's a directory and what isn't. In
+@samp{http://foo/bar/}, Wget will consider @samp{bar} to be a
+directory, while in @samp{http://foo/bar} (no trailing slash),
+@samp{bar} will be considered a filename (so @samp{--no-parent} would be
+meaningless, as its parent is @samp{/}).
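+
+As an illustration using the same placeholder URLs, the two
+invocations below differ only in the trailing slash:
+
+@example
+wget -r --no-parent http://foo/bar/
+wget -r --no-parent http://foo/bar
+@end example
+
+@noindent
+The first treats @samp{bar} as a directory and stays within it; the
+second treats @samp{bar} as a file whose parent is @samp{/}, so
+@samp{--no-parent} restricts nothing.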
@end table
@node Relative Links
Daniel Bodea,
Mark Boyns,
John Burden,
+Julien Buty,
Wanderlei Cavassin,
Gilles Cedoc,
Tim Charron,
Ahmon Dancy,
Andrew Davison,
Bertrand Demiddelaer,
+Alexander Dergachev,
Andrew Deryabin,
Ulrich Drepper,
Marc Duponcheel,
Mauro Tortonesi,
Dave Turner,
Gisle Vanem,
+Rabin Vincent,
Russell Vincent,
@iftex
@v{Z}eljko Vrba,