@c %**start of header
@setfilename wget.info
@include version.texi
-@set UPDATED Jan 2008
+@set UPDATED Mar 2008
@settitle GNU Wget @value{VERSION} Manual
@c Disable the monstrous rectangles beside overfull hbox-es.
@finalout
* Wget: (wget). The non-interactive network downloader.
@end direntry
-@ifnottex
+@copying
This file documents the GNU Wget utility for downloading network
data.
Copyright @copyright{} 1996, 1997, 1998, 1999, 2000, 2001, 2002,
2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc.
+@iftex
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.
+@end iftex
@ignore
Permission is granted to process this file through TeX and print the
copy of the license is included in the section entitled ``GNU Free
Documentation License''.
@c man end
-@end ifnottex
+@end copying
@titlepage
@title GNU Wget @value{VERSION}
@page
@vskip 0pt plus 1filll
-Copyright @copyright{} 1996, 1997, 1998, 1999, 2000, 2001, 2002,
-2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc.
-
-Permission is granted to copy, distribute and/or modify this document
-under the terms of the GNU Free Documentation License, Version 1.2 or
-any later version published by the Free Software Foundation; with no
-Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
-copy of the license is included in the section entitled ``GNU Free
-Documentation License''.
+@insertcopying
@end titlepage
+@contents
+
@ifnottex
@node Top
@top Wget @value{VERSION}
-This manual documents version @value{VERSION} of GNU Wget, the freely
-available utility for network downloads.
-
-Copyright @copyright{} 1996, 1997, 1998, 1999, 2000, 2001, 2002,
-2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc.
+@insertcopying
+@end ifnottex
@menu
* Overview:: Features of Wget.
* Copying this manual:: You may give out copies of Wget and of this manual.
* Concept Index:: Topics covered by this manual.
@end menu
-@end ifnottex
@node Overview
@chapter Overview
@item
Wget supports proxy servers, which can lighten the network load, speed
-up retrieval and provide access behind firewalls. However, if you are
-behind a firewall that requires that you use a socks style gateway,
-you can get the socks library and build Wget with support for socks.
-Wget uses the passive @sc{ftp} downloading by default, active @sc{ftp}
-being an option.
+up retrieval and provide access behind firewalls. Wget uses the passive
+@sc{ftp} downloading by default, active @sc{ftp} being an option.
@item
Wget supports IP version 6, the next generation of IP. IPv6 is
@samp{wget -O - http://foo > file}; @file{file} will be truncated
immediately, and @emph{all} downloaded content will be written there.
+For this reason, @samp{-N} (for timestamp-checking) is not supported
+in combination with @samp{-O}: since @var{file} is always newly
+created, it will always have a very new timestamp. A warning will be
+issued if this combination is used.
+
+Similarly, using @samp{-r} or @samp{-p} with @samp{-O} may not work as
+you expect: Wget won't just download the first file to @var{file} and
+then download the rest to their normal names; instead, @emph{all} downloaded
+content will be placed in @var{file}. This was disabled in version
+1.11, but has been reinstated (with a warning) in 1.11.2, as there are
+some cases where this behavior can actually have some use.
+
Note that a combination with @samp{-k} is only permitted when
-downloading a single document, and combination with any of @samp{-r},
-@samp{-p}, or @samp{-N} is not allowed.
+downloading a single document, as in that case it will just convert
+all relative URIs to external ones; @samp{-k} makes no sense for
+multiple URIs when they're all being downloaded to a single file.
@cindex clobbering, file
@cindex downloading multiple times
same time. Neither option is available in Wget compiled without IPv6
support.
-@item --prefer-family=IPv4/IPv6/none
+@item --prefer-family=none/IPv4/IPv6
When given a choice of several addresses, connect to the addresses
-with specified address family first. IPv4 addresses are preferred by
-default.
+with specified address family first. The address order returned by
+DNS is used without change by default.
This avoids spurious errors and connect attempts when accessing hosts
that resolve to both IPv6 and IPv4 addresses from IPv4 networks. For
using the @samp{--ftp-user} and @samp{--ftp-password} options for
@sc{ftp} connections and the @samp{--http-user} and @samp{--http-password}
options for @sc{http} connections.
+
+@item --ask-password
+Prompt for a password for each connection established. This option
+cannot be combined with @samp{--password}; the two are mutually exclusive.
@end table
@node Directory Options
@section Directory Options
-@table @samp
+@table @samp
@item -nd
@itemx --no-directories
Do not create a hierarchy of directories when retrieving recursively.
@code{Content-Disposition} headers to describe what the name of a
downloaded file should be.
+@cindex authentication
+@item --auth-no-challenge
+
+If this option is given, Wget will send Basic HTTP authentication
+information (plaintext username and password) for all requests, just
+like Wget 1.10.2 and prior did by default.
+
+Use of this option is not recommended; it is intended only to support
+a few obscure servers that never send HTTP authentication
+challenges but accept unsolicited auth info, say, in addition to
+form-based authentication.
+
@end table
@node HTTPS (SSL/TLS) Options
expansion by the shell.
@end table
+@noindent
The @samp{-A} and @samp{-R} options may be combined to achieve even
better fine-tuning of which files to retrieve. E.g. @samp{wget -A
"*zelazny*" -R .ps} will download all the files having @samp{zelazny} as
a part of their name, but @emph{not} the PostScript files.
Note that these two options do not affect the downloading of @sc{html}
-files; Wget must load all the @sc{html}s to know where to go at
-all---recursive retrieval would make no sense otherwise.
+files (as determined by a @samp{.htm} or @samp{.html} filename
+suffix). This behavior may not be desirable for all users, and may be
+changed for future versions of Wget.
+
+Note, too, that query strings (strings at the end of a URL beginning
+with a question mark, @samp{?}) are not included as part of the
+filename for accept/reject rules, even though these will actually
+contribute to the name chosen for the local file. It is expected that
+a future version of Wget will provide an option to allow matching
+against query strings.
+
+Finally, it's worth noting that the accept/reject lists are matched
+@emph{twice} against downloaded files: once against the URL's filename
+portion, to determine if the file should be downloaded in the first
+place; then, after it has been accepted and successfully downloaded,
+the local file's name is also checked against the accept/reject lists
+to see if it should be removed. The rationale was that, since
+@samp{.htm} and @samp{.html} files are always downloaded regardless of
+accept/reject rules, they should be removed @emph{after} being
+downloaded and scanned for links, if they did match the accept/reject
+lists. However, this can lead to unexpected results, since the local
+filenames can differ from the original URL filenames in the following
+ways, all of which can change whether an accept/reject rule matches:
+
+@itemize @bullet
+@item
+If the local file already exists and @samp{--no-directories} was
+specified, a numeric suffix will be appended to the original name.
+@item
+If @samp{--html-extension} was specified, the local filename will have
+@samp{.html} appended to it. If Wget is invoked with @samp{-E -A.php},
+a filename such as @samp{index.php} will match and be accepted, but upon
+download will be named @samp{index.php.html}, which no longer matches,
+and so the file will be deleted.
+@item
+Query strings do not contribute to URL matching, but are included in
+local filenames, and so @emph{do} contribute to filename matching.
+@end itemize
+
+@noindent
+This behavior, too, is considered less-than-desirable, and may change
+in a future version of Wget.
@node Directory-Based Limits
@section Directory-Based Limits
Essentially, @samp{--no-parent} is similar to
@samp{-I/~luzer/my-archive}, only it handles redirections in a more
intelligent fashion.
+
+@strong{Note} that, for HTTP (and HTTPS), the trailing slash is very
+important to @samp{--no-parent}. HTTP has no concept of a ``directory''---Wget
+relies on you to indicate what's a directory and what isn't. In
+@samp{http://foo/bar/}, Wget will consider @samp{bar} to be a
+directory, while in @samp{http://foo/bar} (no trailing slash),
+@samp{bar} will be considered a filename (so @samp{--no-parent} would be
+meaningless, as its parent is @samp{/}).
@end table
@node Relative Links
@var{file} in the request body. The same as
@samp{--post-file=@var{file}}.
-@item prefer_family = IPv4/IPv6/none
+@item prefer_family = none/IPv4/IPv6
When given a choice of several addresses, connect to the addresses
-with specified address family first. IPv4 addresses are preferred by
-default. The same as @samp{--prefer-family}, which see for a detailed
-discussion of why this is useful.
+with specified address family first. The address order returned by
+DNS is used without change by default. The same as @samp{--prefer-family},
+which see for a detailed discussion of why this is useful.
@item private_key = @var{file}
Set the private key file to @var{file}. The same as
@url{http://news.gmane.org/gmane.comp.web.wget.patches}.
Finally, there is the @email{wget-notify@@addictivecode.org} mailing
-list. This is a non-discussion list that receives commit notifications
-from the source repository, and also bug report-change notifications.
-This is the highest-traffic list for Wget, and is recommended only for
-people who are seriously interested in ongoing Wget development.
-Subscription is through the @code{mailman} interface at
+list. This is a non-discussion list that receives bug report-change
+notifications from the bug-tracker. Unlike for the other mailing lists,
+subscription is through the @code{mailman} interface at
@url{http://addictivecode.org/mailman/listinfo/wget-notify}.
@node Internet Relay Chat
@cindex IRC
@cindex #wget
-While, at the time of this writing, there is very low activity, we do
-have a support channel set up via IRC at @code{irc.freenode.org},
-@code{#wget}. Come check it out!
+In addition to the mailing lists, we also have a support channel set up
+via IRC at @code{irc.freenode.org}, @code{#wget}. Come check it out!
@node Reporting Bugs
@section Reporting Bugs
Daniel Bodea,
Mark Boyns,
John Burden,
+Julien Buty,
Wanderlei Cavassin,
Gilles Cedoc,
Tim Charron,
Ahmon Dancy,
Andrew Davison,
Bertrand Demiddelaer,
+Alexander Dergachev,
Andrew Deryabin,
Ulrich Drepper,
Marc Duponcheel,
Mauro Tortonesi,
Dave Turner,
Gisle Vanem,
+Rabin Vincent,
Russell Vincent,
@iftex
@v{Z}eljko Vrba,