@c %**start of header
@setfilename wget.info
@include version.texi
-@set UPDATED May 2003
+@set UPDATED Jan 2005
@settitle GNU Wget @value{VERSION} Manual
@c Disable the monstrous rectangles beside overfull hbox-es.
@finalout
data.
@c man begin COPYRIGHT
-Copyright @copyright{} 1996, 1997, 1998, 2000, 2001, 2002, 2003 Free
-Software Foundation, Inc.
+Copyright @copyright{} 1996, 1997, 1998, 2000, 2001, 2002, 2003, 2004, 2005
+Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
@page
@vskip 0pt plus 1filll
-Copyright @copyright{} 1996, 1997, 1998, 2000, 2001, 2003 Free Software
-Foundation, Inc.
+Copyright @copyright{} 1996, 1997, 1998, 2000, 2001, 2003, 2004, 2005
+Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
This manual documents version @value{VERSION} of GNU Wget, the freely
available utility for network downloads.
-Copyright @copyright{} 1996, 1997, 1998, 2000, 2001, 2003 Free Software
-Foundation, Inc.
+Copyright @copyright{} 1996, 1997, 1998, 2000, 2001, 2003, 2004, 2005
+Free Software Foundation, Inc.
@menu
* Overview:: Features of Wget.
@item
Wget supports proxy servers, which can lighten the network load, speed
up retrieval and provide access behind firewalls. However, if you are
-behind a firewall that requires that you use a socks style gateway, you
-can get the socks library and build Wget with support for socks. Wget
-also supports the passive @sc{ftp} downloading as an option.
+behind a firewall that requires that you use a socks style gateway,
+you can get the socks library and build Wget with support for socks.
+Wget uses passive @sc{ftp} downloading by default, with active
+@sc{ftp} available as an option.
+
+@sp 1
+@item
+Wget supports IP version 6, the next generation of IP. IPv6 is
+autodetected at compile-time, and can be disabled at either build or
+run time. Binaries built with IPv6 support work well in both
+IPv4-only and dual family environments.
@sp 1
@item
@sp 1
@item
-The retrieval is conveniently traced with printing dots, each dot
-representing a fixed amount of data received (1KB by default). These
-representations can be customized to your preferences.
+The progress of individual downloads is traced using a progress gauge.
+Interactive downloads are tracked using a ``thermometer''-style gauge,
+whereas non-interactive ones are traced with dots, each dot
+representing a fixed amount of data received (1KB by default). Either
+gauge can be customized to your preferences.
@sp 1
@item
* Download Options::
* Directory Options::
* HTTP Options::
+* HTTPS (SSL/TLS) Options::
* FTP Options::
* Recursive Retrieval Options::
* Recursive Accept/Reject Options::
@cindex DNS cache
@cindex caching of DNS lookups
-@item --dns-cache=off
-Turn off caching of DNS lookups. Normally, Wget remembers the addresses
-it looked up from DNS so it doesn't have to repeatedly contact the DNS
-server for the same (typically small) set of addresses it retrieves
-from. This cache exists in memory only; a new Wget run will contact DNS
-again.
-
-However, in some cases it is not desirable to cache host names, even for
-the duration of a short-running application like Wget. For example,
-some HTTP servers are hosted on machines with dynamically allocated IP
-addresses that change from time to time. Their DNS entries are updated
-along with each change. When Wget's download from such a host gets
-interrupted by IP address change, Wget retries the download, but (due to
-DNS caching) it contacts the old address. With the DNS cache turned
-off, Wget will repeat the DNS lookup for every connect and will thus get
-the correct dynamic address every time---at the cost of additional DNS
-lookups where they're probably not needed.
-
-If you don't understand the above description, you probably won't need
-this option.
+@item --no-dns-cache
+Turn off caching of DNS lookups. Normally, Wget remembers the IP
+addresses it looked up from DNS so it doesn't have to repeatedly
+contact the DNS server for the same (typically small) set of hosts it
+retrieves from. This cache exists in memory only; a new Wget run will
+contact DNS again.
+
+However, it has been reported that in some situations it is not
+desirable to cache host names, even for the duration of a
+short-running application like Wget. With this option Wget issues a
+new DNS lookup (more precisely, a new call to @code{gethostbyname} or
+@code{getaddrinfo}) each time it makes a new connection. Please note
+that this option will @emph{not} affect caching that might be
+performed by the resolving library or by an external caching layer,
+such as NSCD.
+
+If you don't understand exactly what this option does, you probably
+won't need it.
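
The in-memory caching described above can be pictured with a minimal sketch. This is not Wget's code; the names @code{CachingResolver}, @code{lookup} and @code{system_resolve} are hypothetical and chosen only for illustration. With caching disabled, every lookup goes back to the resolver, which is the behavior the option is described as selecting.

```python
import socket
from typing import Callable, Dict, List


class CachingResolver:
    """Hypothetical illustration: remember addresses per host in memory only."""

    def __init__(self, resolve: Callable[[str], List[str]], use_cache: bool = True):
        self._resolve = resolve          # function that performs the real lookup
        self._use_cache = use_cache      # False models turning the cache off
        self._cache: Dict[str, List[str]] = dict()

    def lookup(self, host: str) -> List[str]:
        # Serve from the cache when allowed and the host was seen before.
        if self._use_cache and host in self._cache:
            return self._cache[host]
        addresses = self._resolve(host)  # a fresh lookup for this connection
        if self._use_cache:
            self._cache[host] = addresses
        return addresses


def system_resolve(host: str) -> List[str]:
    """Resolve through getaddrinfo; layers below (such as NSCD) may still cache."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]
```

Passing a resolver function in, rather than calling @code{getaddrinfo} directly, keeps the sketch testable without network access. A real run would construct it as @code{CachingResolver(system_resolve, use_cache=False)}.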
@cindex file names, restrict
@cindex Windows file names
switched off. You can use @samp{--restrict-file-names=nocontrol} to
turn off escaping of control characters without affecting the choice of
the OS to use as file name restriction mode.
+
+@cindex IPv6
+@itemx -4
+@itemx --inet4-only
+@itemx -6
+@itemx --inet6-only
+Force connecting to IPv4 or IPv6 addresses. With @samp{--inet4-only}
+or @samp{-4}, Wget will only connect to IPv4 hosts, ignoring AAAA
+records in DNS, and refusing to connect to IPv6 addresses specified in
+URLs. Conversely, with @samp{--inet6-only} or @samp{-6}, Wget will
+only connect to IPv6 hosts and ignore A records and IPv4 addresses.
+
+Neither option should be needed normally.  By default, an IPv6-aware
+Wget will use the address family specified by the host's DNS record.
+If the DNS specifies both an A record and an AAAA record, Wget will
+try them in sequence until it finds one it can connect to.
+
+These options can be used to deliberately force the use of IPv4 or
+IPv6 address families on dual family systems, usually to aid debugging
+or to deal with broken network configuration. Only one of
+@samp{--inet6-only} and @samp{--inet4-only} may be specified in the
+same command. Neither option is available in Wget compiled without
+IPv6 support.
+
+@item --prefer-family=IPv4/IPv6/none
+When given a choice of several addresses, connect to the addresses
+with the specified address family first.  IPv4 addresses are preferred by
+default.
+
+This avoids spurious errors and connect attempts when accessing hosts
+that resolve to both IPv6 and IPv4 addresses from IPv4 networks. For
+example, @samp{www.kame.net} resolves to
+@samp{2001:200:0:8002:203:47ff:fea5:3085} and to
+@samp{203.178.141.194}. When the preferred family is @code{IPv4}, the
+IPv4 address is used first; when the preferred family is @code{IPv6},
+the IPv6 address is used first; if the specified value is @code{none},
+the address order returned by DNS is used without change.
+
+Unlike @samp{-4} and @samp{-6}, this option doesn't forbid access to
+any address family, it only changes the @emph{order} in which the
+addresses are accessed. Also note that the reordering performed by
+this option is @dfn{stable}: it doesn't affect the order of addresses of
+the same family. That is, the relative order of all IPv4 addresses
+and of all IPv6 addresses remains intact in all cases.
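
The stable reordering described above can be sketched as follows. This is an illustration under stated assumptions, not Wget's implementation; the function name @code{order_by_family} is hypothetical. It relies on the fact that Python's @code{sorted} is a stable sort, so addresses of the same family keep their relative order.

```python
import ipaddress
from typing import List


def order_by_family(addresses: List[str], prefer: str = "IPv4") -> List[str]:
    """Hypothetical sketch: put addresses of the preferred family first."""
    if prefer == "none":
        # Keep the order returned by DNS unchanged.
        return list(addresses)
    if prefer == "IPv4":
        preferred_version = 4
    elif prefer == "IPv6":
        preferred_version = 6
    else:
        raise ValueError("prefer must be IPv4, IPv6 or none")
    # The key is False (sorts first) for the preferred family and True
    # otherwise; sorted() is stable, so order within each family is kept.
    return sorted(
        addresses,
        key=lambda a: ipaddress.ip_address(a).version != preferred_version,
    )
```

No address is dropped in any mode; only the order of connection attempts changes, which is the difference from forcing a single family.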
@end table
@node Directory Options
to send those cookies, bypassing the ``official'' cookie support:
@example
-wget --cookies=off --header "Cookie: @var{name}=@var{value}"
+wget --no-cookies --header "Cookie: @var{name}=@var{value}"
@end example
@cindex saving cookies
@end example
@end table
+@node HTTPS (SSL/TLS) Options
+@section HTTPS (SSL/TLS) Options
+
+@cindex SSL
+To support SSL-based HTTP (HTTPS) downloads, Wget must be compiled
+with an external SSL library, currently OpenSSL. If Wget is compiled
+without SSL support, none of these options are available.
+
+@table @samp
+@item --sslcertfile=@var{file}
+Use the client certificate stored in @var{file}. This is needed for
+servers that are configured to require certificates from the clients
+that connect to them. Normally a certificate is not required and this
+switch is optional.
+
+@cindex SSL certificate
+@item --sslcertkey=@var{keyfile}
+Read the certificate key from @var{keyfile}.
+
+@cindex SSL certificate authority
+@item --sslcadir=@var{directory}
+Specifies the directory used for certificate authorities (``CA'').
+
+@item --sslcafile=@var{file}
+Use @var{file} as the file with the bundle of certificate authorities.
+
+@cindex SSL certificate type, specify
+@item --sslcerttype=0/1
+Specify the type of the client certificate: 0 means @code{PEM}
+(default), 1 means @code{ASN1} (@code{DER}).
+
+@cindex SSL certificate, check
+@item --sslcheckcert=0/1
+If set to 1, check the server certificate against the specified
+certificate authorities and break the SSL handshake if the server
+certificate is not valid.  If this is 0 (the default), the server
+certificate is not checked.
+
+@cindex SSL protocol, choose
+@item --sslprotocol=0-3
+Choose the SSL protocol to be used. If 0 is specified (the default),
+the OpenSSL library chooses the appropriate protocol automatically.
+Specifying 1 forces the use of SSLv2, specifying 2 forces SSLv3, and
+specifying 3 forces TLSv1.
+
+In most cases the OpenSSL library is capable of making an intelligent
+choice of the protocol, but there have been reports of sites that use
+old (and presumably buggy) server libraries with which a protocol has
+to be specified manually.
+
+@cindex EGD
+@item --egd-file=@var{file}
+Use @var{file} as the EGD socket. EGD stands for @dfn{Entropy
+Gathering Daemon}, a user-space program that collects data from
+various unpredictable system sources and makes it available to other
+programs that might need it. Encryption software, such as the SSL
+library, needs sources of non-repeating randomness to seed the random
+number generator used to produce cryptographically strong keys.
+
+OpenSSL allows the user to specify his own source of entropy using the
+@code{RAND_FILE} environment variable. If this variable is unset, or
+if the specified file does not produce enough randomness, OpenSSL will
+read random data from the EGD socket specified using this option.
+
+If this option is not specified (and the equivalent startup command is
+not used), EGD is never contacted. EGD is not needed on modern Unix
+systems that support @file{/dev/random}.
+@end table
+
@node FTP Options
@section FTP Options
@table @samp
+@cindex password, FTP
+@item --ftp-passwd=@var{string}
+Set the default FTP password to @var{string}. Without this, or the
+corresponding startup option, the password defaults to @samp{-wget@@},
+normally used for anonymous FTP.
+
@cindex .listing files, removing
@item --no-remove-listing
Don't remove the temporary @file{.listing} files generated by @sc{ftp}
servers (and the ones emulating Unix @code{ls} output).
@cindex passive ftp
-@item --passive-ftp
-Use the @dfn{passive} @sc{ftp} retrieval scheme, in which the client
-initiates the data connection. This is sometimes required for @sc{ftp}
-to work behind firewalls.
+@item --no-passive-ftp
+Disable the use of the @dfn{passive} FTP transfer mode. Passive FTP
+mandates that the client connect to the server to establish the data
+connection rather than the other way around.
+
+If the machine is connected to the Internet directly, both passive and
+active FTP should work equally well. Behind most firewall and NAT
+configurations passive FTP has a better chance of working. However,
+in some rare firewall configurations, active FTP actually works when
+passive FTP doesn't. If you suspect this to be the case, use this
+option, or set @code{passive_ftp=off} in your init file.
@cindex symbolic links, retrieving
@item --retr-symlinks
@item dot_spacing = @var{n}
Specify the number of dots in a single cluster (10 by default).
+@item egd_file = @var{string}
+Use @var{string} as the EGD socket file name. The same as
+@samp{--egd-file}.
+
@item exclude_directories = @var{string}
Specify a comma-separated list of directories you wish to exclude from
download---the same as @samp{-X} (@pxref{Directory-Based Limits}).
If set to on, force the input filename to be regarded as an @sc{html}
document---the same as @samp{-F}.
+@item ftp_passwd = @var{string}
+Set your @sc{ftp} password to @var{string}. Without this setting, the
+password defaults to @samp{-wget@@}, which is a useful default for
+anonymous @sc{ftp} access.
+
+This command used to be named @code{passwd} prior to Wget 1.10.
+
@item ftp_proxy = @var{string}
Use @var{string} as @sc{ftp} proxy, instead of the one specified in
environment.
Specify a comma-separated list of directories you wish to follow when
downloading---the same as @samp{-I}.
+@item inet4_only = on/off
+Force connecting to IPv4 addresses, off by default. You can put this
+in the global init file to disable Wget's attempts to resolve and
+connect to IPv6 hosts. Available only if Wget was compiled with IPv6
+support. The same as @samp{--inet4-only} or @samp{-4}.
+
+@item inet6_only = on/off
+Force connecting to IPv6 addresses, off by default. Available only if
+Wget was compiled with IPv6 support. The same as @samp{--inet6-only}
+or @samp{-6}.
+
@item input = @var{string}
Read the @sc{url}s from @var{string}, like @samp{-i}.
display properly---the same as @samp{-p}.
@item passive_ftp = on/off/always/never
-Set passive @sc{ftp}---the same as @samp{--passive-ftp}. Some scripts
-and @samp{.pm} (Perl module) files download files using @samp{wget
---passive-ftp}. If your firewall does not allow this, you can set
-@samp{passive_ftp = never} to override the command-line.
-
-@item passwd = @var{string}
-Set your @sc{ftp} password to @var{password}. Without this setting, the
-password defaults to @samp{username@@hostname.domainname}.
+Change the setting of passive @sc{ftp}, equivalent to the
+@samp{--passive-ftp} option. Some scripts and @samp{.pm} (Perl
+module) files download files using @samp{wget --passive-ftp}. If your
+firewall does not allow this, you can set @samp{passive_ftp = never}
+to override the command-line.
@item post_data = @var{string}
Use POST as the method for all HTTP requests and send @var{string} in
Use POST as the method for all HTTP requests and send the contents of
@var{file} in the request body. The same as @samp{--post-file}.
+@item prefer_family = IPv4/IPv6/none
+When given a choice of several addresses, connect to the addresses
+with the specified address family first.  IPv4 addresses are preferred by
+default. The same as @samp{--prefer-family}, which see for a detailed
+discussion of why this is useful.
+
@item progress = @var{string}
Set the type of the progress indicator. Legal types are ``dot'' and
``bar''.
@item proxy_passwd = @var{string}
Set proxy authentication password to @var{string}, like @samp{--proxy-passwd}.
-@item referer = @var{string}
-Set HTTP @samp{Referer:} header just like @samp{--referer}. (Note it
-was the folks who wrote the @sc{http} spec who got the spelling of
-``referrer'' wrong.)
-
@item quiet = on/off
Quiet mode---the same as @samp{-q}.
@item recursive = on/off
Recursive on/off---the same as @samp{-r}.
+@item referer = @var{string}
+Set HTTP @samp{Referer:} header just like @samp{--referer}. (Note it
+was the folks who wrote the @sc{http} spec who got the spelling of
+``referrer'' wrong.)
+
@item relative_only = on/off
Follow only relative links---the same as @samp{-L} (@pxref{Relative
Links}).
@item span_hosts = on/off
Same as @samp{-H}.
+@item ssl_cert_file = @var{string}
+Set the client certificate file name to @var{string}. The same as
+@samp{--sslcertfile}.
+
+@item ssl_cert_key = @var{string}
+Set the certificate key file to @var{string}. The same as
+@samp{--sslcertkey}.
+
+@item ssl_ca_dir = @var{string}
+Set the directory used for certificate authorities. The same as
+@samp{--sslcadir}.
+
+@item ssl_ca_file = @var{string}
+Set the certificate authority bundle file to @var{string}. The same
+as @samp{--sslcafile}.
+
+@item ssl_cert_type = 0/1
+Specify the type of the client certificate: 0 means @code{PEM}
+(default), 1 means @code{ASN1} (@code{DER}). The same as
+@samp{--sslcerttype}.
+
+@item ssl_check_cert = 0/1
+If this is set to 1, the server certificate is checked against the
+specified client authorities. The same as @samp{--sslcheckcert}.
+
+@item ssl_protocol = 0-3
+Choose the SSL protocol to be used. 0 means choose automatically, 1
+means force SSLv2, 2 means force SSLv3, and 3 means force TLSv1. The
+same as @samp{--sslprotocol}.
+
@item strict_comments = on/off
Same as @samp{--strict-comments}.
@cindex mailing list
@cindex list
-Wget has its own mailing list at @email{wget@@sunsite.dk}, thanks
-to Karsten Thygesen. The mailing list is for discussion of Wget
-features and web, reporting Wget bugs (those that you think may be of
-interest to the public) and mailing announcements. You are welcome to
-subscribe. The more people on the list, the better!
+There are several Wget-related mailing lists, all hosted by
+SunSITE.dk. The general discussion list is at
+@email{wget@@sunsite.dk}. It is the preferred place for bug reports
+and suggestions, as well as for discussion of development. You are
+invited to subscribe.
+
+To subscribe, simply send mail to @email{wget-subscribe@@sunsite.dk}
+and follow the instructions. Unsubscribe by mailing to
+@email{wget-unsubscribe@@sunsite.dk}. The mailing list is archived at
+@url{http://www.mail-archive.com/wget%40sunsite.dk/} and at
+@url{http://news.gmane.org/gmane.comp.web.wget.general}.
+
+The second mailing list is at @email{wget-patches@@sunsite.dk}, and is
+used to submit patches for review by Wget developers. A ``patch'' is
+a textual representation of a change to source code, readable by both
+humans and programs.  The file @file{PATCHES} that comes with Wget
+covers the creation and submission of patches in detail.  Please don't
+send general suggestions or bug reports to @samp{wget-patches}; use it
+only for patch submissions.
+
+To subscribe, simply send mail to
+@email{wget-patches-subscribe@@sunsite.dk} and follow the
+instructions.  Unsubscribe by mailing to
+@email{wget-patches-unsubscribe@@sunsite.dk}.  The mailing list is
+archived at
+@url{http://news.gmane.org/gmane.comp.web.wget.patches}.
+
+Finally, there is a read-only list at @email{wget-cvs@@sunsite.dk}
+that tracks commits to the Wget CVS repository. To subscribe to that
+list, send mail to @email{wget-cvs-subscribe@@sunsite.dk}. The list
+is not archived.
-To subscribe, simply send mail to @email{wget-subscribe@@sunsite.dk}.
-Unsubscribe by mailing to @email{wget-unsubscribe@@sunsite.dk}.
-
-The mailing list is archived at @url{http://fly.srk.fer.hr/archive/wget}.
-Alternative archive is available at
-@url{http://www.mail-archive.com/wget%40sunsite.auc.dk/}.
-
@node Reporting Bugs
@section Reporting Bugs
@cindex bugs
the file.
@item
-Please start Wget with @samp{-d} option and send the log (or the
-relevant parts of it). If Wget was compiled without debug support,
-recompile it. It is @emph{much} easier to trace bugs with debug support
-on.
+Please start Wget with the @samp{-d} option and send us the resulting
+output (or relevant parts thereof). If Wget was compiled without
+debug support, recompile it---it is @emph{much} easier to trace bugs
+with debug support on.
+
+Note: please make sure to remove any potentially sensitive information
+from the debug log before sending it to the bug address. The
+@samp{-d} option won't go out of its way to collect sensitive information,
+but the log @emph{will} contain a fairly complete transcript of Wget's
+communication with the server, which may include passwords and pieces
+of downloaded data.  Since the bug address is publicly archived, you
+may assume that all bug reports are visible to the public.
@item
If Wget has crashed, try to run it in a debugger, e.g. @code{gdb `which
-wget` core} and type @code{where} to get the backtrace.
+wget` core} and type @code{where} to get the backtrace. This may not
+work if the system administrator has disabled core files, but it is
+safe to try.
@end enumerate
@c man end
``special'' features of any particular Unix, it should compile (and
work) on all common Unix flavors.
-Various Wget versions have been compiled and tested under many kinds of
-Unix systems, including Solaris, GNU/Linux, SunOS, OSF (aka Digital Unix
-or Tru64), Ultrix, *BSD, IRIX, AIX, and others; refer to the file
-@file{MACHINES} in the distribution directory for a comprehensive list.
-If you compile it on an architecture not listed there, please let me
-know so I can update it.
-
-Wget should also compile on the other Unix systems, not listed in
-@file{MACHINES}. If it doesn't, please let me know.
-
-Thanks to kind contributors, this version of Wget compiles and works on
-Microsoft Windows 95 and Windows NT platforms. It has been compiled
-successfully using MS Visual C++ 6.0, Watcom, and Borland C compilers,
-with Winsock as networking software. Naturally, it is crippled of some
-features available on Unix, but it should work as a substitute for
-people stuck with Windows. Note that the Windows port is
-@strong{neither tested nor maintained} by me---all questions and
-problems in Windows usage should be reported to Wget mailing list at
+Various Wget versions have been compiled and tested under many kinds
+of Unix systems, including GNU/Linux, Solaris, SunOS 4.x, OSF (aka
+Digital Unix or Tru64), Ultrix, *BSD, IRIX, AIX, and others. Some of
+those systems are no longer in widespread use and may not be able to
+support recent versions of Wget. If Wget fails to compile on your
+system, we would like to know about it.
+
+Thanks to kind contributors, this version of Wget compiles and works
+on 32-bit Microsoft Windows platforms. It has been compiled
+successfully using MS Visual C++ 6.0, Watcom, Borland C, and GCC
+compilers.  Naturally, it lacks some features available on
+Unix, but it should work as a substitute for people stuck with
+Windows. Note that Windows-specific portions of Wget are not
+guaranteed to be supported in the future, although this has been the
+case in practice for many years now. All questions and problems in
+Windows usage should be reported to the Wget mailing list at
@email{wget@@sunsite.dk} where the volunteers who maintain the
Windows-related features might look at them.
to redirect the output of Wget after having started it.
@example
-$ wget http://www.ifi.uio.no/~larsi/gnus.tar.gz &
-$ kill -HUP %% # Redirect the output to wget-log
+$ wget http://www.gnus.org/dist/gnus.tar.gz &
+...
+$ kill -HUP %%
+SIGHUP received, redirecting output to `wget-log'.
@end example
Other than that, Wget will not try to interfere with signals in any way.
Damir Dzeko,
@end ifnottex
Alan Eldridge,
+Hans-Andreas Engel,
@iftex
Aleksandar Erkalovi@'{c},
@end iftex
Andy Eskilsson,
Christian Fraenkel,
David Fritz,
+FUJISHIMA Satsuki,
Masashi Fujita,
Howard Gayle,
Marcel Gerrits,
Jochen Hein,
Karl Heuer,
HIROSE Masaaki,
+Ulf Harnhammar,
Gregor Hoffleit,
Erik Magnus Hulthen,
Richard Huveneers,
Jonas Jensen,
+Larry Jones,
Simon Josefsson,
@iftex
Mario Juri@'{c},
Aurelien Marchand,
Jordan Mendelson,
Lin Zhe Min,
+Jan Minar,
Tim Mooney,
Simon Munton,
Charlie Negyesi,
R. K. Owen,
+Leonid Petrov,
+Simone Piunno,
Andrew Pollock,
Steve Pothier,
@iftex
Dave Turner,
Gisle Vanem,
Russell Vincent,
+@iftex
+@v{Z}eljko Vrba,
+@end iftex
+@ifnottex
+Zeljko Vrba,
+@end ifnottex
Charles G Waldman,
Douglas E. Wegscheid,
+YAMAZAKI Makoto,
Jasmin Zainul,
@iftex
Bojan @v{Z}drnja,
@smallexample
@var{one line to give the program's name and an idea of what it does.}
-Copyright (C) 19@var{yy} @var{name of author}
+Copyright (C) 20@var{yy} @var{name of author}
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
when it starts in an interactive mode:
@smallexample
-Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
+Gnomovision version 69, Copyright (C) 20@var{yy} @var{name of author}
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
type `show w'. This is free software, and you are welcome
to redistribute it under certain conditions; type `show c'