@end ifnottex
@menu
-* Overview:: Features of Wget.
-* Invoking:: Wget command-line arguments.
-* Recursive Download:: Downloading interlinked pages.
-* Following Links:: The available methods of chasing links.
-* Time-Stamping:: Mirroring according to time-stamps.
-* Startup File:: Wget's initialization file.
-* Examples:: Examples of usage.
-* Various:: The stuff that doesn't fit anywhere else.
-* Appendices:: Some useful references.
-* Copying this manual:: You may give out copies of Wget and of this manual.
-* Concept Index:: Topics covered by this manual.
+* Overview:: Features of Wget.
+* Invoking:: Wget command-line arguments.
+* Recursive Download:: Downloading interlinked pages.
+* Following Links:: The available methods of chasing links.
+* Time-Stamping:: Mirroring according to time-stamps.
+* Startup File:: Wget's initialization file.
+* Examples:: Examples of usage.
+* Various:: The stuff that doesn't fit anywhere else.
+* Appendices:: Some useful references.
+* Copying this manual:: You may give out copies of this manual.
+* Concept Index:: Topics covered by this manual.
@end menu
@node Overview, Invoking, Top, Top
the command line.
@menu
-* URL Format::
-* Option Syntax::
-* Basic Startup Options::
-* Logging and Input File Options::
-* Download Options::
-* Directory Options::
-* HTTP Options::
-* HTTPS (SSL/TLS) Options::
-* FTP Options::
-* Recursive Retrieval Options::
-* Recursive Accept/Reject Options::
+* URL Format::
+* Option Syntax::
+* Basic Startup Options::
+* Logging and Input File Options::
+* Download Options::
+* Directory Options::
+* HTTP Options::
+* HTTPS (SSL/TLS) Options::
+* FTP Options::
+* Recursive Retrieval Options::
+* Recursive Accept/Reject Options::
+* Exit Status::
@end menu
@node URL Format, Option Syntax, Invoking, Invoking
wget -drc @var{URL}
@end example
-This is a complete equivalent of:
+This is completely equivalent to:
@example
wget -d -r -c @var{URL}
@samp{--no-} prefix. This might seem superfluous---if the default for
an affirmative option is to not do something, then why provide a way
to explicitly turn it off? But the startup file may in fact change
-the default. For instance, using @code{follow_ftp = off} in
-@file{.wgetrc} makes Wget @emph{not} follow FTP links by default, and
+the default. For instance, using @code{follow_ftp = on} in
+@file{.wgetrc} makes Wget @emph{follow} FTP links by default, and
using @samp{--no-follow-ftp} is the only way to restore the factory
default from the command line.
If this function is used, no @sc{url}s need be present on the command
line. If there are @sc{url}s both on the command line and in an input
file, those on the command lines will be the first ones to be
-retrieved. The @var{file} need not be an @sc{html} document (but no
-harm if it is)---it is enough if the @sc{url}s are just listed
-sequentially.
+retrieved. If @samp{--force-html} is not specified, then @var{file}
+should consist of a series of URLs, one per line.
However, if you specify @samp{--force-html}, the document will be
regarded as @samp{html}. In that case you may have problems with
@cindex base for relative links in input file
@item -B @var{URL}
@itemx --base=@var{URL}
-Prepends @var{URL} to relative links read from the file specified with
-the @samp{-i} option.
+Resolves relative links using @var{URL} as the point of reference,
+when reading links from an HTML file specified via the
+@samp{-i}/@samp{--input-file} option (together with
+@samp{--force-html}, or when the input file was fetched remotely from
+a server describing it as @sc{html}). This is equivalent to the
+presence of a @code{BASE} tag in the @sc{html} input file, with
+@var{URL} as the value for the @code{href} attribute.
+
+For instance, if you specify @samp{http://foo/bar/a.html} for
+@var{URL}, and Wget reads @samp{../baz/b.html} from the input file, it
+would be resolved to @samp{http://foo/baz/b.html}.
@end table
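As a sketch of how these options combine, the hypothetical helper below
wraps the invocation described for @samp{--base}: it reads links from a
local @sc{html} file and resolves relative ones against a base. The
function name, the file name @file{links.html}, and the argument order
are illustrative assumptions; only the options themselves come from
this manual.

```shell
# fetch_html_links BASE_URL HTML_FILE
# Hypothetical wrapper (not part of Wget): read links from a local HTML
# file and resolve relative ones against BASE_URL.
fetch_html_links() {
    base_url=$1
    html_file=$2
    # --force-html makes Wget treat the input file as HTML, so that
    # --base applies to the relative links found in it.
    wget --force-html --base="$base_url" --input-file="$html_file"
}

# Example call (performs network access):
#   fetch_html_links http://foo/bar/a.html links.html
```

With the base shown in the example call, a link written as
@samp{../baz/b.html} in @file{links.html} would be requested as
@samp{http://foo/baz/b.html}.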
@node Download Options, Directory Options, Logging and Input File Options, Invoking
cases, the local file will be @dfn{clobbered}, or overwritten, upon
repeated download. In other cases it will be preserved.
-When running Wget without @samp{-N}, @samp{-nc}, @samp{-r}, or @samp{p},
-downloading the same file in the same directory will result in the
-original copy of @var{file} being preserved and the second copy being
-named @samp{@var{file}.1}. If that file is downloaded yet again, the
-third copy will be named @samp{@var{file}.2}, and so on. When
-@samp{-nc} is specified, this behavior is suppressed, and Wget will
-refuse to download newer copies of @samp{@var{file}}. Therefore,
-``@code{no-clobber}'' is actually a misnomer in this mode---it's not
-clobbering that's prevented (as the numeric suffixes were already
-preventing clobbering), but rather the multiple version saving that's
-prevented.
-
-When running Wget with @samp{-r} or @samp{-p}, but without @samp{-N}
-or @samp{-nc}, re-downloading a file will result in the new copy
-simply overwriting the old. Adding @samp{-nc} will prevent this
-behavior, instead causing the original version to be preserved and any
-newer copies on the server to be ignored.
+When running Wget without @samp{-N}, @samp{-nc}, @samp{-r}, or
+@samp{-p}, downloading the same file in the same directory will result
+in the original copy of @var{file} being preserved and the second copy
+being named @samp{@var{file}.1}. If that file is downloaded yet
+again, the third copy will be named @samp{@var{file}.2}, and so on.
+(This is also the behavior with @samp{-nd}, even if @samp{-r} or
+@samp{-p} is in effect.) When @samp{-nc} is specified, this behavior
+is suppressed, and Wget will refuse to download newer copies of
+@samp{@var{file}}. Therefore, ``@code{no-clobber}'' is actually a
+misnomer in this mode---it's not clobbering that's prevented (as the
+numeric suffixes were already preventing clobbering), but rather the
+multiple version saving that's prevented.
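The numeric-suffix rule for repeated downloads without @samp{-N},
@samp{-nc}, @samp{-r}, or @samp{-p} can be mimicked with a small shell
function. This is only a sketch of the naming rule as documented here,
not Wget's own code, and the function name @code{next_free_name} is an
assumption made for illustration.

```shell
# next_free_name FILE
# Print the name the documented rule would pick for a new download of
# FILE: FILE itself if it is unused, otherwise FILE.1, FILE.2, and so on.
# This only mimics the rule described in the manual; it is not Wget's code.
next_free_name() {
    name=$1
    if [ ! -e "$name" ]; then
        printf '%s\n' "$name"
        return 0
    fi
    n=1
    # Skip over suffixes that are already taken.
    while [ -e "$name.$n" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$name.$n"
}
```

With @samp{-nc} in effect no such name is chosen at all: the existing
file is kept and the download is skipped.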
+
+When running Wget with @samp{-r} or @samp{-p}, but without @samp{-N},
+@samp{-nd}, or @samp{-nc}, re-downloading a file will result in the
+new copy simply overwriting the old. Adding @samp{-nc} will prevent
+this behavior, instead causing the original version to be preserved
+and any newer copies on the server to be ignored.
When running Wget with @samp{-N}, with or without @samp{-r} or
@samp{-p}, the decision as to whether or not to download a newer copy
Note that @samp{-c} only works with @sc{ftp} servers and with @sc{http}
servers that support the @code{Range} header.
-@cindex iri support
-@cindex idn support
-@item --iri
-
-Turn on internationalized URI (IRI) support. Use @samp{--iri=no} to
-turn it off. IRI support is activated by default.
-
-You can set the default state of IRI support using @code{iri} command in
-@file{.wgetrc}. That setting may be overridden from the command line.
-
-@cindex local encoding
-@cindex locale
-@item --locale=@var{encoding}
-
-Force Wget to use @var{encoding} as the default system encoding. That affects
-how Wget converts URLs specified as arguments from locale to @sc{utf-8} for
-IRI support.
-
-Wget use the function @code{nl_langinfo()} and then the @code{CHARSET}
-environment variable to get the locale. If it fails, @sc{ascii} is used.
-
-You can set the default locale using the @code{locale} command in
-@file{.wgetrc}. That setting may be overridden from the command line.
-
@cindex progress indicator
@cindex dot style
@item --progress=@var{type}
``dot'' progress will be favored over ``bar''. To force the bar output,
use @samp{--progress=bar:force}.
-@cindex remote encoding
-@item --remote-encoding=@var{encoding}
-
-Force Wget to use encoding as the default remote server encoding. That
-affects how Wget converts URIs found in files from remote encoding to
-@sc{utf-8} during a recursive fetch. This options is only useful for
-IRI support, for the interpretation of non-@sc{ascii} characters.
-
-For HTTP, remote encoding can be found in HTTP @code{Content-Type}
-header and in HTML @code{Content-Type http-equiv} meta tag.
-
-You can set the default encoding using the @code{remoteencoding}
-command in @file{.wgetrc}. That setting may be overridden from the
-command line.
-
@item -N
@itemx --timestamping
Turn on time-stamping. @xref{Time-Stamping}, for details.
@cindex file names, restrict
@cindex Windows file names
-@item --restrict-file-names=@var{mode}
-Change which characters found in remote URLs may show up in local file
-names generated from those URLs. Characters that are @dfn{restricted}
+@item --restrict-file-names=@var{modes}
+Change which characters found in remote URLs must be escaped during
+generation of local filenames. Characters that are @dfn{restricted}
by this option are escaped, i.e. replaced with @samp{%HH}, where
@samp{HH} is the hexadecimal number that corresponds to the restricted
-character.
-
-By default, Wget escapes the characters that are not valid as part of
-file names on your operating system, as well as control characters that
-are typically unprintable. This option is useful for changing these
-defaults, either because you are downloading to a non-native partition,
-or because you want to disable escaping of the control characters.
-
-When mode is set to ``unix'', Wget escapes the character @samp{/} and
+character. This option may also be used to force all alphabetic
+characters to be either lower- or uppercase.
+
+By default, Wget escapes the characters that are not valid or safe as
+part of file names on your operating system, as well as control
+characters that are typically unprintable. This option is useful for
+changing these defaults, perhaps because you are downloading to a
+non-native partition, or because you want to disable escaping of the
+control characters, or you want to further restrict characters to only
+those in the @sc{ascii} range of values.
+
+The @var{modes} are a comma-separated set of text values. The
+acceptable values are @samp{unix}, @samp{windows}, @samp{nocontrol},
+@samp{ascii}, @samp{lowercase}, and @samp{uppercase}. The values
+@samp{unix} and @samp{windows} are mutually exclusive (one will
+override the other), as are @samp{lowercase} and
+@samp{uppercase}. Those last are special cases, as they do not change
+the set of characters that would be escaped, but rather force local
+file paths to be converted either to lower- or uppercase.
+
+When ``unix'' is specified, Wget escapes the character @samp{/} and
the control characters in the ranges 0--31 and 128--159. This is the
-default on Unix-like OS'es.
+default on Unix-like operating systems.
-When mode is set to ``windows'', Wget escapes the characters @samp{\},
+When ``windows'' is given, Wget escapes the characters @samp{\},
@samp{|}, @samp{/}, @samp{:}, @samp{?}, @samp{"}, @samp{*}, @samp{<},
@samp{>}, and the control characters in the ranges 0--31 and 128--159.
In addition to this, Wget in Windows mode uses @samp{+} instead of
saved as @samp{www.xemacs.org+4300/search.pl@@input=blah} in Windows
mode. This mode is the default on Windows.
-If you append @samp{,nocontrol} to the mode, as in
-@samp{unix,nocontrol}, escaping of the control characters is also
-switched off. You can use @samp{--restrict-file-names=nocontrol} to
-turn off escaping of control characters without affecting the choice of
-the OS to use as file name restriction mode.
+If you specify @samp{nocontrol}, then the escaping of the control
+characters is also switched off. This option may make sense
+when you are downloading URLs whose names contain UTF-8 characters, on
+a system which can save and display filenames in UTF-8 (some possible
+byte values used in UTF-8 byte sequences fall in the range of values
+designated by Wget as ``controls'').
+
+The @samp{ascii} mode is used to specify that any bytes whose values
+are outside the range of @sc{ascii} characters (that is, greater than
+127) shall be escaped. This can be useful when saving filenames
+whose encoding does not match the one used locally.
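Because @var{modes} is a comma-separated set, several values can be
combined in one option. The sketch below assumes a hypothetical helper,
@code{fetch_restricted}, that joins its mode arguments with commas
before calling Wget; the helper and the example @sc{url} are not part
of Wget.

```shell
# fetch_restricted URL MODE...
# Hypothetical wrapper: join the given modes with commas and pass them
# to --restrict-file-names.
fetch_restricted() {
    if [ "$#" -lt 2 ]; then
        echo "usage: fetch_restricted URL MODE..." >&2
        return 2
    fi
    url=$1
    shift
    # "$*" joins the remaining arguments with the first character of IFS.
    modes=$(IFS=,; printf '%s' "$*")
    wget --restrict-file-names="$modes" "$url"
}

# Example call (performs network access):
#   fetch_restricted http://example.com/ windows ascii lowercase
```

The example call is the same as passing
@samp{--restrict-file-names=windows,ascii,lowercase} by hand.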
@cindex IPv6
@itemx -4
@item --ask-password
Prompt for a password for each connection established. Cannot be specified
when @samp{--password} is being used, because they are mutually exclusive.
+
+@cindex iri support
+@cindex idn support
+@item --no-iri
+
+Turn off internationalized URI (IRI) support. Use @samp{--iri} to
+turn it on. IRI support is activated by default.
+
+You can set the default state of IRI support using the @code{iri}
+command in @file{.wgetrc}. That setting may be overridden from the
+command line.
+
+@cindex local encoding
+@item --local-encoding=@var{encoding}
+
+Force Wget to use @var{encoding} as the default system encoding. That affects
+how Wget converts URLs specified as arguments from locale to @sc{utf-8} for
+IRI support.
+
+Wget uses the function @code{nl_langinfo()} and then the @code{CHARSET}
+environment variable to get the locale. If it fails, @sc{ascii} is used.
+
+You can set the default local encoding using the @code{local_encoding}
+command in @file{.wgetrc}. That setting may be overridden from the
+command line.
+
+@cindex remote encoding
+@item --remote-encoding=@var{encoding}
+
+Force Wget to use @var{encoding} as the default remote server encoding.
+That affects how Wget converts URIs found in files from remote encoding
+to @sc{utf-8} during a recursive fetch. This option is only useful for
+IRI support, for the interpretation of non-@sc{ascii} characters.
+
+For HTTP, the remote encoding can be found in the HTTP @code{Content-Type}
+header and in the HTML @code{Content-Type http-equiv} meta tag.
+
+You can set the default encoding using the @code{remote_encoding}
+command in @file{.wgetrc}. That setting may be overridden from the
+command line.
@end table
@node Directory Options, HTTP Options, Download Options, Invoking
URLs that end in a slash), instead of @file{index.html}.
@cindex .html extension
+@cindex .css extension
@item -E
-@itemx --html-extension
+@itemx --adjust-extension
If a file of type @samp{application/xhtml+xml} or @samp{text/html} is
downloaded and the URL does not end with the regexp
@samp{\.[Hh][Tt][Mm][Ll]?}, this option will cause the suffix @samp{.html}
Retrieval Options}).
As of version 1.12, Wget will also ensure that any downloaded files of
-type @samp{text/css} end in the suffix @samp{.css}. Obviously, this
-makes the name @samp{--html-extension} misleading; a better name is
-expected to be offered as an alternative in the near future.
+type @samp{text/css} end in the suffix @samp{.css}, and the option was
+renamed from @samp{--html-extension}, to better reflect its new
+behavior. The old option name is still acceptable, but should now be
+considered deprecated.
+
+At some point in the future, this option may well be expanded to
+include suffixes for other types of content, including content types
+that are not parsed by Wget.
@cindex http user
@cindex http password
@cindex POST
@item --post-data=@var{string}
@itemx --post-file=@var{file}
-Use POST as the method for all HTTP requests and send the specified data
-in the request body. @code{--post-data} sends @var{string} as data,
-whereas @code{--post-file} sends the contents of @var{file}. Other than
-that, they work in exactly the same way.
+Use POST as the method for all HTTP requests and send the specified
+data in the request body. @samp{--post-data} sends @var{string} as
+data, whereas @samp{--post-file} sends the contents of @var{file}.
+Other than that, they work in exactly the same way. In particular,
+they @emph{both} expect content of the form @code{key1=value1&key2=value2},
+with percent-encoding for special characters; the only difference is
+that one expects its content as a command-line parameter and the other
+accepts its content from a file. Note that @samp{--post-file} is
+@emph{not} for transmitting files as form attachments: those must
+appear as @code{key=value} data (with appropriate percent-encoding) just
+like everything else. Wget does not currently support
+@code{multipart/form-data} for transmitting POST data; only
+@code{application/x-www-form-urlencoded}. Only one of
+@samp{--post-data} and @samp{--post-file} should be specified.
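A minimal sketch of building such a request body follows. The helper
names @code{join_form_fields} and @code{post_form}, the field names,
and the example @sc{url} are assumptions for illustration; the values
are expected to be percent-encoded already, since the sketch does no
encoding itself.

```shell
# join_form_fields KEY=VALUE...
# Join already percent-encoded key=value pairs with "&".
join_form_fields() {
    (IFS='&'; printf '%s\n' "$*")
}

# post_form URL KEY=VALUE...
# Hypothetical wrapper: send the joined pairs as the POST body.
post_form() {
    url=$1
    shift
    wget --post-data="$(join_form_fields "$@")" "$url"
}

# Example call (performs network access):
#   post_form http://example.com/login user=foo 'note=a%20b'
```

Writing the same joined string to a file and passing that file to
@samp{--post-file} sends an identical request body.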
Please be aware that Wget needs to know the size of the POST data in
advance. Therefore the argument to @code{--post-file} must be a regular
option to turn it on.
@end table
-@node Recursive Accept/Reject Options, , Recursive Retrieval Options, Invoking
+@node Recursive Accept/Reject Options, Exit Status, Recursive Retrieval Options, Invoking
@section Recursive Accept/Reject Options
@table @samp
@c man end
+@node Exit Status, , Recursive Accept/Reject Options, Invoking
+@section Exit Status
+
+@c man begin EXITSTATUS
+
+Wget may return one of several error codes if it encounters problems.
+
+
+@table @asis
+@item 0
+No problems occurred.
+
+@item 1
+Generic error code.
+
+@item 2
+Parse error---for instance, when parsing command-line options, the
+@samp{.wgetrc} or @samp{.netrc}...
+
+@item 3
+File I/O error.
+
+@item 4
+Network failure.
+
+@item 5
+SSL verification failure.
+
+@item 6
+Username/password authentication failure.
+
+@item 7
+Protocol errors.
+
+@item 8
+Server issued an error response.
+@end table
+
+
+With the exceptions of 0 and 1, the lower-numbered exit codes take
+precedence over higher-numbered ones, when multiple types of errors
+are encountered.
+
+In versions of Wget prior to 1.12, Wget's exit status tended to be
+unhelpful and inconsistent. Recursive downloads would virtually always
+return 0 (success), regardless of any issues encountered, and
+non-recursive fetches only returned the status corresponding to the
+most recently-attempted download.
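In scripts, the status can be inspected through the shell's @code{$?}
variable right after Wget returns. The sketch below maps each code to
the description given in the table in this section; the function name
@code{describe_wget_status} is an assumption for illustration, and the
fallback message for unlisted codes is not defined by Wget.

```shell
# describe_wget_status CODE
# Map an exit status to the description given in the manual's table.
describe_wget_status() {
    case $1 in
        0) echo "No problems occurred." ;;
        1) echo "Generic error code." ;;
        2) echo "Parse error." ;;
        3) echo "File I/O error." ;;
        4) echo "Network failure." ;;
        5) echo "SSL verification failure." ;;
        6) echo "Username/password authentication failure." ;;
        7) echo "Protocol errors." ;;
        8) echo "Server issued an error response." ;;
        *) echo "Status not listed in the table." ;;
    esac
}

# Typical use after a download (performs network access):
#   wget -q "$url"
#   status=$?
#   [ "$status" -eq 0 ] || echo "wget: $(describe_wget_status "$status")" >&2
```

Since lower-numbered codes other than 0 and 1 take precedence, a run
that hits both a network failure and a server error response reports 4
rather than 8.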
+
+@c man end
+
@node Recursive Download, Following Links, Invoking, Top
@chapter Recursive Download
@cindex recursion
links it will follow.
@menu
-* Spanning Hosts:: (Un)limiting retrieval based on host name.
-* Types of Files:: Getting only certain files.
-* Directory-Based Limits:: Getting only certain directories.
-* Relative Links:: Follow relative links only.
-* FTP Links:: Following FTP links.
+* Spanning Hosts:: (Un)limiting retrieval based on host name.
+* Types of Files:: Getting only certain files.
+* Directory-Based Limits:: Getting only certain directories.
+* Relative Links:: Follow relative links only.
+* FTP Links:: Following FTP links.
@end menu
@node Spanning Hosts, Types of Files, Following Links, Following Links
If the local file already exists and @samp{--no-directories} was
specified, a numeric suffix will be appended to the original name.
@item
-If @samp{--html-extension} was specified, the local filename will have
+If @samp{--adjust-extension} was specified, the local filename might have
@samp{.html} appended to it. If Wget is invoked with @samp{-E -A.php},
a filename such as @samp{index.php} will be accepted, but upon
download will be named @samp{index.php.html}, which no longer matches,
say.
@menu
-* Time-Stamping Usage::
-* HTTP Time-Stamping Internals::
-* FTP Time-Stamping Internals::
+* Time-Stamping Usage::
+* HTTP Time-Stamping Internals::
+* FTP Time-Stamping Internals::
@end menu
@node Time-Stamping Usage, HTTP Time-Stamping Internals, Time-Stamping, Time-Stamping
commands.
@menu
-* Wgetrc Location:: Location of various wgetrc files.
-* Wgetrc Syntax:: Syntax of wgetrc.
-* Wgetrc Commands:: List of available commands.
-* Sample Wgetrc:: A wgetrc example.
+* Wgetrc Location:: Location of various wgetrc files.
+* Wgetrc Syntax:: Syntax of wgetrc.
+* Wgetrc Commands:: List of available commands.
+* Sample Wgetrc:: A wgetrc example.
@end menu
@node Wgetrc Location, Wgetrc Syntax, Startup File, Startup File
@item add_hostdir = on/off
Enable/disable host-prefixed file names. @samp{-nH} disables it.
+@item ask_password = on/off
+Prompt for a password for each connection established. Cannot be specified
+when @samp{--password} is being used, because they are mutually
+exclusive. Equivalent to @samp{--ask-password}.
+
+@item auth_no_challenge = on/off
+If this option is given, Wget will send Basic HTTP authentication
+information (plaintext username and password) for all requests. See
+@samp{--auth-no-challenge}.
+
@item background = on/off
Enable/disable going to background---the same as @samp{-b} (which
enables it).
@c #### Document me!
@c
@item base = @var{string}
-Consider relative @sc{url}s in @sc{url} input files forced to be
-interpreted as @sc{html} as being relative to @var{string}---the same as
-@samp{--base=@var{string}}.
+Consider relative @sc{url}s in input files (specified via the
+@samp{input} command or the @samp{--input-file}/@samp{-i} option,
+together with @samp{force_html} or @samp{--force-html})
+as being relative to @var{string}---the same as @samp{--base=@var{string}}.
@item bind_address = @var{address}
Bind to @var{address}, like the @samp{--bind-address=@var{address}}.
Define a header for HTTP downloads, like using
@samp{--header=@var{string}}.
-@item html_extension = on/off
+@item adjust_extension = on/off
Add a @samp{.html} extension to @samp{text/html} or
-@samp{application/xhtml+xml} files without it, or a @samp{.css}
-extension to @samp{text/css} files without it, like @samp{-E}.
+@samp{application/xhtml+xml} files that lack one, or a @samp{.css}
+extension to @samp{text/css} files that lack one, like
+@samp{-E}. Previously named @samp{html_extension} (still acceptable,
+but deprecated).
@item http_keep_alive = on/off
Turn the keep-alive feature on or off (defaults to on). Turning it
Specify a comma-separated list of directories you wish to follow when
downloading---the same as @samp{-I @var{string}}.
+@item iri = on/off
+When set to on, enable internationalized URI (IRI) support; the same as
+@samp{--iri}.
+
@item inet4_only = on/off
Force connecting to IPv4 addresses, off by default. You can put this
in the global init file to disable Wget's attempts to resolve and
@item input = @var{file}
Read the @sc{url}s from @var{file}, like @samp{-i @var{file}}.
+@item keep_session_cookies = on/off
+When specified, causes @samp{save_cookies} to also save session
+cookies. See @samp{--keep-session-cookies}.
+
@item limit_rate = @var{rate}
Limit the download speed to no more than @var{rate} bytes per second.
The same as @samp{--limit-rate=@var{rate}}.
@item load_cookies = @var{file}
Load cookies from @var{file}. See @samp{--load-cookies @var{file}}.
+@item local_encoding = @var{encoding}
+Force Wget to use @var{encoding} as the default system encoding. See
+@samp{--local-encoding}.
+
@item logfile = @var{file}
Set logfile to @var{file}, the same as @samp{-o @var{file}}.
Follow only relative links---the same as @samp{-L} (@pxref{Relative
Links}).
+@item remote_encoding = @var{encoding}
+Force Wget to use @var{encoding} as the default remote server encoding.
+See @samp{--remote-encoding}.
+
@item remove_listing = on/off
If set to on, remove @sc{ftp} listings downloaded by Wget. Setting it
to off is the same as @samp{--no-remove-listing}.
complexity.
@menu
-* Simple Usage:: Simple, basic usage of the program.
-* Advanced Usage:: Advanced tips.
-* Very Advanced Usage:: The hairy stuff.
+* Simple Usage:: Simple, basic usage of the program.
+* Advanced Usage:: Advanced tips.
+* Very Advanced Usage:: The hairy stuff.
@end menu
@node Simple Usage, Advanced Usage, Examples, Examples
This chapter contains all the stuff that could not fit anywhere else.
@menu
-* Proxies:: Support for proxy servers.
-* Distribution:: Getting the latest version.
-* Web Site:: GNU Wget's presence on the World Wide Web.
-* Mailing Lists:: Wget mailing list for announcements and discussion.
-* Internet Relay Chat:: Wget's presence on IRC.
-* Reporting Bugs:: How and where to report bugs.
-* Portability:: The systems Wget works on.
-* Signals:: Signal-handling performed by Wget.
+* Proxies:: Support for proxy servers.
+* Distribution:: Getting the latest version.
+* Web Site:: GNU Wget's presence on the World Wide Web.
+* Mailing Lists:: Wget mailing list for announcements and discussion.
+* Internet Relay Chat:: Wget's presence on IRC.
+* Reporting Bugs:: How and where to report bugs.
+* Portability:: The systems Wget works on.
+* Signals:: Signal-handling performed by Wget.
@end menu
@node Proxies, Distribution, Various, Various
This chapter contains some references I consider useful.
@menu
-* Robot Exclusion:: Wget's support for RES.
-* Security Considerations:: Security with Wget.
-* Contributors:: People who helped.
+* Robot Exclusion:: Wget's support for RES.
+* Security Considerations:: Security with Wget.
+* Contributors:: People who helped.
@end menu
@node Robot Exclusion, Security Considerations, Appendices, Appendices
@item
Ted Mielczarek---donated support for CSS.
+@item
+Saint Xavier---Support for IRIs (RFC 3987).
+
@item
People who provided donations for development---including Brian Gough.
@end itemize
Alexander Kourakos,
Martin Kraemer,
Sami Krank,
+Jay Krell,
@tex
$\Sigma\acute{\iota}\mu o\varsigma\;
\Xi\varepsilon\nu\iota\tau\acute{\epsilon}\lambda\lambda\eta\varsigma$
Matthew J.@: Mellon,
Jordan Mendelson,
Ted Mielczarek,
+Robert Millan,
Lin Zhe Min,
Jan Minar,
Tim Mooney,
Douglas E.@: Wegscheid,
Ralf Wildenhues,
Joshua David Williams,
+Benjamin Wolsey,
+Saint Xavier,
YAMAZAKI Makoto,
Jasmin Zainul,
@iftex
@ifnottex
Bojan Zdrnja,
@end ifnottex
-Kristijan Zimmer.
+Kristijan Zimmer,
+Xin Zou.
Apologies to all who I accidentally left out, and many thanks to all the
subscribers of the Wget mailing list.