@contents
@ifnottex
-@node Top
+@node Top, Overview, (dir), (dir)
@top Wget @value{VERSION}
@insertcopying
* Concept Index:: Topics covered by this manual.
@end menu
-@node Overview
+@node Overview, Invoking, Top, Top
@chapter Overview
@cindex overview
@cindex features
file @file{COPYING} that came with GNU Wget, for details).
@end itemize
-@node Invoking
+@node Invoking, Recursive Download, Overview, Top
@chapter Invoking
@cindex invoking
@cindex command line
* Recursive Accept/Reject Options::
@end menu
-@node URL Format
+@node URL Format, Option Syntax, Invoking, Invoking
@section URL Format
@cindex URL
@cindex URL syntax
@c man begin OPTIONS
-@node Option Syntax
+@node Option Syntax, Basic Startup Options, URL Format, Invoking
@section Option Syntax
@cindex option syntax
@cindex syntax of options
@samp{--no-} prefix. This might seem superfluous---if the default for
an affirmative option is to not do something, then why provide a way
to explicitly turn it off? But the startup file may in fact change
-the default. For instance, using @code{follow_ftp = off} in
-@file{.wgetrc} makes Wget @emph{not} follow FTP links by default, and
+the default. For instance, using @code{follow_ftp = on} in
+@file{.wgetrc} makes Wget @emph{follow} FTP links by default, and
using @samp{--no-follow-ftp} is the only way to restore the factory
default from the command line.
-@node Basic Startup Options
+@node Basic Startup Options, Logging and Input File Options, Option Syntax, Invoking
@section Basic Startup Options
@table @samp
@end table
-@node Logging and Input File Options
+@node Logging and Input File Options, Download Options, Basic Startup Options, Invoking
@section Logging and Input File Options
@table @samp
If this function is used, no @sc{url}s need be present on the command
line. If there are @sc{url}s both on the command line and in an input
file, those on the command line will be the first ones to be
-retrieved. The @var{file} need not be an @sc{html} document (but no
-harm if it is)---it is enough if the @sc{url}s are just listed
-sequentially.
+retrieved. If @samp{--force-html} is not specified, then @var{file}
+should consist of a series of URLs, one per line.
However, if you specify @samp{--force-html}, the document will be
regarded as @samp{html}. In that case you may have problems with
@cindex base for relative links in input file
@item -B @var{URL}
@itemx --base=@var{URL}
-Prepends @var{URL} to relative links read from the file specified with
-the @samp{-i} option.
+Resolves relative links using @var{URL} as the point of reference,
+when reading links from an HTML file specified via the
+@samp{-i}/@samp{--input-file} option (together with
+@samp{--force-html}, or when the input file was fetched remotely from
+a server describing it as @sc{html}). This is equivalent to the
+presence of a @code{BASE} tag in the @sc{html} input file, with
+@var{URL} as the value for the @code{href} attribute.
+
+For instance, if you specify @samp{http://foo/bar/a.html} for
+@var{URL}, and Wget reads @samp{../baz/b.html} from the input file, it
+would be resolved to @samp{http://foo/baz/b.html}.
@end table
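
The resolution rule described for @samp{--base} above can be sketched in Python. This is only an illustration under the assumption that Wget follows the generic relative-reference rules, which the standard library's `urljoin` also implements; it is not Wget's own code, and the helper name `resolve_link` is hypothetical.

```python
from urllib.parse import urljoin


def resolve_link(base_url: str, link: str) -> str:
    """Resolve `link` against `base_url`, as a BASE href would.

    Hypothetical helper for illustration; Wget's own resolver is
    written in C and is not shown here.
    """
    return urljoin(base_url, link)


if __name__ == "__main__":
    # The example from the manual text: a relative link read from the input file.
    print(resolve_link("http://foo/bar/a.html", "../baz/b.html"))
    # An absolute link is returned unchanged; the base is ignored.
    print(resolve_link("http://foo/bar/a.html", "http://other/x.html"))
```

The first call prints `http://foo/baz/b.html`, matching the example in the option description. Note that the base is used as a point of reference rather than simply prepended.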
-@node Download Options
+@node Download Options, Directory Options, Logging and Input File Options, Invoking
@section Download Options
@table @samp
cases, the local file will be @dfn{clobbered}, or overwritten, upon
repeated download. In other cases it will be preserved.
-When running Wget without @samp{-N}, @samp{-nc}, @samp{-r}, or @samp{p},
-downloading the same file in the same directory will result in the
-original copy of @var{file} being preserved and the second copy being
-named @samp{@var{file}.1}. If that file is downloaded yet again, the
-third copy will be named @samp{@var{file}.2}, and so on. When
-@samp{-nc} is specified, this behavior is suppressed, and Wget will
-refuse to download newer copies of @samp{@var{file}}. Therefore,
-``@code{no-clobber}'' is actually a misnomer in this mode---it's not
-clobbering that's prevented (as the numeric suffixes were already
-preventing clobbering), but rather the multiple version saving that's
-prevented.
-
-When running Wget with @samp{-r} or @samp{-p}, but without @samp{-N}
-or @samp{-nc}, re-downloading a file will result in the new copy
-simply overwriting the old. Adding @samp{-nc} will prevent this
-behavior, instead causing the original version to be preserved and any
-newer copies on the server to be ignored.
+When running Wget without @samp{-N}, @samp{-nc}, @samp{-r}, or
+@samp{-p}, downloading the same file in the same directory will result
+in the original copy of @var{file} being preserved and the second copy
+being named @samp{@var{file}.1}. If that file is downloaded yet
+again, the third copy will be named @samp{@var{file}.2}, and so on.
+(This is also the behavior with @samp{-nd}, even if @samp{-r} or
+@samp{-p} is in effect.)  When @samp{-nc} is specified, this behavior
+is suppressed, and Wget will refuse to download newer copies of
+@samp{@var{file}}. Therefore, ``@code{no-clobber}'' is actually a
+misnomer in this mode---it's not clobbering that's prevented (as the
+numeric suffixes were already preventing clobbering), but rather the
+multiple version saving that's prevented.
+
+When running Wget with @samp{-r} or @samp{-p}, but without @samp{-N},
+@samp{-nd}, or @samp{-nc}, re-downloading a file will result in the
+new copy simply overwriting the old. Adding @samp{-nc} will prevent
+this behavior, instead causing the original version to be preserved
+and any newer copies on the server to be ignored.
When running Wget with @samp{-N}, with or without @samp{-r} or
@samp{-p}, the decision as to whether or not to download a newer copy
given file, then waiting 2 seconds after the second failure on that
file, up to the maximum number of @var{seconds} you specify. Therefore,
a value of 10 will actually make Wget wait up to (1 + 2 + ... + 10) = 55
-seconds per file.
+seconds per file.
-Note that this option is turned on by default in the global
-@file{wgetrc} file.
+By default, Wget will assume a value of 10 seconds.
@cindex wait, random
@cindex random wait
@item --ask-password
Prompt for a password for each connection established. Cannot be specified
when @samp{--password} is being used, because they are mutually exclusive.
+
+@cindex iri support
+@cindex idn support
+@item --no-iri
+
+Turn off internationalized URI (IRI) support. Use @samp{--iri} to
+turn it on. IRI support is activated by default.
+
+You can set the default state of IRI support using the @code{iri}
+command in @file{.wgetrc}. That setting may be overridden from the
+command line.
+
+@cindex local encoding
+@item --local-encoding=@var{encoding}
+
+Force Wget to use @var{encoding} as the default system encoding. That affects
+how Wget converts URLs specified as arguments from locale to @sc{utf-8} for
+IRI support.
+
+Wget uses the function @code{nl_langinfo()} and then the @code{CHARSET}
+environment variable to get the locale. If it fails, @sc{ascii} is used.
+
+You can set the default local encoding using the @code{local_encoding}
+command in @file{.wgetrc}. That setting may be overridden from the
+command line.
+
+@cindex remote encoding
+@item --remote-encoding=@var{encoding}
+
+Force Wget to use @var{encoding} as the default remote server encoding.
+That affects how Wget converts URIs found in files from remote encoding
+to @sc{utf-8} during a recursive fetch. This option is only useful for
+IRI support, for the interpretation of non-@sc{ascii} characters.
+
+For HTTP, the remote encoding can be found in the HTTP @code{Content-Type}
+header and in the HTML @code{Content-Type http-equiv} meta tag.
+
+You can set the default remote encoding using the @code{remote_encoding}
+command in @file{.wgetrc}. That setting may be overridden from the
+command line.
@end table
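
The arithmetic behind the linear backoff described for the retry wait above (1 second after the first failure, 2 after the second, up to the specified maximum) can be checked with a short Python sketch. The function names are hypothetical, and the schedule is modeled only from the description in the text, not from Wget's source.

```python
def retry_wait_schedule(max_seconds: int) -> list:
    """Return the waits, in seconds, after each successive failure.

    Modeled on the manual's description: 1s, 2s, ..., max_seconds.
    """
    return list(range(1, max_seconds + 1))


def total_retry_wait(max_seconds: int) -> int:
    """Upper bound on the time spent waiting between retries for one file."""
    return sum(retry_wait_schedule(max_seconds))


if __name__ == "__main__":
    print(retry_wait_schedule(10))
    print(total_retry_wait(10))
```

With a maximum of 10 the total comes to 55 seconds, the figure quoted in the text.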
-@node Directory Options
+@node Directory Options, HTTP Options, Download Options, Invoking
@section Directory Options
@table @samp
current directory).
@end table
-@node HTTP Options
+@node HTTP Options, HTTPS (SSL/TLS) Options, Directory Options, Invoking
@section HTTP Options
@table @samp
@cindex POST
@item --post-data=@var{string}
@itemx --post-file=@var{file}
-Use POST as the method for all HTTP requests and send the specified data
-in the request body. @code{--post-data} sends @var{string} as data,
-whereas @code{--post-file} sends the contents of @var{file}. Other than
-that, they work in exactly the same way.
+Use POST as the method for all HTTP requests and send the specified
+data in the request body. @samp{--post-data} sends @var{string} as
+data, whereas @samp{--post-file} sends the contents of @var{file}.
+Other than that, they work in exactly the same way. In particular,
+they @emph{both} expect content of the form @code{key1=value1&key2=value2},
+with percent-encoding for special characters; the only difference is
+that one expects its content as a command-line parameter and the other
+accepts its content from a file. Note that @samp{--post-file} is
+@emph{not} for transmitting files as form attachments: those must
+appear as @code{key=value} data (with appropriate percent-encoding) just
+like everything else. Wget does not currently support
+@code{multipart/form-data} for transmitting POST data; it supports only
+@code{application/x-www-form-urlencoded}. Only one of
+@samp{--post-data} and @samp{--post-file} should be specified.
Please be aware that Wget needs to know the size of the POST data in
advance. Therefore the argument to @code{--post-file} must be a regular
@end table
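
To make the @code{key1=value1&key2=value2} format expected by @samp{--post-data} and @samp{--post-file} concrete, here is a minimal Python sketch that builds such a body with the standard library. The helper name `build_post_body` and the sample field values are assumptions made for illustration; the resulting string is what one would pass to @samp{--post-data} or write into the file given to @samp{--post-file}.

```python
from urllib.parse import urlencode


def build_post_body(fields: dict) -> str:
    """Encode fields as application/x-www-form-urlencoded data.

    Hypothetical helper: special characters are percent-encoded and,
    following the form-encoding convention, spaces become '+'.
    """
    return urlencode(fields)


if __name__ == "__main__":
    body = build_post_body({"key1": "value 1", "key2": "a&b"})
    print(body)  # key1=value+1&key2=a%26b
```

The literal `&` inside a value is encoded as `%26`, so it cannot be mistaken for the separator between fields.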
-@node HTTPS (SSL/TLS) Options
+@node HTTPS (SSL/TLS) Options, FTP Options, HTTP Options, Invoking
@section HTTPS (SSL/TLS) Options
@cindex SSL
systems that support @file{/dev/random}.
@end table
-@node FTP Options
+@node FTP Options, Recursive Retrieval Options, HTTPS (SSL/TLS) Options, Invoking
@section FTP Options
@table @samp
case.
@end table
-@node Recursive Retrieval Options
+@node Recursive Retrieval Options, Recursive Accept/Reject Options, FTP Options, Invoking
@section Recursive Retrieval Options
@table @samp
option to turn it on.
@end table
-@node Recursive Accept/Reject Options
+@node Recursive Accept/Reject Options, , Recursive Retrieval Options, Invoking
@section Recursive Accept/Reject Options
@table @samp
@c man end
-@node Recursive Download
+@node Recursive Download, Following Links, Invoking, Top
@chapter Recursive Download
@cindex recursion
@cindex retrieving
Recursive retrieval should be used with care. Don't say you were not
warned.
-@node Following Links
+@node Following Links, Time-Stamping, Recursive Download, Top
@chapter Following Links
@cindex links
@cindex following links
* FTP Links:: Following FTP links.
@end menu
-@node Spanning Hosts
+@node Spanning Hosts, Types of Files, Following Links, Following Links
@section Spanning Hosts
@cindex spanning hosts
@cindex hosts, spanning
@end table
-@node Types of Files
+@node Types of Files, Directory-Based Limits, Spanning Hosts, Following Links
@section Types of Files
@cindex types of files
This behavior, too, is considered less-than-desirable, and may change
in a future version of Wget.
-@node Directory-Based Limits
+@node Directory-Based Limits, Relative Links, Types of Files, Following Links
@section Directory-Based Limits
@cindex directories
@cindex directory limits
meaningless, as its parent is @samp{/}).
@end table
-@node Relative Links
+@node Relative Links, FTP Links, Directory-Based Limits, Following Links
@section Relative Links
@cindex relative links
This option is probably not very useful and might be removed in a future
release.
-@node FTP Links
+@node FTP Links, , Relative Links, Following Links
@section Following FTP Links
@cindex following ftp links
Also note that followed links to @sc{ftp} directories will not be
retrieved recursively further.
-@node Time-Stamping
+@node Time-Stamping, Startup File, Following Links, Top
@chapter Time-Stamping
@cindex time-stamping
@cindex timestamping
* FTP Time-Stamping Internals::
@end menu
-@node Time-Stamping Usage
+@node Time-Stamping Usage, HTTP Time-Stamping Internals, Time-Stamping, Time-Stamping
@section Time-Stamping Usage
@cindex time-stamping usage
@cindex usage, time-stamping
directory listing with dates in a format that Wget can parse
(@pxref{FTP Time-Stamping Internals}).
-@node HTTP Time-Stamping Internals
+@node HTTP Time-Stamping Internals, FTP Time-Stamping Internals, Time-Stamping Usage, Time-Stamping
@section HTTP Time-Stamping Internals
@cindex http time-stamping
Arguably, @sc{http} time-stamping should be implemented using the
@code{If-Modified-Since} request.
-@node FTP Time-Stamping Internals
+@node FTP Time-Stamping Internals, , HTTP Time-Stamping Internals, Time-Stamping
@section FTP Time-Stamping Internals
@cindex ftp time-stamping
@code{wu-ftpd}), which returns the exact time of the specified file.
Wget may support this command in the future.
-@node Startup File
+@node Startup File, Examples, Time-Stamping, Top
@chapter Startup File
@cindex startup file
@cindex wgetrc
* Sample Wgetrc:: A wgetrc example.
@end menu
-@node Wgetrc Location
+@node Wgetrc Location, Wgetrc Syntax, Startup File, Startup File
@section Wgetrc Location
@cindex wgetrc location
@cindex location of wgetrc
system-wide wgetrc (in @file{/usr/local/etc/wgetrc} by default).
Fascist admins, away!
-@node Wgetrc Syntax
+@node Wgetrc Syntax, Wgetrc Commands, Wgetrc Location, Startup File
@section Wgetrc Syntax
@cindex wgetrc syntax
@cindex syntax of wgetrc
reject =
@end example
-@node Wgetrc Commands
+@node Wgetrc Commands, Sample Wgetrc, Wgetrc Syntax, Startup File
@section Wgetrc Commands
@cindex wgetrc commands
@c #### Document me!
@c
@item base = @var{string}
-Consider relative @sc{url}s in @sc{url} input files forced to be
-interpreted as @sc{html} as being relative to @var{string}---the same as
-@samp{--base=@var{string}}.
+Consider relative @sc{url}s in input files (specified via the
+@samp{input} command or the @samp{--input-file}/@samp{-i} option,
+together with @samp{force_html} or @samp{--force-html})
+as being relative to @var{string}---the same as @samp{--base=@var{string}}.
@item bind_address = @var{address}
Bind to @var{address}, like @samp{--bind-address=@var{address}}.
Specify a comma-separated list of directories you wish to follow when
downloading---the same as @samp{-I @var{string}}.
+@item iri = on/off
+When set to on, enable internationalized URI (IRI) support; the same as
+@samp{--iri}.
+
@item inet4_only = on/off
Force connecting to IPv4 addresses, off by default. You can put this
in the global init file to disable Wget's attempts to resolve and
@item load_cookies = @var{file}
Load cookies from @var{file}. See @samp{--load-cookies @var{file}}.
+@item local_encoding = @var{encoding}
+Force Wget to use @var{encoding} as the default system encoding. See
+@samp{--local-encoding}.
+
@item logfile = @var{file}
Set logfile to @var{file}, the same as @samp{-o @var{file}}.
Follow only relative links---the same as @samp{-L} (@pxref{Relative
Links}).
+@item remote_encoding = @var{encoding}
+Force Wget to use @var{encoding} as the default remote server encoding.
+See @samp{--remote-encoding}.
+
@item remove_listing = on/off
If set to on, remove @sc{ftp} listings downloaded by Wget. Setting it
to off is the same as @samp{--no-remove-listing}.
turned on by default in the global @file{wgetrc}.
@end table
-@node Sample Wgetrc
+@node Sample Wgetrc, , Wgetrc Commands, Startup File
@section Sample Wgetrc
@cindex sample wgetrc
@include sample.wgetrc.munged_for_texi_inclusion
@end example
-@node Examples
+@node Examples, Various, Startup File, Top
@chapter Examples
@cindex examples
* Very Advanced Usage:: The hairy stuff.
@end menu
-@node Simple Usage
+@node Simple Usage, Advanced Usage, Examples, Examples
@section Simple Usage
@itemize @bullet
@end example
@end itemize
-@node Advanced Usage
+@node Advanced Usage, Very Advanced Usage, Simple Usage, Examples
@section Advanced Usage
@itemize @bullet
@end example
@end itemize
-@node Very Advanced Usage
+@node Very Advanced Usage, , Advanced Usage, Examples
@section Very Advanced Usage
@cindex mirroring
@end itemize
@c man end
-@node Various
+@node Various, Appendices, Examples, Top
@chapter Various
@cindex various
* Signals:: Signal-handling performed by Wget.
@end menu
-@node Proxies
+@node Proxies, Distribution, Various, Various
@section Proxies
@cindex proxies
settings @code{proxy_user} and @code{proxy_password} to set the proxy
username and password.
-@node Distribution
+@node Distribution, Web Site, Proxies, Various
@section Distribution
@cindex latest version
Wget @value{VERSION} can be found at
@url{ftp://ftp.gnu.org/pub/gnu/wget/wget-@value{VERSION}.tar.gz}
-@node Web Site
+@node Web Site, Mailing Lists, Distribution, Various
@section Web Site
@cindex web site
information resides at ``The Wget Wgiki'',
@url{http://wget.addictivecode.org/}.
-@node Mailing Lists
+@node Mailing Lists, Internet Relay Chat, Web Site, Various
@section Mailing Lists
@cindex mailing list
@cindex list
@url{http://news.gmane.org/gmane.comp.web.wget.patches}.
@end itemize
-@node Internet Relay Chat
+@node Internet Relay Chat, Reporting Bugs, Mailing Lists, Various
@section Internet Relay Chat
@cindex Internet Relay Chat
@cindex IRC
In addition to the mailing lists, we also have a support channel set up
via IRC at @code{irc.freenode.org}, @code{#wget}. Come check it out!
-@node Reporting Bugs
+@node Reporting Bugs, Portability, Internet Relay Chat, Various
@section Reporting Bugs
@cindex bugs
@cindex reporting bugs
@end enumerate
@c man end
-@node Portability
+@node Portability, Signals, Reporting Bugs, Various
@section Portability
@cindex portability
@cindex operating systems
Vanem; a port to VMS is maintained by Steven Schweda, and is available
at @url{http://antinode.org/}.
-@node Signals
+@node Signals, , Portability, Various
@section Signals
@cindex signal handling
@cindex hangup
Other than that, Wget will not try to interfere with signals in any way.
@kbd{C-c}, @code{kill -TERM} and @code{kill -KILL} should kill it alike.
-@node Appendices
+@node Appendices, Copying this manual, Various, Top
@chapter Appendices
This chapter contains some references I consider useful.
* Contributors:: People who helped.
@end menu
-@node Robot Exclusion
+@node Robot Exclusion, Security Considerations, Appendices, Appendices
@section Robot Exclusion
@cindex robot exclusion
@cindex robots.txt
@file{.wgetrc}. You can achieve the same effect from the command line
using the @code{-e} switch, e.g. @samp{wget -e robots=off @var{url}...}.
-@node Security Considerations
+@node Security Considerations, Contributors, Robot Exclusion, Appendices
@section Security Considerations
@cindex security
me).
@end enumerate
-@node Contributors
+@node Contributors, , Security Considerations, Appendices
@section Contributors
@cindex contributors
@item
Ted Mielczarek---donated support for CSS.
+@item
+Saint Xavier---support for IRIs (RFC 3987).
+
@item
People who provided donations for development---including Brian Gough.
@end itemize
Alexander Kourakos,
Martin Kraemer,
Sami Krank,
+Jay Krell,
@tex
$\Sigma\acute{\iota}\mu o\varsigma\;
\Xi\varepsilon\nu\iota\tau\acute{\epsilon}\lambda\lambda\eta\varsigma$
Matthew J.@: Mellon,
Jordan Mendelson,
Ted Mielczarek,
+Robert Millan,
Lin Zhe Min,
Jan Minar,
Tim Mooney,
Douglas E.@: Wegscheid,
Ralf Wildenhues,
Joshua David Williams,
+Benjamin Wolsey,
+Saint Xavier,
YAMAZAKI Makoto,
Jasmin Zainul,
@iftex
@ifnottex
Bojan Zdrnja,
@end ifnottex
-Kristijan Zimmer.
+Kristijan Zimmer,
+Xin Zou.
Apologies to all who I accidentally left out, and many thanks to all the
subscribers of the Wget mailing list.
-@node Copying this manual
+@node Copying this manual, Concept Index, Appendices, Top
@appendix Copying this manual
@menu
* GNU Free Documentation License:: License for copying this manual.
@end menu
-@node GNU Free Documentation License
+@node GNU Free Documentation License, , Copying this manual, Copying this manual
@appendixsec GNU Free Documentation License
@cindex FDL, GNU Free Documentation License
@include fdl.texi
-@node Concept Index
+@node Concept Index, , Copying this manual, Top
@unnumbered Concept Index
@printindex cp