@set Wget Wget
@c man title Wget The non-interactive network downloader.
-@dircategory Network Applications
+@dircategory Network applications
@direntry
-* Wget: (wget). The non-interactive network downloader.
+* Wget: (wget). Non-interactive network downloader.
@end direntry
@copying
@c man begin COPYRIGHT
Copyright @copyright{} 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
-2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation,
+Inc.
@iftex
Permission is granted to make and distribute verbatim copies of
@ignore
@c man begin AUTHOR
Originally written by Hrvoje Niksic <hniksic@xemacs.org>.
-Currently maintained by Micah Cowan <micah@cowan.name>.
@c man end
@c man begin SEEALSO
This is @strong{not} the complete manual for GNU Wget.
Most of the features are fully configurable, either through command line
options, or via the initialization file @file{.wgetrc} (@pxref{Startup
File}). Wget allows you to define @dfn{global} startup files
-(@file{/usr/local/etc/wgetrc} by default) for site settings.
+(@file{/usr/local/etc/wgetrc} by default) for site settings.  You can also
+specify the location of a startup file with the @samp{--config} option.
+
@ignore
@c man begin FILES
that), which means that error messages and basic information still get
printed.
+@item --report-speed=@var{type}
+Output bandwidth as @var{type}. The only accepted value is @samp{bits}.
+
@cindex input-file
@item -i @var{file}
@itemx --input-file=@var{file}
For instance, if you specify @samp{http://foo/bar/a.html} for
@var{URL}, and Wget reads @samp{../baz/b.html} from the input file, it
would be resolved to @samp{http://foo/baz/b.html}.
+
+@cindex specify config
+@item --config=@var{FILE}
+Specify the location of a startup file you wish to use.
@end table
@node Download Options, Directory Options, Logging and Input File Options, Invoking
you re-mirror a site, because Wget can't tell that the local
@file{@var{X}.html} file corresponds to remote URL @samp{@var{X}} (since
it doesn't yet know that the URL produces output of type
-@samp{text/html} or @samp{application/xhtml+xml}. To prevent this
-re-downloading, you must use @samp{-k} and @samp{-K} so that the original
-version of the file will be saved as @file{@var{X}.orig} (@pxref{Recursive
-Retrieval Options}).
+@samp{text/html} or @samp{application/xhtml+xml}).
As of version 1.12, Wget will also ensure that any downloaded files of
type @samp{text/css} end in the suffix @samp{.css}, and the option was
Other than that, they work in exactly the same way. In particular,
they @emph{both} expect content of the form @code{key1=value1&key2=value2},
with percent-encoding for special characters; the only difference is
-that one expects its content as a command-line paramter and the other
+that one expects its content as a command-line parameter and the other
accepts its content from a file. In particular, @samp{--post-file} is
@emph{not} for transmitting files as form attachments: those must
appear as @code{key=value} data (with appropriate percent-coding) just
@code{Content-Disposition} headers to describe what the name of a
downloaded file should be.
+@cindex Content On Error
+@item --content-on-error
+
+If this is set to on, Wget will not skip the content when the server responds
+with an @sc{http} status code that indicates an error.
+
@cindex Trust server names
@item --trust-server-names
systems that support @file{/dev/random}.
@end table
+@cindex WARC
+@table @samp
+@item --warc-file=@var{file}
+Use @var{file} as the destination WARC file.
+
+@item --warc-header=@var{string}
+Insert @var{string} into the warcinfo record.
+
+@item --warc-max-size=@var{size}
+Set the maximum size of the WARC files to @var{size}.
+
+@item --warc-cdx
+Write CDX index files.
+
+@item --warc-dedup=@var{file}
+Do not store records listed in this CDX file.
+
+@item --no-warc-compression
+Do not compress WARC files with GZIP.
+
+@item --no-warc-digests
+Do not calculate SHA1 digests.
+
+@item --no-warc-keep-log
+Do not store the log file in a WARC record.
+
+@item --warc-tempdir=@var{dir}
+Specify the location for temporary files created by the WARC writer.
+@end table
+
@node FTP Options, Recursive Retrieval Options, HTTPS (SSL/TLS) Options, Invoking
@section FTP Options
@item -r
@itemx --recursive
Turn on recursive retrieving. @xref{Recursive Download}, for more
-details.
+details. The default maximum depth is 5.
@item -l @var{depth}
@itemx --level=@var{depth}
Specify recursion maximum depth level @var{depth} (@pxref{Recursive
-Download}). The default maximum depth is 5.
+Download}).
@cindex proxy filling
@cindex delete after retrieval
@item -A @var{acclist}
@itemx --accept @var{acclist}
@itemx accept = @var{acclist}
+@itemx --accept-regex @var{urlregex}
+@itemx accept-regex = @var{urlregex}
The argument to @samp{--accept} option is a list of file suffixes or
patterns that Wget will download during recursive retrieval. A suffix
is the ending part of a file, and consists of ``normal'' letters,
Of course, any number of suffixes and patterns can be combined into a
comma-separated list, and given as an argument to @samp{-A}.
+The argument to the @samp{--accept-regex} option is a regular expression
+that is matched against the complete URL.
+
@cindex reject wildcards
@cindex reject suffixes
@cindex wildcards, reject
@item -R @var{rejlist}
@itemx --reject @var{rejlist}
@itemx reject = @var{rejlist}
+@itemx --reject-regex @var{urlregex}
+@itemx reject-regex = @var{urlregex}
The @samp{--reject} option works the same way as @samp{--accept}, only
its logic is the reverse; Wget will download all files @emph{except} the
ones matching the suffixes (or patterns) in the list.
expansion by the shell.
@end table
+The argument to the @samp{--reject-regex} option is a regular expression
+that is matched against the complete URL.
+
@noindent
The @samp{-A} and @samp{-R} options may be combined to achieve even
better fine-tuning of which files to retrieve. E.g. @samp{wget -A
Choose whether or not to print the @sc{http} and @sc{ftp} server
responses---the same as @samp{-S}.
+@item show_all_dns_entries = on/off
+When a DNS name is resolved, show all the IP addresses, not just the first
+three.
+
@item span_hosts = on/off
Same as @samp{-H}.
information from the Web, their users connect and retrieve remote data
using an authorized proxy.
+@c man begin ENVIRONMENT
Wget supports proxies for both @sc{http} and @sc{ftp} retrievals. The
standard way to specify proxy location, which Wget recognizes, is using
the following environment variables:
-@table @code
+@table @env
@item http_proxy
@itemx https_proxy
-If set, the @code{http_proxy} and @code{https_proxy} variables should
+If set, the @env{http_proxy} and @env{https_proxy} variables should
contain the @sc{url}s of the proxies for @sc{http} and @sc{https}
connections respectively.
@item ftp_proxy
This variable should contain the @sc{url} of the proxy for @sc{ftp}
-connections. It is quite common that @code{http_proxy} and
-@code{ftp_proxy} are set to the same @sc{url}.
+connections. It is quite common that @env{http_proxy} and
+@env{ftp_proxy} are set to the same @sc{url}.
@item no_proxy
This variable should contain a comma-separated list of domain extensions
proxy should @emph{not} be used for. For instance, if the value of
-@code{no_proxy} is @samp{.mit.edu}, proxy will not be used to retrieve
+@env{no_proxy} is @samp{.mit.edu}, proxy will not be used to retrieve
documents from MIT.
@end table
+@c man end
In addition to the environment variables, proxy location and settings
may be specified from within Wget itself.
GNU Wget was written by Hrvoje Nik@v{s}i@'{c} @email{hniksic@@xemacs.org},
@end iftex
@ifnottex
-GNU Wget was written by Hrvoje Niksic @email{hniksic@@xemacs.org},
+GNU Wget was written by Hrvoje Niksic @email{hniksic@@xemacs.org}.
@end ifnottex
-and it is currently maintained by Micah Cowan @email{micah@@cowan.name}.
However, the development of Wget could never have gone as far as it has, were
it not for the help of many people, either with bug reports, feature proposals,