X-Git-Url: http://sjero.net/git/?p=wget;a=blobdiff_plain;f=doc%2Fwget.texi;h=cced7edda118e268b6f290ae272380179044dbce;hp=aab1a8907df9cd7bbb1fa49e2077594774c486ce;hb=42c78fdd71c311cf96210b709ec0a18ef45ef87f;hpb=c784c334d3ddaeb6c628931eca87056e152181fe diff --git a/doc/wget.texi b/doc/wget.texi index aab1a890..cced7edd 100644 --- a/doc/wget.texi +++ b/doc/wget.texi @@ -20,9 +20,9 @@ @set Wget Wget @c man title Wget The non-interactive network downloader. -@dircategory Network Applications +@dircategory Network applications @direntry -* Wget: (wget). The non-interactive network downloader. +* Wget: (wget). Non-interactive network downloader. @end direntry @copying @@ -30,8 +30,9 @@ This file documents the GNU Wget utility for downloading network data. @c man begin COPYRIGHT -Copyright @copyright{} 1996, 1997, 1998, 1999, 2000, 2001, 2002, -2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. +Copyright @copyright{} 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, +2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, +Inc. @iftex Permission is granted to make and distribute verbatim copies of @@ -63,7 +64,6 @@ Documentation License''. @ignore @c man begin AUTHOR Originally written by Hrvoje Niksic . -Currently maintained by Micah Cowan . @c man end @c man begin SEEALSO This is @strong{not} the complete manual for GNU Wget. @@ -89,17 +89,17 @@ Info entry for @file{wget}. @end ifnottex @menu -* Overview:: Features of Wget. -* Invoking:: Wget command-line arguments. -* Recursive Download:: Downloading interlinked pages. -* Following Links:: The available methods of chasing links. -* Time-Stamping:: Mirroring according to time-stamps. -* Startup File:: Wget's initialization file. -* Examples:: Examples of usage. -* Various:: The stuff that doesn't fit anywhere else. -* Appendices:: Some useful references. -* Copying this manual:: You may give out copies of Wget and of this manual. -* Concept Index:: Topics covered by this manual. 
+* Overview:: Features of Wget. +* Invoking:: Wget command-line arguments. +* Recursive Download:: Downloading interlinked pages. +* Following Links:: The available methods of chasing links. +* Time-Stamping:: Mirroring according to time-stamps. +* Startup File:: Wget's initialization file. +* Examples:: Examples of usage. +* Various:: The stuff that doesn't fit anywhere else. +* Appendices:: Some useful references. +* Copying this manual:: You may give out copies of this manual. +* Concept Index:: Topics covered by this manual. @end menu @node Overview, Invoking, Top, Top @@ -190,7 +190,9 @@ gauge can be customized to your preferences. Most of the features are fully configurable, either through command line options, or via the initialization file @file{.wgetrc} (@pxref{Startup File}). Wget allows you to define @dfn{global} startup files -(@file{/usr/local/etc/wgetrc} by default) for site settings. +(@file{/usr/local/etc/wgetrc} by default) for site settings. You can also +specify the location of a startup file with the --config option. + @ignore @c man begin FILES @@ -235,17 +237,18 @@ command to @file{.wgetrc} (@pxref{Startup File}), or specifying it on the command line. 
@menu -* URL Format:: -* Option Syntax:: -* Basic Startup Options:: -* Logging and Input File Options:: -* Download Options:: -* Directory Options:: -* HTTP Options:: -* HTTPS (SSL/TLS) Options:: -* FTP Options:: -* Recursive Retrieval Options:: -* Recursive Accept/Reject Options:: +* URL Format:: +* Option Syntax:: +* Basic Startup Options:: +* Logging and Input File Options:: +* Download Options:: +* Directory Options:: +* HTTP Options:: +* HTTPS (SSL/TLS) Options:: +* FTP Options:: +* Recursive Retrieval Options:: +* Recursive Accept/Reject Options:: +* Exit Status:: @end menu @node URL Format, Option Syntax, Invoking, Invoking @@ -351,7 +354,7 @@ like: wget -drc @var{URL} @end example -This is a complete equivalent of: +This is completely equivalent to: @example wget -d -r -c @var{URL} @@ -476,6 +479,9 @@ Turn off verbose without being completely quiet (use @samp{-q} for that), which means that error messages and basic information still get printed. +@item --report-speed=@var{type} +Output bandwidth as @var{type}. The only accepted value is @samp{bits}. + @cindex input-file @item -i @var{file} @itemx --input-file=@var{file} @@ -523,6 +529,10 @@ presence of a @code{BASE} tag in the @sc{html} input file, with For instance, if you specify @samp{http://foo/bar/a.html} for @var{URL}, and Wget reads @samp{../baz/b.html} from the input file, it would be resolved to @samp{http://foo/baz/b.html}. + +@cindex specify config +@item --config=@var{FILE} +Specify the location of a startup file you wish to use. @end table @node Download Options, Directory Options, Logging and Input File Options, Invoking @@ -540,10 +550,10 @@ IPs. @cindex retries @cindex tries -@cindex number of retries +@cindex number of tries @item -t @var{number} @itemx --tries=@var{number} -Set number of retries to @var{number}. Specify 0 or @samp{inf} for +Set number of tries to @var{number}. Specify 0 or @samp{inf} for infinite retrying. 
The default is to retry 20 times, with the exception of fatal errors like ``connection refused'' or ``not found'' (404), which are not retried. @@ -578,7 +588,8 @@ some cases where this behavior can actually have some use. Note that a combination with @samp{-k} is only permitted when downloading a single document, as in that case it will just convert all relative URIs to external ones; @samp{-k} makes no sense for -multiple URIs when they're all being downloaded to a single file. +multiple URIs when they're all being downloaded to a single file; +@samp{-k} can be used only when the output is a regular file. @cindex clobbering, file @cindex downloading multiple times @@ -619,6 +630,13 @@ Note that when @samp{-nc} is specified, files with the suffixes @samp{.html} or @samp{.htm} will be loaded from the local disk and parsed as if they had been retrieved from the Web. +@cindex backing up files +@item --backups=@var{backups} +Before (over)writing a file, back up an existing file by adding a +@samp{.1} suffix (@samp{_1} on VMS) to the file name. Such backup +files are rotated to @samp{.2}, @samp{.3}, and so on, up to +@var{backups} (and lost beyond that). + @cindex continue retrieval @cindex incomplete downloads @cindex resume download @@ -704,9 +722,12 @@ different meaning to one dot. With the @code{default} style each dot represents 1K, there are ten dots in a cluster and 50 dots in a line. The @code{binary} style has a more ``computer''-like orientation---8K dots, 16-dots clusters and 48 dots per line (which makes for 384K -lines). The @code{mega} style is suitable for downloading very large +lines). The @code{mega} style is suitable for downloading large files---each dot represents 64K retrieved, there are eight dots in a cluster, and 48 dots on each line (so each line contains 3M). 
+If @code{mega} is not enough then you can use the @code{giga} +style---each dot represents 1M retrieved, there are eight dots in a +cluster, and 32 dots on each line (so each line contains 32M). Note that you can set the default style using the @code{progress} command in @file{.wgetrc}. That setting may be overridden from the @@ -718,6 +739,16 @@ use @samp{--progress=bar:force}. @itemx --timestamping Turn on time-stamping. @xref{Time-Stamping}, for details. +@item --no-use-server-timestamps +Don't set the local file's timestamp by the one on the server. + +By default, when a file is downloaded, its timestamps are set to +match those from the remote file. This allows the use of +@samp{--timestamping} on subsequent invocations of wget. However, it +is sometimes useful to base the local file's timestamp on when it was +actually downloaded; for that purpose, the +@samp{--no-use-server-timestamps} option has been provided. + @cindex server response, print @item -S @itemx --server-response @@ -829,9 +860,7 @@ If you don't want Wget to wait between @emph{every} retrieval, but only between retries of failed downloads, you can use this option. Wget will use @dfn{linear backoff}, waiting 1 second after the first failure on a given file, then waiting 2 seconds after the second failure on that -file, up to the maximum number of @var{seconds} you specify. Therefore, -a value of 10 will actually make Wget wait up to (1 + 2 + ... + 10) = 55 -seconds per file. +file, up to the maximum number of @var{seconds} you specify. By default, Wget will assume a value of 10 seconds. @@ -856,7 +885,7 @@ recommendation to block many unrelated users from a web site due to the actions of one. @cindex proxy -@itemx --no-proxy +@item --no-proxy Don't use proxies, even if the appropriate @code{*_proxy} environment variable is defined. @@ -957,7 +986,7 @@ are outside the range of @sc{ascii} characters (that is, greater than whose encoding does not match the one used locally. 
@cindex IPv6 -@itemx -4 +@item -4 @itemx --inet4-only @itemx -6 @itemx --inet6-only @@ -1063,6 +1092,13 @@ header and in HTML @code{Content-Type http-equiv} meta tag. You can set the default encoding using the @code{remoteencoding} command in @file{.wgetrc}. That setting may be overridden from the command line. + +@cindex unlink +@item --unlink + +Force Wget to unlink an existing file instead of clobbering it. This +option is useful for downloading to a directory with hardlinks. + @end table @node Directory Options, HTTP Options, Download Options, Invoking @@ -1148,8 +1184,9 @@ Use @var{name} as the default file name when it isn't known (i.e., for URLs that end in a slash), instead of @file{index.html}. @cindex .html extension +@cindex .css extension @item -E -@itemx --html-extension +@itemx --adjust-extension If a file of type @samp{application/xhtml+xml} or @samp{text/html} is downloaded and the URL does not end with the regexp @samp{\.[Hh][Tt][Mm][Ll]?}, this option will cause the suffix @samp{.html} @@ -1164,15 +1201,17 @@ Note that filenames changed in this way will be re-downloaded every time you re-mirror a site, because Wget can't tell that the local @file{@var{X}.html} file corresponds to remote URL @samp{@var{X}} (since it doesn't yet know that the URL produces output of type -@samp{text/html} or @samp{application/xhtml+xml}. To prevent this -re-downloading, you must use @samp{-k} and @samp{-K} so that the original -version of the file will be saved as @file{@var{X}.orig} (@pxref{Recursive -Retrieval Options}). +@samp{text/html} or @samp{application/xhtml+xml}). As of version 1.12, Wget will also ensure that any downloaded files of -type @samp{text/css} end in the suffix @samp{.css}. Obviously, this -makes the name @samp{--html-extension} misleading; a better name is -expected to be offered as an alternative in the near future. 
+type @samp{text/css} end in the suffix @samp{.css}, and the option was +renamed from @samp{--html-extension}, to better reflect its new +behavior. The old option name is still acceptable, but should now be +considered deprecated. + +At some point in the future, this option may well be expanded to +include suffixes for other types of content, including content types +that are not parsed by Wget. @cindex http user @cindex http password @@ -1419,7 +1458,7 @@ data, whereas @samp{--post-file} sends the contents of @var{file}. Other than that, they work in exactly the same way. In particular, they @emph{both} expect content of the form @code{key1=value1&key2=value2}, with percent-encoding for special characters; the only difference is -that one expects its content as a command-line paramter and the other +that one expects its content as a command-line parameter and the other accepts its content from a file. In particular, @samp{--post-file} is @emph{not} for transmitting files as form attachments: those must appear as @code{key=value} data (with appropriate percent-coding) just @@ -1428,6 +1467,11 @@ like everything else. Wget does not currently support @code{application/x-www-form-urlencoded}. Only one of @samp{--post-data} and @samp{--post-file} should be specified. +Please note that wget does not require the content to be of the form +@code{key1=value1&key2=value2}, and neither does it test for it. Wget will +simply transmit whatever data is provided to it. Most servers however expect +the POST data to be in the above format when processing HTML Forms. + Please be aware that Wget needs to know the size of the POST data in advance. Therefore the argument to @code{--post-file} must be a regular file; specifying a FIFO or something like @file{/dev/stdin} won't work. @@ -1438,14 +1482,15 @@ use chunked unless it knows it's talking to an HTTP/1.1 server. 
And it can't know that until it receives a response, which in turn requires the request to have been completed -- a chicken-and-egg problem. -Note: if Wget is redirected after the POST request is completed, it -will not send the POST data to the redirected URL. This is because -URLs that process POST often respond with a redirection to a regular -page, which does not desire or accept POST. It is not completely -clear that this behavior is optimal; if it doesn't work out, it might -be changed in the future. +Note: As of version 1.15 if Wget is redirected after the POST request is +completed, its behaviour will depend on the response code returned by the +server. In case of a 301 Moved Permanently, 302 Moved Temporarily or +307 Temporary Redirect, Wget will, in accordance with RFC2616, continue +to send a POST request. +In case a server wants the client to change the Request method upon +redirection, it should send a 303 See Other response code. -This example shows how to log to a server using POST and then proceed to +This example shows how to log in to a server using POST and then proceed to download the desired pages, presumably only accessible to authorized users: @@ -1468,6 +1513,37 @@ them (and neither will browsers) and the @file{cookies.txt} file will be empty. In that case use @samp{--keep-session-cookies} along with @samp{--save-cookies} to force saving of session cookies. +@cindex Other HTTP Methods +@item --method=@var{HTTP-Method} +For the purpose of RESTful scripting, Wget allows sending of other HTTP Methods +without the need to explicitly set them using @samp{--header=Header-Line}. +Wget will use whatever string is passed to it after @samp{--method} as the HTTP +Method to the server. + +@item --body-data=@var{Data-String} +@itemx --body-file=@var{Data-File} +Must be set when additional data needs to be sent to the server along with the +Method specified using @samp{--method}. 
@samp{--body-data} sends @var{Data-String} as +data, whereas @samp{--body-file} sends the contents of @var{Data-File}. Other than that, +they work in exactly the same way. + +Currently, @samp{--body-file} is @emph{not} for transmitting files as a whole. +Wget does not currently support @code{multipart/form-data} for transmitting data; +only @code{application/x-www-form-urlencoded}. In the future, this may be changed +so that wget sends the @samp{--body-file} as a complete file instead of sending its +contents to the server. Please be aware that Wget needs to know the contents of +BODY Data in advance, and hence the argument to @samp{--body-file} should be a +regular file. See @samp{--post-file} for a more detailed explanation. +Only one of @samp{--body-data} and @samp{--body-file} should be specified. + +If Wget is redirected after the request is completed, Wget will +suspend the current method and send a GET request until the redirection +is completed. This is true for all redirection response codes except +307 Temporary Redirect, which is used to explicitly specify that the +request method should @emph{not} change. Another exception is when +the method is set to @code{POST}, in which case the redirection rules +specified under @samp{--post-data} are followed. + @cindex Content-Disposition @item --content-disposition @@ -1480,6 +1556,19 @@ This option is useful for some file-downloading CGI programs that use @code{Content-Disposition} headers to describe what the name of a downloaded file should be. +@cindex Content On Error +@item --content-on-error + +If this is set to on, wget will not skip the content when the server responds +with an HTTP status code that indicates an error. + +@cindex Trust server names +@item --trust-server-names + +If this is set to on, on a redirect the last component of the +redirection URL will be used as the local file name. By default, the +last component of the original URL is used. 
+ @cindex authentication @item --auth-no-challenge @@ -1517,6 +1606,9 @@ buggy SSL server implementations that make it hard for OpenSSL to choose the correct protocol version. Fortunately, such servers are quite rare. +@item --https-only +When in recursive mode, only HTTPS links are followed. + @cindex SSL certificate, check @item --no-check-certificate Don't check the server certificate against the available certificate @@ -1619,6 +1711,36 @@ not used), EGD is never contacted. EGD is not needed on modern Unix systems that support @file{/dev/random}. @end table +@cindex WARC +@table @samp +@item --warc-file=@var{file} +Use @var{file} as the destination WARC file. + +@item --warc-header=@var{string} +Insert @var{string} into the warcinfo record. + +@item --warc-max-size=@var{size} +Set the maximum size of the WARC files to @var{size}. + +@item --warc-cdx +Write CDX index files. + +@item --warc-dedup=@var{file} +Do not store records listed in the CDX file @var{file}. + +@item --no-warc-compression +Do not compress WARC files with GZIP. + +@item --no-warc-digests +Do not calculate SHA1 digests. + +@item --no-warc-keep-log +Do not store the log file in a WARC record. + +@item --warc-tempdir=@var{dir} +Specify the location for temporary files created by the WARC writer. +@end table + @node FTP Options, Recursive Retrieval Options, HTTPS (SSL/TLS) Options, Invoking @section FTP Options @@ -1704,6 +1826,10 @@ in some rare firewall configurations, active FTP actually works when passive FTP doesn't. If you suspect this to be the case, use this option, or set @code{passive_ftp=off} in your init file. +@cindex file permissions +@item --preserve-permissions +Preserve remote file permissions instead of permissions set by umask. + @cindex symbolic links, retrieving @item --retr-symlinks Usually, when retrieving @sc{ftp} directories recursively and a symbolic @@ -1731,12 +1857,12 @@ case. @item -r @itemx --recursive Turn on recursive retrieving. 
@xref{Recursive Download}, for more -details. +details. The default maximum depth is 5. @item -l @var{depth} @itemx --level=@var{depth} Specify recursion maximum depth level @var{depth} (@pxref{Recursive -Download}). The default maximum depth is 5. +Download}). @cindex proxy filling @cindex delete after retrieval @@ -1929,7 +2055,7 @@ If, for whatever reason, you want strict comment parsing, use this option to turn it on. @end table -@node Recursive Accept/Reject Options, , Recursive Retrieval Options, Invoking +@node Recursive Accept/Reject Options, Exit Status, Recursive Retrieval Options, Invoking @section Recursive Accept/Reject Options @table @samp @@ -1941,13 +2067,22 @@ any of the wildcard characters, @samp{*}, @samp{?}, @samp{[} or @samp{]}, appear in an element of @var{acclist} or @var{rejlist}, it will be treated as a pattern, rather than a suffix. +@item --accept-regex @var{urlregex} +@itemx --reject-regex @var{urlregex} +Specify a regular expression to accept or reject the complete URL. + +@item --regex-type @var{regextype} +Specify the regular expression type. Possible types are @samp{posix} or +@samp{pcre}. Note that to be able to use @samp{pcre} type, wget has to be +compiled with libpcre support. + @item -D @var{domain-list} @itemx --domains=@var{domain-list} Set domains to be followed. @var{domain-list} is a comma-separated list of domains. Note that it does @emph{not} turn on @samp{-H}. @item --exclude-domains @var{domain-list} -Specify the domains that are @emph{not} to be followed. +Specify the domains that are @emph{not} to be followed (@pxref{Spanning Hosts}). @cindex follow FTP links @@ -2024,6 +2159,57 @@ This is a useful option, since it guarantees that only the files @c man end +@node Exit Status, , Recursive Accept/Reject Options, Invoking +@section Exit Status + +@c man begin EXITSTATUS + +Wget may return one of several error codes if it encounters problems. + + +@table @asis +@item 0 +No problems occurred. 
+ +@item 1 +Generic error code. + +@item 2 +Parse error---for instance, when parsing command-line options, the +@samp{.wgetrc} or @samp{.netrc}... + +@item 3 +File I/O error. + +@item 4 +Network failure. + +@item 5 +SSL verification failure. + +@item 6 +Username/password authentication failure. + +@item 7 +Protocol errors. + +@item 8 +Server issued an error response. +@end table + + +With the exceptions of 0 and 1, the lower-numbered exit codes take +precedence over higher-numbered ones, when multiple types of errors +are encountered. + +In versions of Wget prior to 1.12, Wget's exit status tended to be +unhelpful and inconsistent. Recursive downloads would virtually always +return 0 (success), regardless of any issues encountered, and +non-recursive fetches only returned the status corresponding to the +most recently-attempted download. + +@c man end + @node Recursive Download, Following Links, Invoking, Top @chapter Recursive Download @cindex recursion @@ -2109,11 +2295,11 @@ Wget possesses several mechanisms that allows you to fine-tune which links it will follow. @menu -* Spanning Hosts:: (Un)limiting retrieval based on host name. -* Types of Files:: Getting only certain files. -* Directory-Based Limits:: Getting only certain directories. -* Relative Links:: Follow relative links only. -* FTP Links:: Following FTP links. +* Spanning Hosts:: (Un)limiting retrieval based on host name. +* Types of Files:: Getting only certain files. +* Directory-Based Limits:: Getting only certain directories. +* Relative Links:: Follow relative links only. +* FTP Links:: Following FTP links. @end menu @node Spanning Hosts, Types of Files, Following Links, Following Links @@ -2194,6 +2380,8 @@ in @file{.wgetrc}. 
@item -A @var{acclist} @itemx --accept @var{acclist} @itemx accept = @var{acclist} +@itemx --accept-regex @var{urlregex} +@itemx accept-regex = @var{urlregex} The argument to @samp{--accept} option is a list of file suffixes or patterns that Wget will download during recursive retrieval. A suffix is the ending part of a file, and consists of ``normal'' letters, @@ -2210,6 +2398,9 @@ a description of how pattern matching works. Of course, any number of suffixes and patterns can be combined into a comma-separated list, and given as an argument to @samp{-A}. +The argument to the @samp{--accept-regex} option is a regular expression which +is matched against the complete URL. + @cindex reject wildcards @cindex reject suffixes @cindex wildcards, reject @@ -2217,6 +2408,8 @@ comma-separated list, and given as an argument to @samp{-A}. @item -R @var{rejlist} @itemx --reject @var{rejlist} @itemx reject = @var{rejlist} +@itemx --reject-regex @var{urlregex} +@itemx reject-regex = @var{urlregex} The @samp{--reject} option works the same way as @samp{--accept}, only its logic is the reverse; Wget will download all files @emph{except} the ones matching the suffixes (or patterns) in the list. @@ -2228,6 +2421,9 @@ Analogously, to download all files except the ones beginning with expansion by the shell. @end table +The argument to the @samp{--reject-regex} option is a regular expression which +is matched against the complete URL. + @noindent The @samp{-A} and @samp{-R} options may be combined to achieve even better fine-tuning of which files to retrieve. E.g. @samp{wget -A @@ -2264,7 +2460,7 @@ ways, all of which can change whether an accept/reject rule matches: If the local file already exists and @samp{--no-directories} was specified, a numeric suffix will be appended to the original name. @item -If @samp{--html-extension} was specified, the local filename will have +If @samp{--adjust-extension} was specified, the local filename might have @samp{.html} appended to it. 
If Wget is invoked with @samp{-E -A.php}, a filename such as @samp{index.php} will match be accepted, but upon download will be named @samp{index.php.html}, which no longer matches, @@ -2449,16 +2645,16 @@ The time-stamping in GNU Wget is turned on using @samp{--timestamping} (@samp{-N}) option, or through @code{timestamping = on} directive in @file{.wgetrc}. With this option, for each file it intends to download, Wget will check whether a local file of the same name exists. If it -does, and the remote file is older, Wget will not download it. +does, and the remote file is not newer, Wget will not download it. If the local file does not exist, or the sizes of the files do not match, Wget will download the remote file no matter what the time-stamps say. @menu -* Time-Stamping Usage:: -* HTTP Time-Stamping Internals:: -* FTP Time-Stamping Internals:: +* Time-Stamping Usage:: +* HTTP Time-Stamping Internals:: +* FTP Time-Stamping Internals:: @end menu @node Time-Stamping Usage, HTTP Time-Stamping Internals, Time-Stamping, Time-Stamping @@ -2600,10 +2796,10 @@ Wget reads @file{.wgetrc} upon startup, recognizing a limited set of commands. @menu -* Wgetrc Location:: Location of various wgetrc files. -* Wgetrc Syntax:: Syntax of wgetrc. -* Wgetrc Commands:: List of available commands. -* Sample Wgetrc:: A wgetrc example. +* Wgetrc Location:: Location of various wgetrc files. +* Wgetrc Syntax:: Syntax of wgetrc. +* Wgetrc Commands:: List of available commands. +* Sample Wgetrc:: A wgetrc example. @end menu @node Wgetrc Location, Wgetrc Syntax, Startup File, Startup File @@ -2696,9 +2892,11 @@ enables it). Enable/disable saving pre-converted files with the suffix @samp{.orig}---the same as @samp{-K} (which enables it). -@c @item backups = @var{number} -@c #### Document me! -@c +@item backups = @var{number} +Use up to @var{number} backups for a file. Backups are rotated by +adding an incremental counter that starts at @samp{1}. The default is +@samp{0}. 
+ @item base = @var{string} Consider relative @sc{url}s in input files (specified via the @samp{input} command or the @samp{--input-file}/@samp{-i} option, @@ -2741,6 +2939,10 @@ Set the connect timeout---the same as @samp{--connect-timeout}. Turn on recognition of the (non-standard) @samp{Content-Disposition} HTTP header---if set to @samp{on}, the same as @samp{--content-disposition}. +@item trust_server_names = on/off +If set to on, use the last component of a redirection URL for the local +file name. + @item continue = on/off If set to on, force continuation of preexistent partially retrieved files. See @samp{-c} before setting it. @@ -2845,10 +3047,12 @@ Turn globbing on/off---the same as @samp{--glob} and @samp{--no-glob}. Define a header for HTTP downloads, like using @samp{--header=@var{string}}. -@item html_extension = on/off +@item adjust_extension = on/off Add a @samp{.html} extension to @samp{text/html} or -@samp{application/xhtml+xml} files without it, or a @samp{.css} -extension to @samp{text/css} files without it, like @samp{-E}. +@samp{application/xhtml+xml} files that lack one, or a @samp{.css} +extension to @samp{text/css} files that lack one, like +@samp{-E}. Previously named @samp{html_extension} (still acceptable, +but deprecated). @item http_keep_alive = on/off Turn the keep-alive feature on or off (defaults to on). Turning it @@ -2954,7 +3158,7 @@ display properly---the same as @samp{-p}. Change setting of passive @sc{ftp}, equivalent to the @samp{--passive-ftp} option. -@itemx password = @var{string} +@item password = @var{string} Specify password @var{string} for both @sc{ftp} and @sc{http} file retrieval. This command can be overridden using the @samp{ftp_password} and @samp{http_password} command for @sc{ftp} and @sc{http} respectively. @@ -3081,6 +3285,10 @@ as @samp{--secure-protocol=@var{string}}. Choose whether or not to print the @sc{http} and @sc{ftp} server responses---the same as @samp{-S}. 
+@item show_all_dns_entries = on/off +When a DNS name is resolved, show all the IP addresses, not just the first +three. + @item span_hosts = on/off Same as @samp{-H}. @@ -3097,6 +3305,10 @@ Set all applicable timeout values to @var{n}, the same as @samp{-T @item timestamping = on/off Turn timestamping on/off. The same as @samp{-N} (@pxref{Time-Stamping}). +@item use_server_timestamps = on/off +If set to @samp{off}, Wget won't set the local file's timestamp by the +one on the server (same as @samp{--no-use-server-timestamps}). + @item tries = @var{n} Set number of retries per @sc{url}---the same as @samp{-t @var{n}}. @@ -3153,9 +3365,9 @@ The examples are divided into three sections loosely based on their complexity. @menu -* Simple Usage:: Simple, basic usage of the program. -* Advanced Usage:: Advanced tips. -* Very Advanced Usage:: The hairy stuff. +* Simple Usage:: Simple, basic usage of the program. +* Advanced Usage:: Advanced tips. +* Very Advanced Usage:: The hairy stuff. @end menu @node Simple Usage, Advanced Usage, Examples, Examples @@ -3403,14 +3615,14 @@ wget -m -k -K -E http://www.gnu.org/ -o /home/me/weeklog This chapter contains all the stuff that could not fit anywhere else. @menu -* Proxies:: Support for proxy servers. -* Distribution:: Getting the latest version. -* Web Site:: GNU Wget's presence on the World Wide Web. -* Mailing Lists:: Wget mailing list for announcements and discussion. -* Internet Relay Chat:: Wget's presence on IRC. -* Reporting Bugs:: How and where to report bugs. -* Portability:: The systems Wget works on. -* Signals:: Signal-handling performed by Wget. +* Proxies:: Support for proxy servers. +* Distribution:: Getting the latest version. +* Web Site:: GNU Wget's presence on the World Wide Web. +* Mailing Lists:: Wget mailing list for announcements and discussion. +* Internet Relay Chat:: Wget's presence on IRC. +* Reporting Bugs:: How and where to report bugs. +* Portability:: The systems Wget works on. 
+* Signals:: Signal-handling performed by Wget. @end menu @node Proxies, Distribution, Various, Various @@ -3428,34 +3640,36 @@ internal networks from the rest of Internet. In order to obtain information from the Web, their users connect and retrieve remote data using an authorized proxy. +@c man begin ENVIRONMENT Wget supports proxies for both @sc{http} and @sc{ftp} retrievals. The standard way to specify proxy location, which Wget recognizes, is using the following environment variables: -@table @code +@table @env @item http_proxy @itemx https_proxy -If set, the @code{http_proxy} and @code{https_proxy} variables should +If set, the @env{http_proxy} and @env{https_proxy} variables should contain the @sc{url}s of the proxies for @sc{http} and @sc{https} connections respectively. @item ftp_proxy This variable should contain the @sc{url} of the proxy for @sc{ftp} -connections. It is quite common that @code{http_proxy} and -@code{ftp_proxy} are set to the same @sc{url}. +connections. It is quite common that @env{http_proxy} and +@env{ftp_proxy} are set to the same @sc{url}. @item no_proxy This variable should contain a comma-separated list of domain extensions proxy should @emph{not} be used for. For instance, if the value of -@code{no_proxy} is @samp{.mit.edu}, proxy will not be used to retrieve +@env{no_proxy} is @samp{.mit.edu}, proxy will not be used to retrieve documents from MIT. @end table +@c man end In addition to the environment variables, proxy location and settings may be specified from within Wget itself. @table @samp -@itemx --no-proxy +@item --no-proxy @itemx proxy = on/off This option and the corresponding command may be used to suppress the use of proxy, even if the appropriate environment variables are set. @@ -3692,9 +3906,9 @@ Other than that, Wget will not try to interfere with signals in any way. This chapter contains some references I consider useful. @menu -* Robot Exclusion:: Wget's support for RES. 
-* Security Considerations:: Security with Wget. -* Contributors:: People who helped. +* Robot Exclusion:: Wget's support for RES. +* Security Considerations:: Security with Wget. +* Contributors:: People who helped. @end menu @node Robot Exclusion, Security Considerations, Appendices, Appendices @@ -3819,9 +4033,8 @@ me). GNU Wget was written by Hrvoje Nik@v{s}i@'{c} @email{hniksic@@xemacs.org}, @end iftex @ifnottex -GNU Wget was written by Hrvoje Niksic @email{hniksic@@xemacs.org}, +GNU Wget was written by Hrvoje Niksic @email{hniksic@@xemacs.org}. @end ifnottex -and it is currently maintained by Micah Cowan @email{micah@@cowan.name}. However, the development of Wget could never have gone as far as it has, were it not for the help of many people, either with bug reports, feature proposals,