X-Git-Url: http://sjero.net/git/?p=wget;a=blobdiff_plain;f=doc%2Fwget.texi;h=73fc527866990aa7202d821dbb9c12571f57109a;hp=6e0dd0a080c330b9c9956a91e589b79bd62b9656;hb=e55befe5e204177c618bcbb9ea814ea8094bc48b;hpb=47bf277fb5fc7c223601c3cae6646d9121c31553 diff --git a/doc/wget.texi b/doc/wget.texi index 6e0dd0a0..73fc5278 100644 --- a/doc/wget.texi +++ b/doc/wget.texi @@ -396,8 +396,8 @@ the option name; negative options can be negated by omitting the @samp{--no-} prefix. This might seem superfluous---if the default for an affirmative option is to not do something, then why provide a way to explicitly turn it off? But the startup file may in fact change -the default. For instance, using @code{follow_ftp = off} in -@file{.wgetrc} makes Wget @emph{not} follow FTP links by default, and +the default. For instance, using @code{follow_ftp = on} in +@file{.wgetrc} makes Wget @emph{follow} FTP links by default, and using @samp{--no-follow-ftp} is the only way to restore the factory default from the command line. @@ -486,9 +486,8 @@ specified as @var{file}, @sc{url}s are read from the standard input. If this function is used, no @sc{url}s need be present on the command line. If there are @sc{url}s both on the command line and in an input file, those on the command lines will be the first ones to be -retrieved. The @var{file} need not be an @sc{html} document (but no -harm if it is)---it is enough if the @sc{url}s are just listed -sequentially. +retrieved. If @samp{--force-html} is not specified, then @var{file} +should consist of a series of URLs, one per line. However, if you specify @samp{--force-html}, the document will be regarded as @samp{html}. In that case you may have problems with @@ -513,8 +512,17 @@ option. @cindex base for relative links in input file @item -B @var{URL} @itemx --base=@var{URL} -Prepends @var{URL} to relative links read from the file specified with -the @samp{-i} option. 
+Resolves relative links using @var{URL} as the point of reference, +when reading links from an HTML file specified via the +@samp{-i}/@samp{--input-file} option (together with +@samp{--force-html}, or when the input file was fetched remotely from +a server describing it as @sc{html}). This is equivalent to the +presence of a @code{BASE} tag in the @sc{html} input file, with +@var{URL} as the value for the @code{href} attribute. + +For instance, if you specify @samp{http://foo/bar/a.html} for +@var{URL}, and Wget reads @samp{../baz/b.html} from the input file, it +would be resolved to @samp{http://foo/baz/b.html}. @end table @node Download Options, Directory Options, Logging and Input File Options, Invoking @@ -582,23 +590,24 @@ behavior depends on a few options, including @samp{-nc}. In certain cases, the local file will be @dfn{clobbered}, or overwritten, upon repeated download. In other cases it will be preserved. -When running Wget without @samp{-N}, @samp{-nc}, @samp{-r}, or @samp{p}, -downloading the same file in the same directory will result in the -original copy of @var{file} being preserved and the second copy being -named @samp{@var{file}.1}. If that file is downloaded yet again, the -third copy will be named @samp{@var{file}.2}, and so on. When -@samp{-nc} is specified, this behavior is suppressed, and Wget will -refuse to download newer copies of @samp{@var{file}}. Therefore, -``@code{no-clobber}'' is actually a misnomer in this mode---it's not -clobbering that's prevented (as the numeric suffixes were already -preventing clobbering), but rather the multiple version saving that's -prevented. - -When running Wget with @samp{-r} or @samp{-p}, but without @samp{-N} -or @samp{-nc}, re-downloading a file will result in the new copy -simply overwriting the old. Adding @samp{-nc} will prevent this -behavior, instead causing the original version to be preserved and any -newer copies on the server to be ignored. 
+When running Wget without @samp{-N}, @samp{-nc}, @samp{-r}, or +@samp{-p}, downloading the same file in the same directory will result +in the original copy of @var{file} being preserved and the second copy +being named @samp{@var{file}.1}. If that file is downloaded yet +again, the third copy will be named @samp{@var{file}.2}, and so on. +(This is also the behavior with @samp{-nd}, even if @samp{-r} or +@samp{-p} are in effect.) When @samp{-nc} is specified, this behavior +is suppressed, and Wget will refuse to download newer copies of +@samp{@var{file}}. Therefore, ``@code{no-clobber}'' is actually a +misnomer in this mode---it's not clobbering that's prevented (as the +numeric suffixes were already preventing clobbering), but rather the +multiple version saving that's prevented. + +When running Wget with @samp{-r} or @samp{-p}, but without @samp{-N}, +@samp{-nd}, or @samp{-nc}, re-downloading a file will result in the +new copy simply overwriting the old. Adding @samp{-nc} will prevent +this behavior, instead causing the original version to be preserved +and any newer copies on the server to be ignored. When running Wget with @samp{-N}, with or without @samp{-r} or @samp{-p}, the decision as to whether or not to download a newer copy @@ -996,6 +1005,46 @@ options for @sc{http} connections. @item --ask-password Prompt for a password for each connection established. Cannot be specified when @samp{--password} is being used, because they are mutually exclusive. + +@cindex iri support +@cindex idn support +@item --no-iri + +Turn off internationalized URI (IRI) support. Use @samp{--iri} to +turn it on. IRI support is activated by default. + +You can set the default state of IRI support using the @code{iri} +command in @file{.wgetrc}. That setting may be overridden from the +command line. + +@cindex local encoding +@item --local-encoding=@var{encoding} + +Force Wget to use @var{encoding} as the default system encoding. 
That affects +how Wget converts URLs specified as arguments from locale to @sc{utf-8} for +IRI support. + +Wget uses the function @code{nl_langinfo()} and then the @code{CHARSET} +environment variable to get the locale. If it fails, @sc{ascii} is used. + +You can set the default local encoding using the @code{local_encoding} +command in @file{.wgetrc}. That setting may be overridden from the +command line. + +@cindex remote encoding +@item --remote-encoding=@var{encoding} + +Force Wget to use @var{encoding} as the default remote server encoding. +That affects how Wget converts URIs found in files from remote encoding +to @sc{utf-8} during a recursive fetch. This option is only useful for +IRI support, for the interpretation of non-@sc{ascii} characters. + +For HTTP, remote encoding can be found in the HTTP @code{Content-Type} +header and in the HTML @code{Content-Type http-equiv} meta tag. + +You can set the default encoding using the @code{remoteencoding} +command in @file{.wgetrc}. That setting may be overridden from the +command line. @end table @node Directory Options, HTTP Options, Download Options, Invoking @@ -1346,10 +1395,20 @@ not to send the @code{User-Agent} header in @sc{http} requests. @cindex POST @item --post-data=@var{string} @itemx --post-file=@var{file} -Use POST as the method for all HTTP requests and send the specified data -in the request body. @code{--post-data} sends @var{string} as data, -whereas @code{--post-file} sends the contents of @var{file}. Other than -that, they work in exactly the same way. +Use POST as the method for all HTTP requests and send the specified +data in the request body. @samp{--post-data} sends @var{string} as +data, whereas @samp{--post-file} sends the contents of @var{file}. +Other than that, they work in exactly the same way. 
In particular, +they @emph{both} expect content of the form @code{key1=value1&key2=value2}, +with percent-encoding for special characters; the only difference is +that one expects its content as a command-line parameter and the other +accepts its content from a file. Note that @samp{--post-file} is +@emph{not} for transmitting files as form attachments: those must +appear as @code{key=value} data (with appropriate percent-encoding) just +like everything else. Wget does not currently support +@code{multipart/form-data} for transmitting POST data; only +@code{application/x-www-form-urlencoded}. Only one of +@samp{--post-data} and @samp{--post-file} should be specified. Please be aware that Wget needs to know the size of the POST data in advance. Therefore the argument to @code{--post-file} must be a regular @@ -2601,6 +2660,16 @@ Same as @samp{-A}/@samp{-R} (@pxref{Types of Files}). @item add_hostdir = on/off Enable/disable host-prefixed file names. @samp{-nH} disables it. +@item ask_password = on/off +Prompt for a password for each connection established. Cannot be specified +when @samp{--password} is being used, because they are mutually +exclusive. Equivalent to @samp{--ask-password}. + +@item auth_no_challenge = on/off +If this option is given, Wget will send Basic HTTP authentication +information (plaintext username and password) for all requests. See +@samp{--auth-no-challenge}. + @item background = on/off Enable/disable going to background---the same as @samp{-b} (which enables it). @@ -2613,9 +2682,10 @@ Enable/disable saving pre-converted files with the suffix @c #### Document me! @c @item base = @var{string} -Consider relative @sc{url}s in @sc{url} input files forced to be -interpreted as @sc{html} as being relative to @var{string}---the same as -@samp{--base=@var{string}}. 
+Consider relative @sc{url}s in input files (specified via the +@samp{input} command or the @samp{--input-file}/@samp{-i} option, +together with @samp{force_html} or @samp{--force-html}) +as being relative to @var{string}---the same as @samp{--base=@var{string}}. @item bind_address = @var{address} Bind to @var{address}, like the @samp{--bind-address=@var{address}}. @@ -2798,6 +2868,10 @@ Ignore certain @sc{html} tags when doing a recursive retrieval, like Specify a comma-separated list of directories you wish to follow when downloading---the same as @samp{-I @var{string}}. +@item iri = on/off +When set to on, enable internationalized URI (IRI) support; the same as +@samp{--iri}. + @item inet4_only = on/off Force connecting to IPv4 addresses, off by default. You can put this in the global init file to disable Wget's attempts to resolve and @@ -2812,6 +2886,10 @@ or @samp{-6}. @item input = @var{file} Read the @sc{url}s from @var{string}, like @samp{-i @var{file}}. +@item keep_session_cookies = on/off +When specified, causes @samp{save_cookies = on} to also save session +cookies. See @samp{--keep-session-cookies}. + @item limit_rate = @var{rate} Limit the download speed to no more than @var{rate} bytes per second. The same as @samp{--limit-rate=@var{rate}}. @@ -2819,6 +2897,10 @@ The same as @samp{--limit-rate=@var{rate}}. @item load_cookies = @var{file} Load cookies from @var{file}. See @samp{--load-cookies @var{file}}. +@item local_encoding = @var{encoding} +Force Wget to use @var{encoding} as the default system encoding. See +@samp{--local-encoding}. + @item logfile = @var{file} Set logfile to @var{file}, the same as @samp{-o @var{file}}. @@ -2938,6 +3020,10 @@ the @sc{http} spec who got the spelling of ``referrer'' wrong.) Follow only relative links---the same as @samp{-L} (@pxref{Relative Links}). +@item remote_encoding = @var{encoding} +Force Wget to use @var{encoding} as the default remote server encoding. +See @samp{--remote-encoding}. 
+ @item remove_listing = on/off If set to on, remove @sc{ftp} listings downloaded by Wget. Setting it to off is the same as @samp{--no-remove-listing}. @@ -3823,6 +3909,9 @@ Gnulib getpasswd-gnu module. @item Ted Mielczarek---donated support for CSS. +@item +Saint Xavier---Support for IRIs (RFC 3987). + @item People who provided donations for development---including Brian Gough. @end itemize @@ -3934,6 +4023,7 @@ Fila Kolodny, Alexander Kourakos, Martin Kraemer, Sami Krank, +Jay Krell, @tex $\Sigma\acute{\iota}\mu o\varsigma\; \Xi\varepsilon\nu\iota\tau\acute{\epsilon}\lambda\lambda\eta\varsigma$ @@ -3964,6 +4054,7 @@ Aurelien Marchand, Matthew J.@: Mellon, Jordan Mendelson, Ted Mielczarek, +Robert Millan, Lin Zhe Min, Jan Minar, Tim Mooney, @@ -4039,6 +4130,8 @@ Charles G Waldman, Douglas E.@: Wegscheid, Ralf Wildenhues, Joshua David Williams, +Benjamin Wolsey, +Saint Xavier, YAMAZAKI Makoto, Jasmin Zainul, @iftex @@ -4047,7 +4140,8 @@ Bojan @v{Z}drnja, @ifnottex Bojan Zdrnja, @end ifnottex -Kristijan Zimmer. +Kristijan Zimmer, +Xin Zou. Apologies to all who I accidentally left out, and many thanks to all the subscribers of the Wget mailing list.