\1f
Indirect:
wget.info-1: 961
-wget.info-2: 50395
-wget.info-3: 89018
+wget.info-2: 50388
+wget.info-3: 89138
\1f
Tag Table:
(Indirect)
Node: Overview\7f1850
Node: Invoking\7f5024
Node: URL Format\7f5833
-Node: Option Syntax\7f8164
-Node: Basic Startup Options\7f9588
-Node: Logging and Input File Options\7f10288
-Node: Download Options\7f12682
-Node: Directory Options\7f18467
-Node: HTTP Options\7f20945
-Node: FTP Options\7f24541
-Node: Recursive Retrieval Options\7f25734
-Node: Recursive Accept/Reject Options\7f27623
-Node: Recursive Retrieval\7f29706
-Node: Following Links\7f32002
-Node: Relative Links\7f33034
-Node: Host Checking\7f33548
-Node: Domain Acceptance\7f35581
-Node: All Hosts\7f37251
-Node: Types of Files\7f37678
-Node: Directory-Based Limits\7f40128
-Node: FTP Links\7f42768
-Node: Time-Stamping\7f43638
-Node: Time-Stamping Usage\7f45275
-Node: HTTP Time-Stamping Internals\7f46844
-Node: FTP Time-Stamping Internals\7f48314
-Node: Startup File\7f49522
-Node: Wgetrc Location\7f50395
-Node: Wgetrc Syntax\7f51210
-Node: Wgetrc Commands\7f51925
-Node: Sample Wgetrc\7f58763
-Node: Examples\7f63055
-Node: Simple Usage\7f63662
-Node: Advanced Usage\7f66056
-Node: Guru Usage\7f68807
-Node: Various\7f70469
-Node: Proxies\7f70993
-Node: Distribution\7f73758
-Node: Mailing List\7f74108
-Node: Reporting Bugs\7f74807
-Node: Portability\7f76592
-Node: Signals\7f77967
-Node: Appendices\7f78621
-Node: Robots\7f79036
-Node: Introduction to RES\7f80183
-Node: RES Format\7f82076
-Node: User-Agent Field\7f83180
-Node: Disallow Field\7f83944
-Node: Norobots Examples\7f84555
-Node: Security Considerations\7f85509
-Node: Contributors\7f86505
-Node: Copying\7f89018
-Node: Concept Index\7f108181
+Node: Option Syntax\7f8163
+Node: Basic Startup Options\7f9587
+Node: Logging and Input File Options\7f10287
+Node: Download Options\7f12681
+Node: Directory Options\7f18466
+Node: HTTP Options\7f20944
+Node: FTP Options\7f24540
+Node: Recursive Retrieval Options\7f25733
+Node: Recursive Accept/Reject Options\7f27622
+Node: Recursive Retrieval\7f29705
+Node: Following Links\7f32003
+Node: Relative Links\7f33035
+Node: Host Checking\7f33549
+Node: Domain Acceptance\7f35574
+Node: All Hosts\7f37244
+Node: Types of Files\7f37671
+Node: Directory-Based Limits\7f40121
+Node: FTP Links\7f42761
+Node: Time-Stamping\7f43631
+Node: Time-Stamping Usage\7f45268
+Node: HTTP Time-Stamping Internals\7f46837
+Node: FTP Time-Stamping Internals\7f48307
+Node: Startup File\7f49515
+Node: Wgetrc Location\7f50388
+Node: Wgetrc Syntax\7f51203
+Node: Wgetrc Commands\7f51918
+Node: Sample Wgetrc\7f58756
+Node: Examples\7f63048
+Node: Simple Usage\7f63655
+Node: Advanced Usage\7f66049
+Node: Guru Usage\7f68800
+Node: Various\7f70462
+Node: Proxies\7f70986
+Node: Distribution\7f73751
+Node: Mailing List\7f74102
+Node: Reporting Bugs\7f74801
+Node: Portability\7f76586
+Node: Signals\7f77961
+Node: Appendices\7f78615
+Node: Robots\7f79030
+Node: Introduction to RES\7f80177
+Node: RES Format\7f82070
+Node: User-Agent Field\7f83174
+Node: Disallow Field\7f83938
+Node: Norobots Examples\7f84549
+Node: Security Considerations\7f85503
+Node: Contributors\7f86499
+Node: Copying\7f89138
+Node: Concept Index\7f108301
\1f
End Tag Table
ftp://host/directory/file;type=a
Two alternative variants of URL specification are also supported,
-because of historical (hysterical?) reasons and their wide-spreadedness.
+because of historical (hysterical?) reasons and their widespread use.
FTP-only syntax (supported by `NcFTP'):
host:/dir/file
same stands for the foreign server you are mirroring--the more requests
it gets in a rows, the greater is its load.
- Careless retrieving can also fill your file system unctrollably,
+ Careless retrieving can also fill your file system uncontrollably,
which can grind the machine to a halt.
The load can be minimized by lowering the maximum recursion level
The drawback of following the relative links solely is that humans
often tend to mix them with absolute links to the very same host, and
the very same page. In this mode (which is the default mode for
-following links) all URLs the that refer to the same host will be
-retrieved.
+following links) all URLs that refer to the same host will be retrieved.
The problem with this option are the aliases of the hosts and
domains. Thus there is no way for Wget to know that `regoc.srce.hr' and
dealing with the same hosts. Although the results of `gethostbyname'
are cached, it is still a great slowdown, e.g. when dealing with large
indices of home pages on different hosts (because each of the hosts
-must be and DNS-resolved to see whether it just *might* an alias of the
+must be DNS-resolved to see whether it just *might* be an alias of the
starting host).
To avoid the overhead you may use `-nh', which will turn off
things run much faster, but also much less reliable (e.g. `www.srce.hr'
and `regoc.srce.hr' will be flagged as different hosts).
- Note that modern HTTP servers allows one IP address to host several
-"virtual servers", each having its own directory hieratchy. Such
+ Note that modern HTTP servers allow one IP address to host several
+"virtual servers", each having its own directory hierarchy. Such
"servers" are distinguished by their hostnames (all of which point to
the same IP address); for this to work, a client must send a `Host'
header, which is what Wget does. However, in that case Wget *must not*
try to divine a host's "real" address, nor try to use the same hostname
for each access, i.e. `-nh' must be turned on.
- In other words, the `-nh' option must be used to enabling the
+ In other words, the `-nh' option must be used to enable the
retrieval from virtual servers distinguished by their hostnames. As the
number of such server setups grow, the behavior of `-nh' may become the
default in the future.
When downloading material from the web, you will often want to
restrict the retrieval to only certain file types. For example, if you
-are interested in downloading GIFS, you will not be overjoyed to get
-loads of Postscript documents, and vice versa.
+are interested in downloading GIFs, you will not be overjoyed to get
+loads of PostScript documents, and vice versa.
Wget offers two options to deal with this problem. Each option
description lists a short name, a long name, and the equivalent command
The `-A' and `-R' options may be combined to achieve even better
fine-tuning of which files to retrieve. E.g. `wget -A "*zelazny*" -R
.ps' will download all the files having `zelazny' as a part of their
-name, but *not* the postscript files.
+name, but *not* the PostScript files.
Note that these two options do not affect the downloading of HTML
files; Wget must load all the HTMLs to know where to go at
`no_parent = on'
The simplest, and often very useful way of limiting directories is
disallowing retrieval of the links that refer to the hierarchy
- "upper" than the beginning directory, i.e. disallowing ascent to
+ "above" the beginning directory, i.e. disallowing ascent to
the parent directory/directories.
The `--no-parent' option (short `-np') is useful in this case.
Like all GNU utilities, the latest version of Wget can be found at
the master GNU archive site prep.ai.mit.edu, and its mirrors. For
example, Wget 1.5.3+dev can be found at
-`ftp://prep.ai.mit.edu/pub/gnu/wget-1.5.3+dev.tar.gz'
+`ftp://prep.ai.mit.edu/gnu/wget/wget-1.5.3+dev.tar.gz'
\1f
File: wget.info, Node: Mailing List, Next: Reporting Bugs, Prev: Distribution, Up: Various
The description of the norobots standard was written, and is
maintained by Martijn Koster <m.koster@webcrawler.com>. With his
-permission, I contribute a (slightly modified) texified version of the
+permission, I contribute a (slightly modified) TeXified version of the
RES.
* Menu:
The field name is case insensitive.
- Comments can be included in file using UNIX bourne shell conventions:
+ Comments can be included in file using UNIX Bourne shell conventions:
the `#' character is used to indicate that preceding space (if any) and
the remainder of the line up to the line termination is discarded.
Lines containing only a comment are discarded completely, and therefore
* Darko Budor--initial port to Windows.
- * Antonio Rosella--help and suggestions, plust the Italian
+ * Antonio Rosella--help and suggestions, plus the Italian
translation.
* Tomislav Petrovic, Mario Mikocevic--many bug reports and
that make maintenance so much fun:
Tim Adam, Martin Baehr, Dieter Baron, Roger Beeman and the Gurus at
-Cisco, Mark Boyns, John Burden, Wanderlei Cavassin, Gilles Cedoc, Tim
-Charron, Noel Cragg, Kristijan Conkas, Damir Dzeko, Andrew Davison,
-Ulrich Drepper, Marc Duponcheel, Aleksandar Erkalovic, Andy Eskilsson,
-Masashi Fujita, Howard Gayle, Marcel Gerrits, Hans Grobler, Mathieu
-Guillaume, Karl Heuer, Gregor Hoffleit, Erik Magnus Hulthen, Richard
-Huveneers, Simon Josefsson, Mario Juric, Goran Kezunovic, Robert Kleine,
-Fila Kolodny, Alexander Kourakos, Martin Kraemer, Simos KSenitellis,
-Tage Stabell-Kulo, Hrvoje Lacko, Dave Love, Jordan Mendelson, Lin Zhe
-Min, Charlie Negyesi, Andrew Pollock, Steve Pothier, Marin Purgar, Jan
-Prikryl, Keith Refson, Tobias Ringstrom, Juan Jose Rodrigues, Heinz
-Salzmann, Robert Schmidt, Toomas Soome, Sven Sternberger, Markus
-Strasser, Szakacsits Szabolcs, Mike Thomas, Russell Vincent, Douglas E.
-Wegscheid, Jasmin Zainul, Bojan Zdrnja, Kristijan Zimmer.
+Cisco, Dan Berger, Mark Boyns, John Burden, Wanderlei Cavassin, Gilles
+Cedoc, Tim Charron, Noel Cragg, Kristijan Conkas, Andrew Deryabin,
+Damir Dzeko, Andrew Davison, Ulrich Drepper, Marc Duponcheel,
+Aleksandar Erkalovic, Andy Eskilsson, Masashi Fujita, Howard Gayle,
+Marcel Gerrits, Hans Grobler, Mathieu Guillaume, Dan Harkless, Heiko
+Herold, Karl Heuer, HIROSE Masaaki, Gregor Hoffleit, Erik Magnus
+Hulthen, Richard Huveneers, Simon Josefsson, Mario Juric, Goran
+Kezunovic, Robert Kleine, Fila Kolodny, Alexander Kourakos, Martin
+Kraemer, Simos KSenitellis, Hrvoje Lacko, Daniel S. Lewart, Dave Love,
+Jordan Mendelson, Lin Zhe Min, Charlie Negyesi, Andrew Pollock, Steve
+Pothier, Jan Prikryl, Marin Purgar, Keith Refson, Tobias Ringstrom,
+Juan Jose Rodrigues, Edward J. Sabol, Heinz Salzmann, Robert Schmidt,
+Toomas Soome, Tage Stabell-Kulo, Sven Sternberger, Markus Strasser,
+Szakacsits Szabolcs, Mike Thomas, Russell Vincent, Charles G Waldman,
+Douglas E. Wegscheid, Jasmin Zainul, Bojan Zdrnja, Kristijan Zimmer.
Apologies to all who I accidentally left out, and many thanks to all
the subscribers of the Wget mailing list.
+2000-03-02 Dan Harkless <dan-wget@dilvish.speed.net>
+
+ * ftp.c (ftp_loop_internal): Heiko introduced "suggest explicit
+ braces to avoid ambiguous `else'" warnings. Eliminated them.
+
+ * http.c (http_loop): Heiko introduced "suggest explicit
+ braces to avoid ambiguous `else'" warnings. Eliminated them.
+
+ * main.c: Heiko's --wait / --waitretry backwards compatibility
+ code looks to have been totally untested -- automatic variable
+ 'wr' was used without being initialized, and a long int was passed
+ into setval()'s char* val parameter.
+
1999-08-25 Heiko Herold <Heiko.Herold@previnet.it>
* ftp.c: Respect new option waitretry.
retrieval).
Check if we are retrying or not, wait accordingly - HEH */
if (!first_retrieval && (opt.wait || (count && opt.waitretry)))
- if (count)
- if (count<opt.waitretry)
- sleep(count);
+ {
+ if (count)
+ {
+ if (count<opt.waitretry)
+ sleep(count);
+ else
+ sleep(opt.waitretry);
+ }
else
- sleep(opt.waitretry);
- else
- sleep (opt.wait);
+ sleep (opt.wait);
+ }
if (first_retrieval)
first_retrieval = 0;
if (con->st & ON_YOUR_OWN)
retrieval).
Check if we are retrying or not, wait accordingly - HEH */
if (!first_retrieval && (opt.wait || (count && opt.waitretry)))
- if (count)
- if (count<opt.waitretry)
- sleep(count);
+ {
+ if (count)
+ {
+ if (count<opt.waitretry)
+ sleep(count);
+ else
+ sleep(opt.waitretry);
+ }
else
- sleep(opt.waitretry);
- else
- sleep (opt.wait);
+ sleep (opt.wait);
+ }
if (first_retrieval)
first_retrieval = 0;
/* Get the current time string. */
{
char **url, **t;
int i, c, nurl, status, append_to_log;
- int wr;
+ int wr = 0;
static struct option long_options[] =
{
A simple check on the values is not enough, I could have set
wait to n>0 and waitretry to 0 - HEH */
if (opt.wait && !wr)
- setval ("waitretry", opt.wait);
+ {
+ char opt_wait_str[256]; /* bigger than needed buf to prevent overflow */
+
+ sprintf(opt_wait_str, "%ld", opt.wait);
+ setval ("waitretry", opt_wait_str);
+ }
/* Sanity checks. */
if (opt.verbose && opt.quiet)