+2000-08-22 Dan Harkless <dan-wget@dilvish.speed.net>
+
+ * wget.texi (Download Options): --no-clobber's documentation was
+ severely lacking -- ameliorated the situation. Some of the
+ previously-undocumented stuff (like the multiple-file-version
+ numeric-suffixing) that's now mentioned for the first (and only)
+ time in the -nc documentation should probably be mentioned
+ elsewhere, but due to the way that wget.texi's hierarchy is laid
+ out, I had a hard time finding anywhere else appropriate.
+
2000-07-17 Dan Harkless <dan-wget@dilvish.speed.net>
* wget.texi (HTTP Options): Minor clarification in "download a
\1f
Indirect:
wget.info-1: 961
-wget.info-2: 50079
-wget.info-3: 92081
+wget.info-2: 49932
+wget.info-3: 93404
\1f
Tag Table:
(Indirect)
Node: Basic Startup Options\7f9587
Node: Logging and Input File Options\7f10287
Node: Download Options\7f12681
-Node: Directory Options\7f19043
-Node: HTTP Options\7f21521
-Node: FTP Options\7f25426
-Node: Recursive Retrieval Options\7f26619
-Node: Recursive Accept/Reject Options\7f28583
-Node: Recursive Retrieval\7f31481
-Node: Following Links\7f33779
-Node: Relative Links\7f34807
-Node: Host Checking\7f35321
-Node: Domain Acceptance\7f37346
-Node: All Hosts\7f39016
-Node: Types of Files\7f39443
-Node: Directory-Based Limits\7f41893
-Node: FTP Links\7f44533
-Node: Time-Stamping\7f45403
-Node: Time-Stamping Usage\7f47040
-Node: HTTP Time-Stamping Internals\7f48609
-Node: FTP Time-Stamping Internals\7f50079
-Node: Startup File\7f51287
-Node: Wgetrc Location\7f52160
-Node: Wgetrc Syntax\7f52975
-Node: Wgetrc Commands\7f53690
-Node: Sample Wgetrc\7f60972
-Node: Examples\7f65991
-Node: Simple Usage\7f66598
-Node: Advanced Usage\7f68992
-Node: Guru Usage\7f71743
-Node: Various\7f73405
-Node: Proxies\7f73929
-Node: Distribution\7f76694
-Node: Mailing List\7f77045
-Node: Reporting Bugs\7f77744
-Node: Portability\7f79529
-Node: Signals\7f80904
-Node: Appendices\7f81558
-Node: Robots\7f81973
-Node: Introduction to RES\7f83120
-Node: RES Format\7f85013
-Node: User-Agent Field\7f86117
-Node: Disallow Field\7f86881
-Node: Norobots Examples\7f87492
-Node: Security Considerations\7f88446
-Node: Contributors\7f89442
-Node: Copying\7f92081
-Node: Concept Index\7f111244
+Node: Directory Options\7f20366
+Node: HTTP Options\7f22844
+Node: FTP Options\7f26749
+Node: Recursive Retrieval Options\7f27942
+Node: Recursive Accept/Reject Options\7f29906
+Node: Recursive Retrieval\7f32804
+Node: Following Links\7f35102
+Node: Relative Links\7f36130
+Node: Host Checking\7f36644
+Node: Domain Acceptance\7f38669
+Node: All Hosts\7f40339
+Node: Types of Files\7f40766
+Node: Directory-Based Limits\7f43216
+Node: FTP Links\7f45856
+Node: Time-Stamping\7f46726
+Node: Time-Stamping Usage\7f48363
+Node: HTTP Time-Stamping Internals\7f49932
+Node: FTP Time-Stamping Internals\7f51402
+Node: Startup File\7f52610
+Node: Wgetrc Location\7f53483
+Node: Wgetrc Syntax\7f54298
+Node: Wgetrc Commands\7f55013
+Node: Sample Wgetrc\7f62295
+Node: Examples\7f67314
+Node: Simple Usage\7f67921
+Node: Advanced Usage\7f70315
+Node: Guru Usage\7f73066
+Node: Various\7f74728
+Node: Proxies\7f75252
+Node: Distribution\7f78017
+Node: Mailing List\7f78368
+Node: Reporting Bugs\7f79067
+Node: Portability\7f80852
+Node: Signals\7f82227
+Node: Appendices\7f82881
+Node: Robots\7f83296
+Node: Introduction to RES\7f84443
+Node: RES Format\7f86336
+Node: User-Agent Field\7f87440
+Node: Disallow Field\7f88204
+Node: Norobots Examples\7f88815
+Node: Security Considerations\7f89769
+Node: Contributors\7f90765
+Node: Copying\7f93404
+Node: Concept Index\7f112567
\1f
End Tag Table
`-nc'
`--no-clobber'
- Do not clobber existing files when saving to directory hierarchy
- within recursive retrieval of several files. This option is
- *extremely* useful when you wish to continue where you left off
- with retrieval of many files. If the files have the `.html' or
- (yuck) `.htm' suffix, they will be loaded from the local disk, and
- parsed as if they have been retrieved from the Web.
+ If a file is downloaded more than once in the same directory,
+ Wget's behavior depends on a few options, including `-nc'. In
+ certain cases the local file will be "clobbered" (overwritten)
+ upon repeated download; in other cases it will be preserved.
+
+ When running Wget without `-N', `-nc', or `-r', downloading the
+ same file in the same directory will result in the original copy
+ of `FILE' being preserved and the second copy being named
+ `FILE.1'. If that file is downloaded yet again, the third copy
+ will be named `FILE.2', and so on. When `-nc' is specified, this
+ behavior is suppressed, and Wget will refuse to download newer
+ copies of `FILE'. "No-clobber" is therefore something of a
+ misnomer in this mode: what is prevented is not clobbering (the
+ numeric suffixes were already preventing that), but rather the
+ saving of multiple versions.
+
+ When running Wget with `-r', but without `-N' or `-nc',
+ re-downloading a file will result in the new copy simply
+ overwriting the old. Adding `-nc' will prevent this behavior,
+ instead causing the original version to be preserved and any newer
+ copies on the server to be ignored.
+
+ When running Wget with `-N', with or without `-r', the decision as
+ to whether or not to download a newer copy of a file depends on
+ the local and remote timestamp and size of the file (*Note
+ Time-Stamping::). `-nc' may not be specified at the same time as
+ `-N'.
+
+ Note that when `-nc' is specified, files with the suffixes `.html'
+ or (yuck) `.htm' will be loaded from the local disk and parsed as
+ if they had been retrieved from the Web.
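The numeric-suffix naming described for `-nc'-less operation can be sketched as a small shell function. This is an illustration of the documented rule only, not Wget's actual implementation; the `next_name' helper is hypothetical:

```shell
# next_name FILE -- print the name the next download of FILE would be
# saved under when Wget runs without -N, -nc, or -r: FILE itself if no
# local copy exists yet, otherwise FILE.1, FILE.2, ... (the first free
# numeric suffix). A sketch of the documented rule, not Wget's code.
next_name() {
  f=$1
  if [ ! -e "$f" ]; then
    printf '%s\n' "$f"
    return
  fi
  i=1
  while [ -e "$f.$i" ]; do
    i=$((i+1))
  done
  printf '%s\n' "$f.$i"
}
```

With `-nc', by contrast, no new name is chosen at all: the download of an already-present file is simply skipped.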
`-c'
`--continue'
wget --timestamping -r ftp://prep.ai.mit.edu/pub/gnu/
-\1f
-File: wget.info, Node: HTTP Time-Stamping Internals, Next: FTP Time-Stamping Internals, Prev: Time-Stamping Usage, Up: Time-Stamping
-
-HTTP Time-Stamping Internals
-============================
-
- Time-stamping in HTTP is implemented by checking of the
-`Last-Modified' header. If you wish to retrieve the file `foo.html'
-through HTTP, Wget will check whether `foo.html' exists locally. If it
-doesn't, `foo.html' will be retrieved unconditionally.
-
- If the file does exist locally, Wget will first check its local
-time-stamp (similar to the way `ls -l' checks it), and then send a
-`HEAD' request to the remote server, demanding the information on the
-remote file.
-
- The `Last-Modified' header is examined to find which file was
-modified more recently (which makes it "newer"). If the remote file is
-newer, it will be downloaded; if it is older, Wget will give up.(1)
-
- When `--backup-converted' (`-K') is specified in conjunction with
-`-N', server file `X' is compared to local file `X.orig', if extant,
-rather than being compared to local file `X', which will always differ
-if it's been converted by `--convert-links' (`-k').
-
- Arguably, HTTP time-stamping should be implemented using the
-`If-Modified-Since' request.
-
- ---------- Footnotes ----------
-
- (1) As an additional check, Wget will look at the `Content-Length'
-header, and compare the sizes; if they are not the same, the remote
-file will be downloaded no matter what the time-stamp says.
-
resulting derived work is distributed under the terms of a permission
notice identical to this one.
+\1f
+File: wget.info, Node: HTTP Time-Stamping Internals, Next: FTP Time-Stamping Internals, Prev: Time-Stamping Usage, Up: Time-Stamping
+
+HTTP Time-Stamping Internals
+============================
+
+ Time-stamping in HTTP is implemented by checking the
+`Last-Modified' header. If you wish to retrieve the file `foo.html'
+through HTTP, Wget will check whether `foo.html' exists locally. If it
+doesn't, `foo.html' will be retrieved unconditionally.
+
+ If the file does exist locally, Wget will first check its local
+time-stamp (similar to the way `ls -l' checks it), and then send a
+`HEAD' request to the remote server, requesting information about the
+remote file.
+
+ The `Last-Modified' header is examined to find which file was
+modified more recently (which makes it "newer"). If the remote file is
+newer, it will be downloaded; if it is older, Wget will give up.(1)
+
+ When `--backup-converted' (`-K') is specified in conjunction with
+`-N', server file `X' is compared to local file `X.orig', if extant,
+rather than being compared to local file `X', which will always differ
+if it's been converted by `--convert-links' (`-k').
+
+ Arguably, HTTP time-stamping should be implemented using the
+`If-Modified-Since' request.
+
+ ---------- Footnotes ----------
+
+ (1) As an additional check, Wget will look at the `Content-Length'
+header, and compare the sizes; if they are not the same, the remote
+file will be downloaded no matter what the time-stamp says.
+
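The `-N' decision rule above, footnote included, reduces to: fetch when the remote copy is strictly newer, or when the sizes differ. A sketch in shell, with a hypothetical `should_download' helper taking Unix mtimes and byte sizes (not Wget's actual code):

```shell
# should_download LOCAL_MTIME LOCAL_SIZE REMOTE_MTIME REMOTE_SIZE
# Echoes "yes" when the -N rule described above would re-fetch the
# file: the remote copy is strictly newer than the local one, or the
# sizes differ (the Content-Length check from the footnote).
should_download() {
  local_mtime=$1 local_size=$2 remote_mtime=$3 remote_size=$4
  if [ "$remote_mtime" -gt "$local_mtime" ] || \
     [ "$remote_size" -ne "$local_size" ]; then
    echo yes
  else
    echo no
  fi
}
```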
\1f
File: wget.info, Node: FTP Time-Stamping Internals, Prev: HTTP Time-Stamping Internals, Up: Time-Stamping
* bug reports: Reporting Bugs.
* bugs: Reporting Bugs.
* cache: HTTP Options.
+* clobbering, file: Download Options.
* command line: Invoking.
* Content-Length, ignore: HTTP Options.
* continue retrieval: Download Options.
* directory prefix: Directory Options.
* DNS lookup: Host Checking.
* dot style: Download Options.
+* downloading multiple times: Download Options.
* examples: Examples.
* exclude directories: Directory-Based Limits.
* execute wgetrc command: Basic Startup Options.
the documents will be written to standard output. Including this option
automatically sets the number of tries to 1.
+@cindex clobbering, file
+@cindex downloading multiple times
@cindex no-clobber
@item -nc
@itemx --no-clobber
-Do not clobber existing files when saving to directory hierarchy within
-recursive retrieval of several files. This option is @emph{extremely}
-useful when you wish to continue where you left off with retrieval of
-many files. If the files have the @samp{.html} or (yuck) @samp{.htm}
-suffix, they will be loaded from the local disk, and parsed as if they
-have been retrieved from the Web.
+If a file is downloaded more than once in the same directory, Wget's
+behavior depends on a few options, including @samp{-nc}. In certain
+cases the local file will be "clobbered" (overwritten) upon repeated
+download; in other cases it will be preserved.
+
+When running Wget without @samp{-N}, @samp{-nc}, or @samp{-r},
+downloading the same file in the same directory will result in the
+original copy of @samp{@var{file}} being preserved and the second copy
+being named @samp{@var{file}.1}. If that file is downloaded yet again,
+the third copy will be named @samp{@var{file}.2}, and so on. When
+@samp{-nc} is specified, this behavior is suppressed, and Wget will
+refuse to download newer copies of @samp{@var{file}}. "No-clobber" is
+therefore something of a misnomer in this mode: what is prevented is
+not clobbering (the numeric suffixes were already preventing that),
+but rather the saving of multiple versions.
+
+When running Wget with @samp{-r}, but without @samp{-N} or @samp{-nc},
+re-downloading a file will result in the new copy simply overwriting the
+old. Adding @samp{-nc} will prevent this behavior, instead causing the
+original version to be preserved and any newer copies on the server to
+be ignored.
+
+When running Wget with @samp{-N}, with or without @samp{-r}, the
+decision as to whether or not to download a newer copy of a file
+depends on the local and remote timestamp and size of the file
+(@pxref{Time-Stamping}). @samp{-nc} may not be specified at the same
+time as @samp{-N}.
+
+Note that when @samp{-nc} is specified, files with the suffixes
+@samp{.html} or (yuck) @samp{.htm} will be loaded from the local disk
+and parsed as if they had been retrieved from the Web.
@cindex continue retrieval
@item -c