non-standard port.
** Wget now supports the robots.txt directives specified in
-<http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html>.
+<http://www.robotstxt.org/wc/norobots-rfc.txt>.
** URL parser has been fixed, especially the infamous overzealous
quoting. Wget no longer dequotes reserved characters, e.g. `%3F' is
Until version 1.8, Wget supported the first version of the standard,
written by Martijn Koster in 1994 and available at
-@url{http://info.webcrawler.com/mak/projects/robots/norobots.html}. As
-of version 1.8, Wget has supported the additional directives specified
-in the internet draft @samp{<draft-koster-robots-00.txt>} titled ``A
-Method for Web Robots Control''. The draft, which has as far as I know
-never made to an @sc{rfc}, is available at
-@url{http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html}.
+@url{http://www.robotstxt.org/wc/norobots.html}. As of version 1.8,
+Wget has supported the additional directives specified in the internet
+draft @samp{<draft-koster-robots-00.txt>} titled ``A Method for Web
+Robots Control''. The draft, which has as far as I know never made to
+an @sc{rfc}, is available at
+@url{http://www.robotstxt.org/wc/norobots-rfc.txt}.
This manual no longer includes the text of the Robot Exclusion Standard.
@end example
This is explained in some detail at
-@url{http://info.webcrawler.com/mak/projects/robots/meta-user.html}.
-Wget supports this method of robot exclusion in addition to the usual
-@file{/robots.txt} exclusion.
+@url{http://www.robotstxt.org/wc/meta-user.html}. Wget supports this
+method of robot exclusion in addition to the usual @file{/robots.txt}
+exclusion.
@node Security Considerations, Contributors, Robots, Appendices
@section Security Considerations
/* The inner matching engine: return non-zero if RECORD_PATH matches
URL_PATH. The rules for matching are described at
- <http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html>,
- section 3.2.2. */
+ <http://www.robotstxt.org/wc/norobots-rfc.txt>, section 3.2.2. */
static int
matches (const char *record_path, const char *url_path)