Results 1 - 3 of 3 for robotsTxtHelper (0.08 seconds)
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/Hc4HttpClient.java
     * Constructs a new Hc4HttpClient.
     */
    public Hc4HttpClient() {
        // Default constructor
    }

    /** Helper for processing robots.txt files */
    @Resource
    protected RobotsTxtHelper robotsTxtHelper;

    /** Helper for managing content length limits */
    @Resource
    protected ContentLengthHelper contentLengthHelper;

    /** Helper for determining MIME types */
    @Resource
Created: Sun Apr 12 03:50:13 GMT 2026 - Last Modified: Fri Jan 09 23:46:52 GMT 2026 - 54.4K bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/Hc5HttpClient.java
     * Constructs a new Hc5HttpClient.
     */
    public Hc5HttpClient() {
        // Default constructor
    }

    /** Helper for processing robots.txt files */
    @Resource
    protected RobotsTxtHelper robotsTxtHelper;

    /** Helper for managing content length limits */
    @Resource
    protected ContentLengthHelper contentLengthHelper;

    /** Helper for determining MIME types */
    @Resource
Created: Sun Apr 12 03:50:13 GMT 2026 - Last Modified: Sat Jan 31 12:23:29 GMT 2026 - 62.2K bytes
CLAUDE.md
### Key Extractors
`TikaExtractor`, `PdfExtractor`, `MsWordExtractor`, `MsExcelExtractor`, `MsPowerPointExtractor`, `ZipExtractor`, `HtmlExtractor`, `MarkdownExtractor`, `EmlExtractor`

### Helpers
- **RobotsTxtHelper**: RFC 9309 parsing, user-agent matching, crawl-delay, sitemaps
- **SitemapsHelper**: Sitemap XML parsing, index handling
- **MimeTypeHelper**: MIME detection via Tika
- **EncodingHelper**: Charset detection with BOM

Created: Sun Apr 12 03:50:13 GMT 2026 - Last Modified: Thu Mar 12 03:39:20 GMT 2026 - 8.1K bytes
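
The CLAUDE.md result describes RobotsTxtHelper as covering user-agent matching, crawl-delay, and sitemaps. To illustrate those ideas only, here is a minimal self-contained sketch. It is not the Fess implementation: the class `RobotsTxtSketch` and every method on it are hypothetical names chosen for this example. It assumes plain prefix rules with longest-match precedence (ties go to Allow), exact case-insensitive user-agent names with a `*` fallback, and an integer `Crawl-delay` in seconds (a common extension, not part of RFC 9309). Wildcards (`*`, `$`) and percent-encoding are ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of robots.txt handling; not the Fess RobotsTxtHelper API. */
final class RobotsTxtSketch {

    /** Rules collected for one user-agent group. */
    static final class Group {
        final List<String> allow = new ArrayList<>();
        final List<String> disallow = new ArrayList<>();
        int crawlDelay = 0; // seconds; 0 means "not specified"
    }

    final Map<String, Group> groups = new LinkedHashMap<>();
    final List<String> sitemaps = new ArrayList<>();

    static RobotsTxtSketch parse(String text) {
        RobotsTxtSketch robots = new RobotsTxtSketch();
        List<Group> current = new ArrayList<>();
        boolean lastWasAgent = false;
        for (String raw : text.split("\\r?\\n")) {
            // Drop comments, then split "key: value" at the first colon.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue;
            }
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (key.equals("sitemap")) {
                robots.sitemaps.add(value); // sitemaps are not tied to a group
                continue;
            }
            if (key.equals("user-agent")) {
                // Consecutive User-agent lines share one group of rules.
                if (!lastWasAgent) {
                    current = new ArrayList<>();
                }
                String agent = value.toLowerCase(Locale.ROOT);
                current.add(robots.groups.computeIfAbsent(agent, k -> new Group()));
                lastWasAgent = true;
                continue;
            }
            lastWasAgent = false;
            for (Group g : current) {
                switch (key) {
                    case "allow":
                        if (!value.isEmpty()) g.allow.add(value);
                        break;
                    case "disallow":
                        if (!value.isEmpty()) g.disallow.add(value);
                        break;
                    case "crawl-delay":
                        try {
                            g.crawlDelay = Integer.parseInt(value);
                        } catch (NumberFormatException e) {
                            // Ignore malformed delays in this sketch.
                        }
                        break;
                    default:
                        break; // unknown directives are skipped
                }
            }
        }
        return robots;
    }

    /** Picks the group named after the agent, falling back to "*". */
    Group groupFor(String userAgent) {
        Group g = groups.get(userAgent.toLowerCase(Locale.ROOT));
        return g != null ? g : groups.get("*");
    }

    /** Longest matching prefix wins; a tie favors Allow. */
    boolean isAllowed(String userAgent, String path) {
        Group g = groupFor(userAgent);
        if (g == null) {
            return true; // no applicable group: nothing is restricted
        }
        return longestPrefix(g.allow, path) >= longestPrefix(g.disallow, path);
    }

    int crawlDelay(String userAgent) {
        Group g = groupFor(userAgent);
        return g == null ? 0 : g.crawlDelay;
    }

    private static int longestPrefix(List<String> rules, String path) {
        int best = -1;
        for (String rule : rules) {
            if (path.startsWith(rule) && rule.length() > best) {
                best = rule.length();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        RobotsTxtSketch robots = parse("User-agent: *\nDisallow: /private/\n"
                + "Allow: /private/public/\nCrawl-delay: 3\n"
                + "Sitemap: https://example.com/sitemap.xml\n");
        System.out.println(robots.isAllowed("ExampleBot", "/private/x"));
        System.out.println(robots.crawlDelay("ExampleBot"));
        System.out.println(robots.sitemaps);
    }
}
```

In the two client results above, the helper is injected as a `@Resource` field, which suggests the HTTP clients consult it before fetching a URL; the sketch keeps parsing and matching in one class purely for brevity.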