Results 1 - 4 of 4 for RobotsTxtHelper (0.08 seconds)

  1. fess-crawler-lasta/src/main/resources/crawler/robotstxt.xml

    <!DOCTYPE components PUBLIC "-//DBFLUTE//DTD LastaDi 1.0//EN"
    	"http://dbflute.org/meta/lastadi10.dtd">
    <components namespace="fessCrawler">
    	<include path="crawler/container.xml" />
    
    	<component name="robotsTxtHelper" class="org.codelibs.fess.crawler.helper.RobotsTxtHelper"
    		instance="prototype">
    	</component>
    Created: Sun Apr 12 03:50:13 GMT 2026
    - Last Modified: Sun Oct 11 02:16:55 GMT 2015
    - 367 bytes
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/Hc4HttpClient.java

         * Constructs a new Hc4HttpClient.
         */
        public Hc4HttpClient() {
            // Default constructor
        }
    
        /** Helper for processing robots.txt files */
        @Resource
        protected RobotsTxtHelper robotsTxtHelper;
    
        /** Helper for managing content length limits */
        @Resource
        protected ContentLengthHelper contentLengthHelper;
    
        /** Helper for determining MIME types */
        @Resource
    Created: Sun Apr 12 03:50:13 GMT 2026
    - Last Modified: Fri Jan 09 23:46:52 GMT 2026
    - 54.4K bytes
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/Hc5HttpClient.java

         * Constructs a new Hc5HttpClient.
         */
        public Hc5HttpClient() {
            // Default constructor
        }
    
        /** Helper for processing robots.txt files */
        @Resource
        protected RobotsTxtHelper robotsTxtHelper;
    
        /** Helper for managing content length limits */
        @Resource
        protected ContentLengthHelper contentLengthHelper;
    
        /** Helper for determining MIME types */
        @Resource
    Created: Sun Apr 12 03:50:13 GMT 2026
    - Last Modified: Sat Jan 31 12:23:29 GMT 2026
    - 62.2K bytes
  4. CLAUDE.md

    ### Key Extractors
    
    `TikaExtractor`, `PdfExtractor`, `MsWordExtractor`, `MsExcelExtractor`, `MsPowerPointExtractor`, `ZipExtractor`, `HtmlExtractor`, `MarkdownExtractor`, `EmlExtractor`
    
    ### Helpers
    
    - **RobotsTxtHelper**: RFC 9309 parsing, user-agent matching, crawl-delay, sitemaps
    - **SitemapsHelper**: Sitemap XML parsing, index handling
    - **MimeTypeHelper**: MIME detection via Tika
    - **EncodingHelper**: Charset detection with BOM
    Created: Sun Apr 12 03:50:13 GMT 2026
    - Last Modified: Thu Mar 12 03:39:20 GMT 2026
    - 8.1K bytes
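
The CLAUDE.md excerpt describes RobotsTxtHelper as handling RFC 9309 parsing, user-agent matching, crawl-delay, and sitemaps. None of the excerpts show its API, so the following is only a rough, self-contained sketch of what those four concerns can look like. The class `RobotsTxtSketch`, its nested `Rules` type, and the methods `parse` and `isAllowed` are hypothetical names invented for illustration and are not taken from fess-crawler. The sketch uses plain prefix matching (no `*` or `$` patterns) and treats `Crawl-delay` and `Sitemap` as common extensions, since RFC 9309 itself only defines `User-agent`, `Allow`, and `Disallow`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch; NOT the fess-crawler RobotsTxtHelper API. */
public class RobotsTxtSketch {

    /** Rules that apply to one crawler. */
    public static final class Rules {
        public final List<String> allow = new ArrayList<>();
        public final List<String> disallow = new ArrayList<>();
        public final List<String> sitemaps = new ArrayList<>();
        public int crawlDelay = 0;   // seconds; 0 means "not specified"
        boolean matched = false;     // true once a group names this crawler

        /** Longest matching prefix wins; on a tie, Allow wins. */
        public boolean isAllowed(String path) {
            int bestAllow = -1;
            int bestDisallow = -1;
            for (String p : allow) {
                if (path.startsWith(p)) bestAllow = Math.max(bestAllow, p.length());
            }
            for (String p : disallow) {
                if (path.startsWith(p)) bestDisallow = Math.max(bestDisallow, p.length());
            }
            return bestAllow >= bestDisallow;
        }
    }

    /** Parses robots.txt text and returns the rules for the given product token. */
    public static Rules parse(String content, String userAgent) {
        Rules specific = new Rules();   // groups naming userAgent explicitly
        Rules wildcard = new Rules();   // groups for "User-agent: *"
        List<String> sitemaps = new ArrayList<>();
        List<Rules> current = new ArrayList<>(); // targets of the current group
        String agent = userAgent.toLowerCase(Locale.ROOT);
        boolean inAgentLines = false;

        for (String raw : content.split("\\R")) {
            int hash = raw.indexOf('#');   // strip trailing comments
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon <= 0) continue;      // blank or malformed line
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            switch (key) {
                case "user-agent":
                    // Consecutive User-agent lines share one group.
                    if (!inAgentLines) {
                        current.clear();
                        inAgentLines = true;
                    }
                    if (value.equals("*")) {
                        current.add(wildcard);
                    } else if (value.toLowerCase(Locale.ROOT).equals(agent)) {
                        specific.matched = true;
                        current.add(specific);
                    }
                    break;
                case "allow":
                case "disallow":
                    inAgentLines = false;
                    if (value.isEmpty()) break;   // empty value restricts nothing
                    for (Rules r : current) {
                        (key.equals("allow") ? r.allow : r.disallow).add(value);
                    }
                    break;
                case "crawl-delay":
                    inAgentLines = false;
                    try {
                        int delay = Integer.parseInt(value);
                        for (Rules r : current) r.crawlDelay = delay;
                    } catch (NumberFormatException ignored) {
                        // Non-integer delays are skipped in this sketch.
                    }
                    break;
                case "sitemap":
                    sitemaps.add(value);   // global, not tied to any group
                    break;
                default:
                    break;                 // unknown records are ignored
            }
        }
        // A group that names the crawler replaces the wildcard group entirely.
        Rules result = specific.matched ? specific : wildcard;
        result.sitemaps.addAll(sitemaps);
        return result;
    }
}
```

Under these assumptions, a caller would invoke `RobotsTxtSketch.parse(text, "fessbot")` once per host and then consult `isAllowed(path)` and `crawlDelay` before each fetch. Returning only the most specific matching group, rather than merging it with the wildcard group, follows the RFC 9309 rule that `*` applies only when no other group matches.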