Results 31 - 40 of 63 for Sitemap (0.04 sec)

  1. CLAUDE.md

    extractorFactory.addExtractor("text/html", tikaExtractor, 1);  // Fallback
    ```
    
    ### Helpers
    
    **RobotsTxtHelper**: RFC 9309 parsing, user-agent matching, crawl-delay, sitemaps
    **SitemapsHelper**: Sitemap XML parsing, index handling
    **MimeTypeHelper**: MIME detection via Tika
    **EncodingHelper**: Charset detection with BOM
    **UrlConvertHelper**: URL normalization
    
    ---
    
    ## Development Workflow
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 28 17:31:34 UTC 2025
    - 10.7K bytes
  2. fess-crawler/src/test/java/org/codelibs/fess/crawler/helper/RobotsTxtHelperTest.java

            assertFalse(robotsTxt.allows("/ddd", "Hoge Crawler"));
    
            String[] sitemaps = robotsTxt.getSitemaps();
            assertEquals(2, sitemaps.length);
            assertEquals("http://www.example.com/sitmap.xml", sitemaps[0]);
            assertEquals("http://www.example.net/sitmap.xml", sitemaps[1]);
    
        }
    
        public void testParse_disable() {
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 20.6K bytes
  3. fess-crawler/src/test/java/org/codelibs/fess/crawler/CrawlerTest.java

                        rule.setResponseProcessor(container.getComponent("sitemapsResponseProcessor"));
                        rule.setRuleId("sitemapsRule");
                        rule.addRule("url", ".*sitemap.*");
                    })//
                    .<DefaultResponseProcessor> singleton("defaultResponseProcessor", DefaultResponseProcessor.class, processor -> {
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Tue Nov 11 13:40:14 UTC 2025
    - 25.8K bytes
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/HcHttpClient.java

                        if (robotsTxt != null) {
                            final String[] sitemaps = robotsTxt.getSitemaps();
                            if (sitemaps.length > 0) {
                                crawlerContext.addSitemaps(sitemaps);
                            }
    
                            final RobotsTxt.Directive directive = robotsTxt.getMatchedDirective(userAgent);
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 12:19:14 UTC 2025
    - 53.7K bytes
  5. fess-crawler/src/test/resources/sitemaps/sitemap2.xml.gz

    sitemap2.xml http://www.example.com/sitemap1.xml.gz 2004-10-01T18:23:17+00:00 http://www.example.com/sitemap2.xml.gz 2005-01-01...
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Oct 11 02:16:55 UTC 2015
    - 217 bytes
  6. src/main/java/jcifs/smb1/util/MimeMap.java

        private static final int ST_EXT = 5;
    
        private final byte[] in;
        private int inLen;
    
        /**
         * Creates a new MimeMap instance by loading MIME type mappings from the resource file.
         *
         * @throws IOException if there is an error reading the mime.map resource file
         */
        public MimeMap() throws IOException {
            int n;
    
            in = new byte[IN_SIZE];
    Registered: Sat Dec 20 13:44:44 UTC 2025
    - Last Modified: Sat Aug 16 01:32:48 UTC 2025
    - 5.1K bytes
  7. fess-crawler/src/test/resources/sitemaps/sitemap1.xml.gz

    sitemap1.xml http://www.example.com/ 2005-01-01 monthly 0.8 http://www.example.com/catalog?item=12&desc=vacation_hawaii weekly http://www.example.com/catalog?item=73&desc=vacation_new_zealand 2004-12-23 weekly http://www.example.com/catalog?item=74&desc=vacation_newfoundland 2004-12-23T18:00:15+00:00 0.3 http://www.example.com/catalog?item=83&desc=vacation_usa 2004-11-23...
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Oct 11 02:16:55 UTC 2015
    - 332 bytes
  8. fess-crawler-lasta/src/main/resources/crawler/sitemaps.xml

    Shinsuke Sugaya <******@****.***> 1444529815 +0900
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Oct 11 02:16:55 UTC 2015
    - 365 bytes
  9. fess-crawler/src/test/resources/sitemaps/sitemap1.txt

    Shinsuke Sugaya <******@****.***> 1444529815 +0900
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Oct 11 02:16:55 UTC 2015
    - 273 bytes
  10. src/test/java/jcifs/smb1/util/MimeMapTest.java

                assertEquals("application/pdf", mimeMap.getMimeType("pdf"));
                assertEquals("application/msword", mimeMap.getMimeType("doc"));
                assertEquals("application/vnd.ms-excel", mimeMap.getMimeType("xls"));
                assertEquals("text/html", mimeMap.getMimeType("html"));
                assertEquals("text/html", mimeMap.getMimeType("htm"));
                assertEquals("image/jpeg", mimeMap.getMimeType("jpg"));
    Registered: Sat Dec 20 13:44:44 UTC 2025
    - Last Modified: Thu Aug 14 05:31:44 UTC 2025
    - 9.1K bytes