Results 31 - 34 of 34 for sitemaps (0.07 sec)
README.md
controller.setDefaultIntervalTime(1000); }); ``` ### Sitemap Support ```java // Enable sitemap processing container.singleton("sitemapsRule", SitemapsRule.class, rule -> { rule.addRule("url", ".*sitemap.*"); }); // Add sitemap URL crawler.addUrl("https://example.com/sitemap.xml"); ``` ## Data Access and Storage ### Accessing Crawled Data ```java
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
fess-crawler-lasta/src/main/resources/crawler/rule.xml
<component class="org.codelibs.fess.crawler.processor.impl.SitemapsResponseProcessor"> </component> </property> <postConstruct name="addRule"> <arg>"url"</arg> <arg>".*sitemap.*"</arg> </postConstruct> </component> <component name="fileRule" class="org.codelibs.fess.crawler.rule.impl.RegexRule"> <property name="ruleId">"fileRule"</property> <property name="defaultRule">true</property>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Oct 11 02:16:55 UTC 2015 - 1.5K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/RobotsTxtHelper.java
protected static final Pattern CRAWL_DELAY_RECORD = Pattern.compile("^crawl-delay:\\s*([^\\s]+)\\s*$", Pattern.CASE_INSENSITIVE); /** * Pattern for Sitemap record. */ protected static final Pattern SITEMAP_RECORD = Pattern.compile("^sitemap:\\s*([^\\s]+)\\s*$", Pattern.CASE_INSENSITIVE); /** Whether robots.txt processing is enabled. */ protected boolean enabled = true; /**
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 7.7K bytes - Viewed (0) -
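The SITEMAP_RECORD pattern in the RobotsTxtHelper.java excerpt above can be tried in isolation. The following is a minimal sketch, not the helper's actual parsing code: the class name `SitemapRecordDemo`, the method `extractSitemaps`, and the line-by-line loop with `trim()` are assumptions made for illustration. Only the regular expression itself is copied from the excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex below is taken from the excerpt.
public class SitemapRecordDemo {

    // Same expression and flag as SITEMAP_RECORD in the excerpt.
    static final Pattern SITEMAP_RECORD =
            Pattern.compile("^sitemap:\\s*([^\\s]+)\\s*$", Pattern.CASE_INSENSITIVE);

    // Assumed helper: scans robots.txt text line by line and collects
    // the value captured by group 1 of every matching Sitemap record.
    static List<String> extractSitemaps(String robotsTxt) {
        List<String> urls = new ArrayList<>();
        for (String line : robotsTxt.split("\\R")) {
            Matcher m = SITEMAP_RECORD.matcher(line.trim());
            if (m.find()) {
                urls.add(m.group(1));
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String robotsTxt = "User-agent: *\n"
                + "Disallow: /private/\n"
                + "Sitemap: https://example.com/sitemap.xml\n";
        System.out.println(extractSitemaps(robotsTxt));
    }
}
```

Because the pattern is compiled with `Pattern.CASE_INSENSITIVE`, `SITEMAP:` and `sitemap:` are treated the same way. The capture group accepts any run of non-whitespace characters, so the pattern does not itself check that the value is a well-formed URL.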
fess-crawler/src/main/resources/org/codelibs/fess/crawler/mime/tika-mimetypes.xml
<match value="sitemap:" type="stringignorecase" offset="0"/> <match value="\nuser-agent:" type="stringignorecase" offset="0:1000"/> <match value="\nallow:" type="stringignorecase" offset="0:1000"/> <match value="\ndisallow:" type="stringignorecase" offset="0:1000"/> <match value="\nsitemap:" type="stringignorecase" offset="0:1000"/> </match>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Mar 13 08:18:01 UTC 2025 - 320.1K bytes - Viewed (1)