Results 1 - 10 of 335 for crawler (0.03 sec)
fess-crawler-lasta/src/main/resources/crawler.xml
    <components namespace="fessCrawler">
        <include path="crawler/container.xml"/>
        <include path="crawler/client.xml"/>
        <include path="crawler/rule.xml"/>
        <include path="crawler/filter.xml"/>
        <include path="crawler/interval.xml"/>
        <include path="crawler/extractor.xml"/>
        <include path="crawler/mimetype.xml"/>
        <include path="crawler/encoding.xml"/>
        <include path="crawler/urlconverter.xml"/>
        <include path="crawler/log.xml"/>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Tue Nov 28 13:40:25 UTC 2017 - 1.7K bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/Crawler.java
     *
     * <p>Example usage:
     * <pre>
     * Crawler crawler = new Crawler();
     * crawler.addUrl("http://example.com/");
     * crawler.execute();
     * crawler.close();
     * </pre>
     */
    public class Crawler implements Runnable, AutoCloseable {
        private static final Logger logger = LogManager.getLogger(Crawler.class);
        /**
         * Service for managing URL queues during crawling.
         */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 14K bytes
README.md
    ```java
    // Create multiple crawler instances
    Crawler crawler1 = container.getComponent("crawler");
    crawler1.setSessionId("session1");
    crawler1.addUrl("https://site1.com");
    Crawler crawler2 = container.getComponent("crawler");
    crawler2.setSessionId("session2");
    crawler2.addUrl("https://site2.com");
    // Execute concurrently
    crawler1.setBackground(true);
    crawler2.setBackground(true);
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes
fess-crawler-lasta/src/test/java/org/codelibs/fess/crawler/CrawlerTest.java
    crawler1.addUrl(url1);
    crawler1.getCrawlerContext().setMaxAccessCount(maxCount);
    crawler1.getCrawlerContext().setNumOfThread(numOfThread);
    final Crawler crawler2 = crawlerContainer.getComponent("crawler");
    crawler2.setSessionId(crawler2.getSessionId() + "2");
    crawler2.setBackground(true);
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sat Sep 06 04:15:37 UTC 2025 - 12.8K bytes
fess-crawler-opensearch/src/test/java/org/codelibs/fess/crawler/CrawlerTest.java
    crawler1.getCrawlerContext().setMaxAccessCount(maxCount);
    crawler1.getCrawlerContext().setNumOfThread(numOfThread);
    final Crawler crawler2 = getComponent(Crawler.class);
    crawler2.setBackground(true);
    ((UrlFilterImpl) crawler2.urlFilter).setIncludeFilteringPattern("$1$2$3.*");
    crawler2.addUrl(url2);
    crawler2.getCrawlerContext().setMaxAccessCount(maxCount);
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sat Sep 06 04:15:37 UTC 2025 - 7.7K bytes
fess-crawler/src/test/java/org/codelibs/fess/crawler/CrawlerTest.java
    crawler1.addUrl(url1);
    crawler1.getCrawlerContext().setMaxAccessCount(maxCount);
    crawler1.getCrawlerContext().setNumOfThread(numOfThread);
    final Crawler crawler2 = container.getComponent("crawler");
    crawler2.setSessionId(crawler2.getSessionId() + "2");
    crawler2.setBackground(true);
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sat Sep 06 04:15:37 UTC 2025 - 19.1K bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/CrawlerThread.java
    import org.codelibs.fess.crawler.client.CrawlerClientFactory;
    import org.codelibs.fess.crawler.container.CrawlerContainer;
    import org.codelibs.fess.crawler.entity.AccessResult;
    import org.codelibs.fess.crawler.entity.RequestData;
    import org.codelibs.fess.crawler.entity.ResponseData;
    import org.codelibs.fess.crawler.entity.UrlQueue;
    import org.codelibs.fess.crawler.exception.ChildUrlsException;
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 20.4K bytes
fess-crawler/src/test/resources/org/codelibs/fess/crawler/helper/robots.txt
    User-agent: BruteBot
    Disallow: /
    Allow: /foo/bar/
    Crawl-delay: 1314000
    # welcome!
    User-agent: Googlebot
    Crawl-delay: 1
    User-agent: *
    Disallow: /private/
    Disallow: /help # disallows /help.html, /help/index.html, etc.
    Allow: /help/faq.html
    Crawl-delay: 3
    User-agent: Crawler
    Disallow: /aaa
    User-agent: Crawler/1.0
    Disallow: /bbb
    User-agent: Crawler/2.0
    Disallow: /ccc
    User-agent: Hoge Crawler
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Oct 11 02:16:55 UTC 2015 - 566 bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/AccessResultImpl.java
     *
     * @see org.codelibs.fess.crawler.entity.AccessResult#getAccessResultDataAsOne()
     */
    @Override
    public AccessResultData<IDTYPE> getAccessResultData() {
        return accessResultData;
    }
    /*
     * (non-Javadoc)
     *
     * @see
     * org.codelibs.fess.crawler.entity.AccessResult#setAccessResultDataAsOne(org.codelibs.fess.crawler.db.exentity.AccessResultData)
     */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 9K bytes
fess-crawler/src/test/java/org/codelibs/fess/crawler/helper/RobotsTxtHelperTest.java
    assertFalse(robotsTxt.allows("/aaa", "Crawler"));
    assertTrue(robotsTxt.allows("/bbb", "Crawler"));
    assertTrue(robotsTxt.allows("/ccc", "Crawler"));
    assertTrue(robotsTxt.allows("/ddd", "Crawler"));
    assertTrue(robotsTxt.allows("/aaa", "Crawler/1.0"));
    assertFalse(robotsTxt.allows("/bbb", "Crawler/1.0"));
    assertTrue(robotsTxt.allows("/ccc", "Crawler/1.0"));
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sat Mar 15 06:52:00 UTC 2025 - 5.9K bytes
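The assertions in RobotsTxtHelperTest.java, read alongside the robots.txt fixture, suggest that rules are selected by the exact user-agent name: "Crawler" is blocked from /aaa only, and "Crawler/1.0" from /bbb only. The following is a minimal, self-contained sketch of that behaviour, not the fess-crawler implementation. The class `RobotsSketch` and its methods are hypothetical names. It assumes one directive per line, honours only `Disallow` prefixes (ignoring `Allow` and `Crawl-delay`), and falls back to the `*` group when no group matches the agent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class; not part of fess-crawler.
class RobotsSketch {
    // Disallowed path prefixes, keyed by lower-cased user-agent name.
    private final Map<String, List<String>> disallow = new LinkedHashMap<>();

    // Parses "User-agent" and "Disallow" lines; everything else is skipped.
    public static RobotsSketch parse(String text) {
        RobotsSketch robots = new RobotsSketch();
        List<String> current = null;
        for (String raw : text.split("\n")) {
            int hash = raw.indexOf('#'); // strip trailing comments
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (key.equals("user-agent")) {
                current = robots.disallow.computeIfAbsent(
                        value.toLowerCase(Locale.ROOT), k -> new ArrayList<>());
            } else if (key.equals("disallow") && current != null && !value.isEmpty()) {
                current.add(value);
            }
        }
        return robots;
    }

    // Exact user-agent match first, then the "*" group, then allow everything.
    public boolean allows(String path, String userAgent) {
        List<String> rules = disallow.get(userAgent.toLowerCase(Locale.ROOT));
        if (rules == null) {
            rules = disallow.getOrDefault("*", List.of());
        }
        for (String prefix : rules) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "User-agent: *",
                "Disallow: /private/",
                "User-agent: Crawler",
                "Disallow: /aaa",
                "User-agent: Crawler/1.0",
                "Disallow: /bbb");
        RobotsSketch robots = parse(text);
        System.out.println(robots.allows("/aaa", "Crawler"));
        System.out.println(robots.allows("/aaa", "Crawler/1.0"));
    }
}
```

Matching on the exact agent string keeps "Crawler" and "Crawler/1.0" as separate groups, which is what the listed assertions expect. A fuller parser would also need `Allow` precedence and `Crawl-delay`.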