Results 1 - 10 of 282 for crawlers (0.09 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapFile.java
 * Datetime format.
 *
 * By providing the last modification timestamp, you enable search engine
 * crawlers to retrieve only a subset of the Sitemaps in the index i.e. a
 * crawler may only retrieve Sitemaps that were modified since a certain
 * date. This incremental Sitemap fetching mechanism allows for the rapid
 * discovery of new URLs on very large sites.
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 4.4K bytes - Viewed (1) -
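The incremental fetching described in this comment can be sketched as a filter over sitemap index entries. This is a minimal illustration, not code from fess-crawler: `SitemapEntry`, `modifiedSince`, and the example URLs are hypothetical names, and treating a missing `lastmod` as "always fetch" is an assumption.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IncrementalSitemapSketch {

    // Hypothetical stand-in for one entry of a sitemap index (location + lastmod).
    // It is NOT the SitemapFile class from fess-crawler.
    static final class SitemapEntry {
        final String loc;
        final Instant lastmod; // null when the index gives no lastmod

        SitemapEntry(String loc, Instant lastmod) {
            this.loc = loc;
            this.lastmod = lastmod;
        }
    }

    // Returns the locations of sitemaps modified strictly after 'since'.
    // Assumption: entries without a lastmod are always fetched, because
    // there is no way to tell whether they changed.
    static List<String> modifiedSince(List<SitemapEntry> entries, Instant since) {
        List<String> result = new ArrayList<>();
        for (SitemapEntry e : entries) {
            if (e.lastmod == null || e.lastmod.isAfter(since)) {
                result.add(e.loc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<SitemapEntry> index = Arrays.asList(
                new SitemapEntry("https://example.com/sitemap-a.xml", Instant.parse("2025-01-10T00:00:00Z")),
                new SitemapEntry("https://example.com/sitemap-b.xml", Instant.parse("2025-07-06T02:13:03Z")),
                new SitemapEntry("https://example.com/sitemap-c.xml", null));
        // Only sitemaps changed after the previous crawl need to be retrieved again.
        System.out.println(modifiedSince(index, Instant.parse("2025-07-01T00:00:00Z")));
    }
}
```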
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapUrl.java
 * Please note that the value of this tag is considered a hint and not a
 * command. Even though search engine crawlers may consider this information
 * when making decisions, they may crawl pages marked "hourly" less
 * frequently than that, and they may crawl pages marked "yearly" more
 * frequently than that. Crawlers may periodically crawl pages marked
 * "never" so that they can handle unexpected changes to those pages.
 */
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Nov 13 13:34:36 UTC 2025 - 9.1K bytes - Viewed (0) -
docs/fr/README.md
* [LastaFlute](https://github.com/lastaflute/lastaflute "LastaFlute") : Web framework
* [Lasta Job](https://github.com/lastaflute/lasta-job "Lasta Job") : Job scheduler
* [Fess Crawler](https://github.com/codelibs/fess-crawler "Fess Crawler") : Web crawler
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Tue Nov 11 22:42:32 UTC 2025 - 7.9K bytes - Viewed (0) -
fess-crawler-lasta/src/main/resources/crawler.xml
<components namespace="fessCrawler">
    <include path="crawler/container.xml"/>
    <include path="crawler/client.xml"/>
    <include path="crawler/rule.xml"/>
    <include path="crawler/filter.xml"/>
    <include path="crawler/interval.xml"/>
    <include path="crawler/extractor.xml"/>
    <include path="crawler/mimetype.xml"/>
    <include path="crawler/encoding.xml"/>
    <include path="crawler/urlconverter.xml"/>
    <include path="crawler/log.xml"/>
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Tue Nov 28 13:40:25 UTC 2017 - 1.7K bytes - Viewed (0) -
src/main/resources/mail/crawler.dfmail
/* [Crawler Notification] Crawler notification mail. */
subject: [FESS] Crawler completed: /*pmb.hostname*/
>>>
--- Server Info ---
Host Name: /*pmb.hostname:orElse('Unknown')*/
Job Name: /*pmb.jobname:orElse('Unknown')*/
--- Web/FileSystem Crawler ---
Start Time: /*pmb.webFsCrawlStartTime:orElse('-')*/
End Time: /*pmb.webFsCrawlEndTime:orElse('-')*/
Exec Time: /*pmb.webFsCrawlExecTime:orElse('-')*/ ms
--- Web/FileSystem Indexer ---
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Wed Jan 15 22:05:20 UTC 2020 - 1K bytes - Viewed (0) -
fess-crawler-opensearch/src/test/java/org/codelibs/fess/crawler/CrawlerTest.java
crawler1.getCrawlerContext().setMaxAccessCount(maxCount);
crawler1.getCrawlerContext().setNumOfThread(numOfThread);
final Crawler crawler2 = getComponent(Crawler.class);
crawler2.setBackground(true);
((UrlFilterImpl) crawler2.urlFilter).setIncludeFilteringPattern("$1$2$3.*");
crawler2.addUrl(url2);
crawler2.getCrawlerContext().setMaxAccessCount(maxCount);
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Sep 06 04:15:37 UTC 2025 - 7.7K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/CrawlerContext.java
 * It contains various attributes related to the crawler's state, configuration, and runtime data.
 * This class provides methods to access and modify these attributes, allowing for control and monitoring
 * of the crawler's behavior.
 *
 * <p>
 * The context includes information such as the session ID, active thread count, access count, crawler status,
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 8.9K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/crawler/processor/FessResponseProcessor.java
 */
package org.codelibs.fess.crawler.processor;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.codelibs.fess.crawler.entity.AccessResult;
import org.codelibs.fess.crawler.entity.ResponseData;
import org.codelibs.fess.crawler.entity.ResultData;
import org.codelibs.fess.crawler.processor.impl.DefaultResponseProcessor;
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Jul 17 08:28:31 UTC 2025 - 3.7K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/exception/ExtractException.java
 * governing permissions and limitations under the License.
 */
package org.codelibs.fess.crawler.exception;
/**
 * Exception thrown during the extraction process in the crawler.
 * This exception indicates a failure or error that occurred while extracting content from a crawled resource.
 * It extends {@link org.codelibs.fess.crawler.exception.CrawlerSystemException} and provides constructors
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Mar 15 06:52:00 UTC 2025 - 3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/filter/UrlFilter.java
/**
 * Add an url pattern as a target.
 *
 * @param urlPattern Regular expression that is crawled
 */
void addInclude(String urlPattern);
/**
 * Add an url pattern as a non-target.
 *
 * @param urlPattern Regular expression that is not crawled
 */
void addExclude(String urlPattern);
/**
 * Process an url when it's added as a seed url.
 *
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Mar 15 06:52:00 UTC 2025 - 1.6K bytes - Viewed (0)
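
The include/exclude contract in this interface can be illustrated with a small standalone class. This is a sketch under stated assumptions, not the fess-crawler implementation: only `addInclude` and `addExclude` come from the snippet above, while `UrlFilterSketch`, `isTarget`, the full-match semantics, and the rule that excludes win over includes are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class UrlFilterSketch {
    private final List<Pattern> includes = new ArrayList<>();
    private final List<Pattern> excludes = new ArrayList<>();

    // Register a regular expression for URLs that should be crawled.
    public void addInclude(String urlPattern) {
        includes.add(Pattern.compile(urlPattern));
    }

    // Register a regular expression for URLs that should not be crawled.
    public void addExclude(String urlPattern) {
        excludes.add(Pattern.compile(urlPattern));
    }

    // Hypothetical helper: decides whether a URL is a crawl target.
    // Assumptions: patterns must match the whole URL, an exclude match
    // always rejects, and an empty include list accepts everything else.
    public boolean isTarget(String url) {
        for (Pattern p : excludes) {
            if (p.matcher(url).matches()) {
                return false;
            }
        }
        if (includes.isEmpty()) {
            return true;
        }
        for (Pattern p : includes) {
            if (p.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UrlFilterSketch filter = new UrlFilterSketch();
        filter.addInclude("https://example\\.com/.*");
        filter.addExclude(".*\\.pdf");
        System.out.println(filter.isTarget("https://example.com/docs/index.html"));
        System.out.println(filter.isTarget("https://example.com/docs/manual.pdf"));
    }
}
```

Checking excludes first keeps the decision simple to reason about: a URL that matches both lists is skipped.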