crawler_ - Code Search

fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapFile.java

     * Datetime format.
     *
     * By providing the last modification timestamp, you enable search engine
     * crawlers to retrieve only a subset of the Sitemaps in the index, i.e., a
     * crawler may only retrieve Sitemaps that were modified since a certain
     * date. This incremental Sitemap fetching mechanism allows for the rapid
     * discovery of new URLs on very large sites.
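
The incremental fetching described in this comment can be sketched roughly as follows. This is a minimal illustration, not code from fess-crawler: `SitemapEntry`, `modifiedSince`, and the example URLs are hypothetical names, and the sketch assumes every `lastmod` value carries a full date, time, and UTC offset (the W3C Datetime format also permits date-only values, which this sketch does not parse).

```java
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.List;

public class IncrementalSitemapFetch {

    /** Hypothetical stand-in for one sitemap entry of a sitemap index. */
    static final class SitemapEntry {
        final String loc;
        final String lastmod; // assumed: full date-time with offset, or null

        SitemapEntry(String loc, String lastmod) {
            this.loc = loc;
            this.lastmod = lastmod;
        }
    }

    /**
     * Returns the locations worth re-fetching: sitemaps modified after the
     * given instant, plus sitemaps without a lastmod (staleness unknown).
     */
    static List<String> modifiedSince(List<SitemapEntry> index, OffsetDateTime since) {
        List<String> result = new ArrayList<>();
        for (SitemapEntry entry : index) {
            if (entry.lastmod == null
                    || OffsetDateTime.parse(entry.lastmod).isAfter(since)) {
                result.add(entry.loc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<SitemapEntry> index = new ArrayList<>();
        index.add(new SitemapEntry("https://example.com/sitemap1.xml", "2024-01-10T00:00:00+00:00"));
        index.add(new SitemapEntry("https://example.com/sitemap2.xml", "2024-03-05T12:30:00+00:00"));
        index.add(new SitemapEntry("https://example.com/sitemap3.xml", null));

        OffsetDateTime lastCrawl = OffsetDateTime.parse("2024-02-01T00:00:00+00:00");
        // Prints [https://example.com/sitemap2.xml, https://example.com/sitemap3.xml]
        System.out.println(modifiedSince(index, lastCrawl));
    }
}
```

Treating a missing `lastmod` as "fetch anyway" is a design choice of this sketch; a crawler could equally fall back to its own schedule for such entries.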

Registered: Sun Nov 10 03:50:12 UTC 2024

- Last Modified: Thu Feb 22 01:36:27 UTC 2024

- 2.7K bytes

github.com/codelibs/fess-crawler

fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapUrl.java

     * Please note that the value of this tag is considered a hint and not a
     * command. Even though search engine crawlers may consider this information
     * when making decisions, they may crawl pages marked "hourly" less
     * frequently than that, and they may crawl pages marked "yearly" more
     * frequently than that. Crawlers may periodically crawl pages marked
     * "never" so that they can handle unexpected changes to those pages.
     */
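
One way to read "a hint and not a command" is that a crawler maps the change-frequency value to a suggested interval and then clamps it to limits of its own. The sketch below illustrates that idea only; `ChangeFreqHint`, `suggestedInterval`, `effectiveInterval`, and the chosen day counts for "monthly" and "yearly" are assumptions, not fess-crawler behavior.

```java
import java.time.Duration;
import java.util.Locale;

public class ChangeFreqHint {

    /**
     * Nominal revisit interval suggested by a change-frequency value.
     * Returns null for "never", unknown values, and null input, meaning
     * the hint gives no usable interval.
     */
    static Duration suggestedInterval(String changefreq) {
        if (changefreq == null) {
            return null;
        }
        switch (changefreq.toLowerCase(Locale.ROOT)) {
        case "always":
            return Duration.ZERO;
        case "hourly":
            return Duration.ofHours(1);
        case "daily":
            return Duration.ofDays(1);
        case "weekly":
            return Duration.ofDays(7);
        case "monthly":
            return Duration.ofDays(30);  // approximation
        case "yearly":
            return Duration.ofDays(365); // approximation
        default:
            return null;
        }
    }

    /**
     * Clamps the hint to the crawler's own bounds: "hourly" pages are never
     * visited more often than min, and "never" pages are still revisited
     * every max so unexpected changes are noticed.
     */
    static Duration effectiveInterval(String changefreq, Duration min, Duration max) {
        Duration hint = suggestedInterval(changefreq);
        if (hint == null || hint.compareTo(max) > 0) {
            return max;
        }
        if (hint.compareTo(min) < 0) {
            return min;
        }
        return hint;
    }

    public static void main(String[] args) {
        Duration min = Duration.ofHours(6);
        Duration max = Duration.ofDays(90);
        for (String freq : new String[] { "hourly", "weekly", "yearly", "never" }) {
            System.out.println(freq + " -> " + effectiveInterval(freq, min, max));
        }
    }
}
```

With a 6-hour floor and a 90-day ceiling, "hourly" is stretched to 6 hours, "weekly" is honored as 7 days, and both "yearly" and "never" are capped at 90 days.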

Registered: Sun Nov 10 03:50:12 UTC 2024

- Last Modified: Thu Feb 22 01:36:27 UTC 2024

- 4.9K bytes

github.com/codelibs/fess-crawler

fess-crawler/src/main/java/org/codelibs/fess/crawler/filter/UrlFilter.java

    /**
     * Add a URL pattern as a crawl target.
     *
     * @param urlPattern Regular expression matching URLs to crawl
     */
    void addInclude(String urlPattern);

    /**
     * Add a URL pattern as a non-target.
     *
     * @param urlPattern Regular expression matching URLs to skip
     */
    void addExclude(String urlPattern);

    /**
     * Process a URL when it is added as a seed URL.
     *
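
To make the include/exclude contract above concrete, here is a small standalone sketch. It does not implement the `UrlFilter` interface shown in the snippet and is not the library's implementation: `SimpleUrlFilter` and its `match` method are hypothetical, and the semantics are assumptions (a URL passes if it fully matches at least one include pattern, or no include patterns are registered, and it matches no exclude pattern).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class SimpleUrlFilter {

    private final List<Pattern> includes = new ArrayList<>();
    private final List<Pattern> excludes = new ArrayList<>();

    /** Registers a regular expression for URLs that should be crawled. */
    public void addInclude(String urlPattern) {
        includes.add(Pattern.compile(urlPattern));
    }

    /** Registers a regular expression for URLs that should be skipped. */
    public void addExclude(String urlPattern) {
        excludes.add(Pattern.compile(urlPattern));
    }

    /** Assumed semantics: includes gate first, excludes always win. */
    public boolean match(String url) {
        if (!includes.isEmpty() && !matchesAny(includes, url)) {
            return false;
        }
        return !matchesAny(excludes, url);
    }

    // Full-string match against any of the given patterns.
    private static boolean matchesAny(List<Pattern> patterns, String url) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SimpleUrlFilter filter = new SimpleUrlFilter();
        filter.addInclude("https://example\\.com/.*");
        filter.addExclude(".*\\.pdf");

        System.out.println(filter.match("https://example.com/docs/index.html")); // true
        System.out.println(filter.match("https://example.com/docs/manual.pdf")); // false
        System.out.println(filter.match("https://other.org/"));                  // false
    }
}
```

Because the sketch uses full-string matching, patterns usually need a trailing `.*` to cover everything under a prefix.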

Registered: Sun Nov 10 03:50:12 UTC 2024

- Last Modified: Thu Feb 22 01:36:27 UTC 2024

- 1.6K bytes

github.com/codelibs/fess

docs/de/README.md

* [LastaFlute](https://github.com/lastaflute/lastaflute "LastaFlute"): Web-Framework
* [Lasta Job](https://github.com/lastaflute/lasta-job "Lasta Job"): Job-Scheduler
* [Fess Crawler](https://github.com/codelibs/fess-crawler "Fess Crawler"): Web-Crawler

Registered: Thu Oct 31 13:40:30 UTC 2024

- Last Modified: Sat Oct 12 07:19:47 UTC 2024

- 7.6K bytes

github.com/codelibs/fess

docs/fr/README.md

* [LastaFlute](https://github.com/lastaflute/lastaflute "LastaFlute") : Framework Web
* [Lasta Job](https://github.com/lastaflute/lasta-job "Lasta Job") : Planificateur de tâches
* [Fess Crawler](https://github.com/codelibs/fess-crawler "Fess Crawler") : Crawler Web

Registered: Thu Oct 31 13:40:30 UTC 2024

- Last Modified: Sat Oct 12 07:19:47 UTC 2024

- 7.9K bytes

github.com/codelibs/fess

src/main/webapp/WEB-INF/env/crawler/resources/app.xml

			</arg>
		</postConstruct>
		 -->
	</component>
	<component name="crawlerStatsHelper"
		class="org.codelibs.fess.helper.CrawlerStatsHelper">
	</component>
	<component name="fessCrawler" class="org.codelibs.fess.exec.Crawler"
		instance="prototype">
	</component>

Registered: Thu Oct 31 13:40:30 UTC 2024

- Last Modified: Sat Apr 09 02:14:47 UTC 2022

- 1.8K bytes

github.com/codelibs/fess

src/main/resources/fess_label_de.properties

labels.crawling_info_delete_all_cancel=Abbrechen
labels.crawling_info_thread_dump=Thread-Dump
labels.crawling_info_CrawlerStartTime=Crawler Startzeit
labels.crawling_info_CrawlerEndTime=Crawler Endzeit
labels.crawling_info_CrawlerExecTime=Crawler Ausführungsdauer
labels.crawling_info_CrawlerStatus=Crawler-Status
labels.crawling_info_WebFsCrawlExecTime=Crawl Ausführungsdauer (Web/Dateisystem)

Registered: Thu Oct 31 13:40:30 UTC 2024

- Last Modified: Fri Mar 22 11:58:34 UTC 2024

- 42.8K bytes

github.com/codelibs/fess-crawler

fess-crawler/src/main/java/org/codelibs/fess/crawler/service/impl/DataServiceImpl.java

 */
package org.codelibs.fess.crawler.service.impl;

import java.util.List;
import java.util.Map;

import org.codelibs.fess.crawler.Constants;
import org.codelibs.fess.crawler.entity.AccessResultData;
import org.codelibs.fess.crawler.entity.AccessResultDataImpl;
import org.codelibs.fess.crawler.entity.AccessResultImpl;
import org.codelibs.fess.crawler.exception.CrawlerSystemException;

Registered: Sun Nov 10 03:50:12 UTC 2024

- Last Modified: Thu Feb 22 01:47:32 UTC 2024

- 5.5K bytes

github.com/codelibs/fess

src/test/resources/plugin/repo3/index.html

<a href="fess-crawler-es/" title="fess-crawler-es/">fess-crawler-es/</a>                                                 -         -      
<a href="fess-crawler-lasta/" title="fess-crawler-lasta/">fess-crawler-lasta/</a>                                              -         -

Registered: Thu Oct 31 13:40:30 UTC 2024

- Last Modified: Mon Jun 17 13:30:41 UTC 2024

- 6.2K bytes

github.com/codelibs/fess-crawler

fess-crawler/src/main/java/org/codelibs/fess/crawler/service/impl/UrlQueueServiceImpl.java

 */
package org.codelibs.fess.crawler.service.impl;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

import org.codelibs.core.lang.StringUtil;
import org.codelibs.core.lang.SystemUtil;
import org.codelibs.fess.crawler.Constants;
import org.codelibs.fess.crawler.entity.AccessResult;
import org.codelibs.fess.crawler.entity.AccessResultImpl;

Registered: Sun Nov 10 03:50:12 UTC 2024

- Last Modified: Thu Feb 22 01:47:32 UTC 2024

- 7.5K bytes
