Results 31 - 40 of 65 for urlset (0.02 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/service/impl/UrlQueueServiceImpl.java
* This class provides methods for managing a queue of URLs to be crawled, * including adding, deleting, and retrieving URLs from the queue. * It uses a {@link MemoryDataHelper} to store the URL queue data in memory. * * <p> * The class is responsible for: * </p> * <ul> * <li>Updating session IDs for URL queues.</li> * <li>Adding new URLs to the queue.</li>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 9.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapFile.java
* This class holds information about a single Sitemap, including its location and last modification timestamp. * It implements the {@link Sitemap} interface. * * <p> * A Sitemap file provides search engines with a list of URLs available for crawling. * This class encapsulates the essential attributes of a Sitemap entry, allowing for efficient management * and processing of Sitemap data. * </p> * * <p>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 4.4K bytes - Viewed (1) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/PasswordBasedExtractor.java
* <ul> * <li>Static passwords configured via {@link #addPassword(String, String)}</li> * <li>Dynamic passwords provided through extraction parameters</li> * </ul> * * <p>Passwords are matched against URLs or resource names using regular expression patterns. * The extractor first tries to match against the URL, then falls back to the resource name if available. * * @author shinsuke */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 5.1K bytes - Viewed (0) -
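The lookup order described in the PasswordBasedExtractor snippet (URL first, then resource name) can be sketched roughly as follows. This is a minimal illustration, not the extractor's actual code: the class `PasswordLookupSketch`, its `find` method, and the choice of full-string matching with `Matcher.matches()` are assumptions made here; only the `addPassword(String, String)` name comes from the snippet.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the PasswordBasedExtractor source.
class PasswordLookupSketch {
    // Insertion-ordered so the first registered pattern that matches wins.
    private final Map<Pattern, String> passwords = new LinkedHashMap<>();

    // Register a static password for every URL or resource name matching the regex.
    void addPassword(final String regex, final String password) {
        passwords.put(Pattern.compile(regex), password);
    }

    // Try the URL first, then fall back to the resource name (either may be null).
    String find(final String url, final String resourceName) {
        final String byUrl = match(url);
        return byUrl != null ? byUrl : match(resourceName);
    }

    private String match(final String value) {
        if (value == null) {
            return null;
        }
        for (final Map.Entry<Pattern, String> entry : passwords.entrySet()) {
            // Assumption: the whole string must match the pattern.
            if (entry.getKey().matcher(value).matches()) {
                return entry.getValue();
            }
        }
        return null;
    }
}
```

With a rule such as `.*\.zip`, a download URL that does not end in `.zip` would still resolve a password if the resource name does.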
fess-crawler/src/main/java/org/codelibs/fess/crawler/interval/impl/AbstractIntervalController.java
* <ul> * <li>Before processing a URL ({@link #delayBeforeProcessing()})</li> * <li>After processing a URL ({@link #delayAfterProcessing()})</li> * <li>When there are no URLs in the queue ({@link #delayAtNoUrlInQueue()})</li> * <li>While waiting for new URLs to be added to the queue ({@link #delayForWaitingNewUrl()})</li> * </ul> * * <p> * Subclasses are responsible for implementing the abstract methods to define the actual delay
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 4.5K bytes - Viewed (0) -
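As a rough illustration of the four delay hooks named in the AbstractIntervalController snippet, a subclass might simply sleep for a fixed time at each point. The classes below are self-contained stand-ins written for this example (`IntervalHooksSketch` and `FixedIntervalSketch` are hypothetical names); the real abstract class may declare these methods with different visibility or additional members.

```java
// Hypothetical stand-in for the abstract base; only the four hook names come from the snippet.
abstract class IntervalHooksSketch {
    protected abstract void delayBeforeProcessing();
    protected abstract void delayAfterProcessing();
    protected abstract void delayAtNoUrlInQueue();
    protected abstract void delayForWaitingNewUrl();
}

// A fixed-delay implementation: each hook sleeps for a configured number of milliseconds.
class FixedIntervalSketch extends IntervalHooksSketch {
    private final long beforeMillis;
    private final long afterMillis;
    private final long noUrlMillis;
    private final long waitingMillis;

    FixedIntervalSketch(final long beforeMillis, final long afterMillis,
            final long noUrlMillis, final long waitingMillis) {
        this.beforeMillis = beforeMillis;
        this.afterMillis = afterMillis;
        this.noUrlMillis = noUrlMillis;
        this.waitingMillis = waitingMillis;
    }

    @Override
    protected void delayBeforeProcessing() {
        sleep(beforeMillis);
    }

    @Override
    protected void delayAfterProcessing() {
        sleep(afterMillis);
    }

    @Override
    protected void delayAtNoUrlInQueue() {
        sleep(noUrlMillis);
    }

    @Override
    protected void delayForWaitingNewUrl() {
        sleep(waitingMillis);
    }

    // Sleep helper: skips non-positive delays and restores the interrupt flag if interrupted.
    private static void sleep(final long millis) {
        if (millis <= 0) {
            return;
        }
        try {
            Thread.sleep(millis);
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Keeping the delay policy in a subclass lets the crawl loop call the same hooks regardless of whether the delays are fixed, randomized, or adaptive.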
fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/UrlConvertHelper.java
/** * Helper class for converting URLs based on a set of predefined rules. * * <p>This class provides functionality to convert URLs by replacing parts of the URL * based on a map of target strings and their corresponding replacements. It allows * adding new conversion rules, setting the entire conversion map, and converting * URLs using these rules.</p> *
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 3.1K bytes - Viewed (0) -
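The map-based conversion described in the UrlConvertHelper snippet could look something like the sketch below. It is an assumption-laden illustration: `UrlConvertSketch` and its method names are hypothetical, and literal substring replacement via `String.replace` is chosen here for simplicity. The actual helper may treat targets as regular expressions instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the UrlConvertHelper source.
class UrlConvertSketch {
    // Insertion-ordered so rules are applied in the order they were added.
    private final Map<String, String> convertMap = new LinkedHashMap<>();

    // Add one rule: every occurrence of target is replaced with replacement.
    void add(final String target, final String replacement) {
        convertMap.put(target, replacement);
    }

    // Apply all rules in order; a null URL is passed through unchanged.
    String convert(final String url) {
        if (url == null) {
            return null;
        }
        String result = url;
        for (final Map.Entry<String, String> rule : convertMap.entrySet()) {
            // Assumption: literal replacement, not a regular expression.
            result = result.replace(rule.getKey(), rule.getValue());
        }
        return result;
    }
}
```

Because rules run in insertion order, a later rule sees the output of earlier ones, so overlapping targets should be added with care.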
fess-crawler/src/main/java/org/codelibs/fess/crawler/Crawler.java
import org.codelibs.fess.crawler.service.UrlQueueService; import jakarta.annotation.Resource; /** * The Crawler class is the main class for web crawling. It manages the crawling process, * including adding URLs to the queue, filtering URLs, managing crawler threads, * and handling the overall crawling lifecycle. * * <p>It implements the Runnable interface to be executed in a separate thread,
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 14K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/suggest/converter/KatakanaConverter.java
    try (TokenStream stream = createTokenStream(rd)) {
        if (stream == null) {
            throw new IOException("Invalid tokenizer.");
        }
        stream.reset();

        int offset = 0;
        while (stream.incrementToken()) {
            final CharTermAttribute att = stream.getAttribute(CharTermAttribute.class);
            final String term = att.toString();
Registered: Fri Sep 19 09:08:11 UTC 2025 - Last Modified: Fri Jul 04 14:00:23 UTC 2025 - 6.1K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/RobotsTxt.java
    public void addSitemap(final String url) {
        if (!sitemapList.contains(url)) {
            sitemapList.add(url);
        }
    }

    /**
     * Returns an array of sitemap URLs.
     *
     * @return an array of sitemap URLs
     */
    public String[] getSitemaps() {
        return sitemapList.toArray(new String[sitemapList.size()]);
    }

    /**
     * Represents a directive in a robots.txt file.
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10K bytes - Viewed (0) -
README.md
</components> ``` ### Crawler Context Configuration ```java // Set maximum number of URLs to crawl crawler.crawlerContext.setMaxAccessCount(1000); // Set number of crawler threads crawler.crawlerContext.setNumOfThread(10); // Set maximum crawl depth crawler.crawlerContext.setMaxDepth(3);
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
fess-crawler-opensearch/src/main/java/org/codelibs/fess/crawler/service/impl/OpenSearchUrlQueueService.java
        public QueueHolder() {
            // Default constructor
        }

        /**
         * The queue for URLs waiting to be crawled.
         */
        protected Queue<OpenSearchUrlQueue> waitingQueue = new ConcurrentLinkedQueue<>();

        /**
         * The queue for URLs currently being crawled.
         */
        protected Queue<OpenSearchUrlQueue> crawlingQueue = new ConcurrentLinkedQueue<>();
    }
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 17K bytes - Viewed (1)
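The two-queue layout in the QueueHolder snippet suggests a hand-off from "waiting" to "crawling". A minimal sketch of that idea, using plain `String` URLs in place of `OpenSearchUrlQueue` and hypothetical method names (`offer`, `poll`, `done`), might be:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for illustration; not the OpenSearchUrlQueueService source.
class QueueHolderSketch {
    // URLs waiting to be crawled.
    final Queue<String> waitingQueue = new ConcurrentLinkedQueue<>();
    // URLs currently being crawled.
    final Queue<String> crawlingQueue = new ConcurrentLinkedQueue<>();

    // Enqueue a URL for later crawling.
    void offer(final String url) {
        waitingQueue.add(url);
    }

    // Take the next waiting URL and record it as in progress; returns null when nothing is waiting.
    String poll() {
        final String url = waitingQueue.poll();
        if (url != null) {
            crawlingQueue.add(url);
        }
        return url;
    }

    // Mark a URL as finished by dropping it from the in-progress queue.
    void done(final String url) {
        crawlingQueue.remove(url);
    }
}
```

Each queue is thread-safe on its own, but the move between the two is not atomic in this sketch: a URL is briefly in neither queue between `poll()` and `add()`.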