Results 1 - 10 of 38 for are (0.01 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/interval/IntervalController.java

     * </p>
     * <ul>
     *   <li>{@code PRE_PROCESSING} - Represents the pre-processing state.</li>
     *   <li>{@code POST_PROCESSING} - Represents the post-processing state.</li>
     *   <li>{@code NO_URL_IN_QUEUE} - Indicates that there are no URLs in the queue.</li>
     *   <li>{@code WAIT_NEW_URL} - Indicates that the crawler is waiting for new URLs.</li>
     * </ul>
     */
    public interface IntervalController {
        /** Constant representing the pre-processing state. */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 1.8K bytes
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/interval/impl/DefaultIntervalController.java

     * This class provides a default way to manage delays between crawler operations.
     * It allows setting delays before processing, after processing, when no URLs are in the queue,
     * and when waiting for new URLs.
     * The delays are configurable via constructor parameters.
     *
     */
    public class DefaultIntervalController extends AbstractIntervalController {
    
        /** Delay in milliseconds after processing a URL */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 3.4K bytes
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/processor/ResponseProcessor.java

    package org.codelibs.fess.crawler.processor;
    
    import org.codelibs.fess.crawler.entity.ResponseData;
    
    /**
     * The ResponseProcessor interface defines a contract for processing response data.
     * Implementations of this interface are responsible for handling the response data
     * obtained during a crawling process.
     */
    public interface ResponseProcessor {
    
        /**
         * Processes the given response data.
         *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sat Mar 15 06:52:00 UTC 2025
    - 1.1K bytes
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/processor/impl/SitemapsResponseProcessor.java

     * It then iterates through the sitemaps in the SitemapSet, extracts the URL
     * from each sitemap, and creates a new {@link RequestData} object for each URL.
     * These RequestData objects are added to a set of child URLs, which are then
     * passed to a {@link ChildUrlsException} to be processed by the crawler.
     * </p>
     *
     * <p>
     * The class also handles potential {@link IOException}s that may occur during
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 3.4K bytes
  5. fess-crawler/src/main/java/org/codelibs/fess/crawler/interval/impl/AbstractIntervalController.java

     *   <li>After processing a URL ({@link #delayAfterProcessing()})</li>
     *   <li>When there are no URLs in the queue ({@link #delayAtNoUrlInQueue()})</li>
     *   <li>While waiting for new URLs to be added to the queue ({@link #delayForWaitingNewUrl()})</li>
     * </ul>
     *
     * <p>
     * Subclasses are responsible for implementing the abstract methods to define the actual delay
     * mechanism for each of these stages.
     * </p>
     *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 4.5K bytes
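
The interval-controller snippets in results 1, 2, and 5 describe one delay per crawl stage, dispatched by a state constant. The following is a minimal sketch of that idea only. `CrawlStage` and `FixedDelayController` are hypothetical names invented for illustration and are not the fess-crawler classes; the real `IntervalController` API may differ in its method names and types.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical stages mirroring the four states named in the IntervalController Javadoc. */
enum CrawlStage { PRE_PROCESSING, POST_PROCESSING, NO_URL_IN_QUEUE, WAIT_NEW_URL }

/** Hypothetical controller: one fixed delay (in milliseconds) per stage. */
class FixedDelayController {
    private final Map<CrawlStage, Long> delays = new EnumMap<>(CrawlStage.class);

    FixedDelayController(long pre, long post, long noUrl, long waitNewUrl) {
        delays.put(CrawlStage.PRE_PROCESSING, pre);
        delays.put(CrawlStage.POST_PROCESSING, post);
        delays.put(CrawlStage.NO_URL_IN_QUEUE, noUrl);
        delays.put(CrawlStage.WAIT_NEW_URL, waitNewUrl);
    }

    /** Returns the configured delay for the given stage. */
    long delayMillisFor(CrawlStage stage) {
        return delays.get(stage);
    }

    /** Sleeps for the configured delay; non-positive delays return immediately. */
    void delay(CrawlStage stage) {
        long millis = delayMillisFor(stage);
        if (millis <= 0) {
            return;
        }
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the caller can react to it.
            Thread.currentThread().interrupt();
        }
    }
}
```

A crawler loop under these assumptions would call `delay(CrawlStage.PRE_PROCESSING)` before fetching a URL and `delay(CrawlStage.NO_URL_IN_QUEUE)` when the queue is empty.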
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/util/XPathAPI.java

                throw new CrawlerSystemException("Failed to create XPath instance.", e);
            }
        }
    
        /**
         *  Use an XPath string to select a nodelist.
         *  XPath namespace prefixes are resolved from the contextNode.
         *
         *  @param contextNode The node to start searching from.
         *  @param expression A valid XPath string.
         *  @return A XPathNodes, should never be null.
         *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 4.6K bytes
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapUrl.java

         * they crawl the page. Valid values are:
         * <ul>
         * <li>always</li>
         * <li>hourly</li>
         * <li>daily</li>
         * <li>weekly</li>
         * <li>monthly</li>
         * <li>yearly</li>
         * <li>never</li>
         * </ul>
         * The value "always" should be used to describe documents that change each
         * time they are accessed. The value "never" should be used to describe
         * archived URLs.
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 6.5K bytes
  8. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/ExtractorFactory.java

     * It also includes a builder for creating extractors.
     *
     * <p>
     * The factory maintains a map of keys to an array of {@link Extractor} objects.
     * When multiple extractors are associated with a single key, they are sorted by weight
     * in descending order. The {@link #getExtractor(String)} method returns a composite
     * extractor that iterates through the available extractors until one successfully
     * extracts the data.
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 7.3K bytes
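
The ExtractorFactory snippet in result 8 describes keeping several extractors per key, sorted by weight in descending order, and trying them in turn until one succeeds. A rough sketch of that lookup pattern follows. `SketchExtractor` and `WeightedExtractorRegistry` are hypothetical stand-ins, not the fess-crawler `Extractor` or `ExtractorFactory` types, and treating an empty result or a runtime exception as "did not succeed" is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical stand-in for an extractor; not the fess-crawler Extractor interface. */
interface SketchExtractor {
    int weight();
    Optional<String> extract(String input);
}

/** Hypothetical registry: several extractors per key, tried from highest weight down. */
class WeightedExtractorRegistry {
    private final Map<String, List<SketchExtractor>> byKey = new HashMap<>();

    /** Convenience factory for building an extractor from a weight and a function. */
    static SketchExtractor of(int w, Function<String, Optional<String>> fn) {
        return new SketchExtractor() {
            @Override public int weight() { return w; }
            @Override public Optional<String> extract(String input) { return fn.apply(input); }
        };
    }

    /** Registers an extractor and keeps the list for that key sorted by descending weight. */
    void add(String key, SketchExtractor extractor) {
        List<SketchExtractor> list = byKey.computeIfAbsent(key, k -> new ArrayList<>());
        list.add(extractor);
        list.sort(Comparator.comparingInt(SketchExtractor::weight).reversed());
    }

    /** Returns the first non-empty result, or empty if no extractor succeeds. */
    Optional<String> extract(String key, String input) {
        for (SketchExtractor e : byKey.getOrDefault(key, List.of())) {
            try {
                Optional<String> result = e.extract(input);
                if (result.isPresent()) {
                    return result;
                }
            } catch (RuntimeException ex) {
                // Assumption: a failing extractor is skipped and the next one is tried.
            }
        }
        return Optional.empty();
    }
}
```

With extractors of weight 5 (returns empty), 3, and 1 registered under the same key, the weight-3 extractor supplies the result because the weight-5 one is tried first and yields nothing.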
  9. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractor.java

        public void setInitialBufferSize(final int initialBufferSize) {
            this.initialBufferSize = initialBufferSize;
        }
    
        /**
         * Sets whether duplicated terms are replaced.
         * @param replaceDuplication If true, duplicated terms are replaced.
         */
        public void setReplaceDuplication(final boolean replaceDuplication) {
            this.replaceDuplication = replaceDuplication;
        }
    
        /**
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 30.7K bytes
  10. fess-crawler/src/main/java/org/codelibs/fess/crawler/util/TextUtil.java

             * - ISO control characters and space characters are treated as spaces.
             * - Alphanumeric characters (0-9, A-Z, a-z) are appended to the buffer.
             * - Symbol characters (!-/, :-@, [-`, {-~) are appended to the buffer.
             * - Duplicate terms can be removed based on the `duplicateTermRemoved` flag.
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 12K bytes
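
The TextUtil snippet in result 10 lists character-handling rules: ISO control and space characters act as separators, other characters are appended, and duplicate terms can optionally be dropped. The sketch below illustrates those rules under stated assumptions. `TextNormalizeSketch` is a hypothetical class, not the fess-crawler `TextUtil` API. Collapsing runs of separators into one space, appending every non-separator character unchanged, and dropping any term that has already appeared are all assumptions, since the snippet is truncated.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;

/** Hypothetical normalizer illustrating the rules listed in the TextUtil snippet. */
class TextNormalizeSketch {

    /**
     * Splits text on ISO control and space characters and rejoins the terms with
     * single spaces. When duplicateTermRemoved is true, a term that has already
     * appeared is dropped (assumed semantics).
     */
    static String normalize(String text, boolean duplicateTermRemoved) {
        // LinkedHashSet keeps first-seen order while discarding repeats.
        Collection<String> terms = duplicateTermRemoved ? new LinkedHashSet<>() : new ArrayList<>();
        StringBuilder term = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isISOControl(c) || Character.isSpaceChar(c)) {
                flush(term, terms); // separator: close the current term
            } else {
                term.append(c);     // alphanumerics, symbols, and anything else
            }
        }
        flush(term, terms);
        return String.join(" ", terms);
    }

    /** Moves the buffered term into the collection and resets the buffer. */
    private static void flush(StringBuilder term, Collection<String> terms) {
        if (term.length() > 0) {
            terms.add(term.toString());
            term.setLength(0);
        }
    }
}
```

Under these assumptions, `normalize("a\tb  a\nc!", true)` yields `a b c!`, and the same call with `false` yields `a b a c!`.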