Search Options

Results per page
Sort
Preferred Languages
Advance

Results 1 - 7 of 7 for filtering (0.1 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/filter/impl/UrlFilterImpl.java

        }
    
        /**
         * Returns the include filtering pattern.
         * @return The include filtering pattern.
         */
        public String getIncludeFilteringPattern() {
            return includeFilteringPattern;
        }
    
        /**
         * Sets the include filtering pattern.
         * @param includeFilteringPattern The include filtering pattern.
         */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 9.2K bytes
    - Viewed (0)
  2. README.md

    - **Extensible Architecture**: Plugin system for custom extractors, transformers, and clients
    - **Rate Limiting**: Politeness policies and interval controllers
    - **URL Filtering**: Regex-based inclusion/exclusion patterns
    - **Data Persistence**: Multiple backend options including OpenSearch integration
    
    ## Technology Stack
    
    - **Java**: 21+ (requires Java 21 or higher)
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Aug 31 05:32:52 UTC 2025
    - 15.3K bytes
    - Viewed (0)
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/service/impl/UrlFilterServiceImpl.java

    import org.codelibs.fess.crawler.service.UrlFilterService;
    
    import jakarta.annotation.Resource;
    
    /**
     * Implementation of the {@link UrlFilterService} interface.
     * This class provides methods for managing URL filtering rules,
     * including adding include and exclude URL patterns, deleting patterns,
     * and retrieving lists of compiled URL patterns. It utilizes a
     * {@link MemoryDataHelper} to store and manage the URL patterns in memory.
     *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 4.2K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/MemoryDataHelper.java

        /** Map of session IDs to include URL patterns for filtering URLs. */
        protected volatile Map<String, List<Pattern>> includeUrlPatternMap = new HashMap<>();
    
        /** Map of session IDs to exclude URL patterns for filtering URLs. */
        protected volatile Map<String, List<Pattern>> excludeUrlPatternMap = new HashMap<>();
    
        /**
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 8.1K bytes
    - Viewed (0)
  5. src/main/java/org/codelibs/fess/suggest/request/popularwords/PopularWordsRequest.java

     *
     * <p>Key functionalities include:</p>
     * <ul>
     *   <li>Setting the target index for the search.</li>
     *   <li>Limiting the number of results (size).</li>
     *   <li>Filtering by tags, roles, fields, and languages.</li>
     *   <li>Excluding specific words from the results.</li>
     *   <li>Building the OpenSearch query and rescorer for the popular words search.</li>
    Registered: Fri Sep 19 09:08:11 UTC 2025
    - Last Modified: Thu Aug 07 02:41:28 UTC 2025
    - 9.2K bytes
    - Viewed (0)
  6. src/main/java/org/codelibs/fess/suggest/index/contents/document/ESSourceReader.java

     * </p>
     *
     * <p>
     * The reader supports limiting the number of documents read based on a percentage of the total documents
     * or a fixed number. It also allows filtering documents based on their size, using the {@code limitOfDocumentSize}
     * parameter.
     * </p>
     *
     * <p>
     * The reader uses a queue to buffer documents read from Elasticsearch, and it retries failed requests
    Registered: Fri Sep 19 09:08:11 UTC 2025
    - Last Modified: Thu Aug 07 02:41:28 UTC 2025
    - 11K bytes
    - Viewed (0)
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/Crawler.java

    import jakarta.annotation.Resource;
    
    /**
     * The Crawler class is the main class for web crawling. It manages the crawling process,
     * including adding URLs to the queue, filtering URLs, managing crawler threads,
     * and handling the overall crawling lifecycle.
     *
     * <p>It implements the Runnable interface to be executed in a separate thread,
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 14K bytes
    - Viewed (0)
Back to top