Results 1 - 7 of 7 for filtering (0.1 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/filter/impl/UrlFilterImpl.java
} /** * Returns the include filtering pattern. * @return The include filtering pattern. */ public String getIncludeFilteringPattern() { return includeFilteringPattern; } /** * Sets the include filtering pattern. * @param includeFilteringPattern The include filtering pattern. */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 9.2K bytes - Viewed (0) -
README.md
- **Extensible Architecture**: Plugin system for custom extractors, transformers, and clients - **Rate Limiting**: Politeness policies and interval controllers - **URL Filtering**: Regex-based inclusion/exclusion patterns - **Data Persistence**: Multiple backend options including OpenSearch integration ## Technology Stack - **Java**: 21+ (requires Java 21 or higher)
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/service/impl/UrlFilterServiceImpl.java
import org.codelibs.fess.crawler.service.UrlFilterService; import jakarta.annotation.Resource; /** * Implementation of the {@link UrlFilterService} interface. * This class provides methods for managing URL filtering rules, * including adding include and exclude URL patterns, deleting patterns, * and retrieving lists of compiled URL patterns. It utilizes a * {@link MemoryDataHelper} to store and manage the URL patterns in memory. *
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 4.2K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/MemoryDataHelper.java
/** Map of session IDs to include URL patterns for filtering URLs. */ protected volatile Map<String, List<Pattern>> includeUrlPatternMap = new HashMap<>(); /** Map of session IDs to exclude URL patterns for filtering URLs. */ protected volatile Map<String, List<Pattern>> excludeUrlPatternMap = new HashMap<>(); /**
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 8.1K bytes - Viewed (0) -
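The MemoryDataHelper snippet above shows two maps keyed by session ID, one holding include patterns and one holding exclude patterns. As a rough illustration of how such maps could drive a filtering decision, here is a minimal sketch. It is not the Fess implementation: the class `SessionUrlFilterSketch`, its method names, and the rule that an empty include list accepts every URL are all assumptions made for this example, and it ignores thread safety.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the actual MemoryDataHelper or UrlFilterImpl. */
class SessionUrlFilterSketch {
    // Same shape as the fields in the snippet: session ID -> compiled patterns.
    private final Map<String, List<Pattern>> includeUrlPatternMap = new HashMap<>();
    private final Map<String, List<Pattern>> excludeUrlPatternMap = new HashMap<>();

    /** Compiles the regex and registers it as an include pattern for the session. */
    void addInclude(String sessionId, String regex) {
        includeUrlPatternMap.computeIfAbsent(sessionId, k -> new ArrayList<>())
                .add(Pattern.compile(regex));
    }

    /** Compiles the regex and registers it as an exclude pattern for the session. */
    void addExclude(String sessionId, String regex) {
        excludeUrlPatternMap.computeIfAbsent(sessionId, k -> new ArrayList<>())
                .add(Pattern.compile(regex));
    }

    /**
     * Assumed rule: a URL is accepted when it fully matches at least one include
     * pattern (or no include patterns exist) and matches no exclude pattern.
     */
    boolean accepts(String sessionId, String url) {
        List<Pattern> includes = includeUrlPatternMap.getOrDefault(sessionId, List.of());
        List<Pattern> excludes = excludeUrlPatternMap.getOrDefault(sessionId, List.of());
        boolean included = includes.isEmpty()
                || includes.stream().anyMatch(p -> p.matcher(url).matches());
        boolean excluded = excludes.stream().anyMatch(p -> p.matcher(url).matches());
        return included && !excluded;
    }
}
```

With an include pattern of `https://example\.com/.*` and an exclude pattern of `.*\.pdf` registered for one session, this sketch would accept HTML pages under example.com and reject PDFs and other hosts for that session, while a session with no patterns accepts everything.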
src/main/java/org/codelibs/fess/suggest/request/popularwords/PopularWordsRequest.java
* * <p>Key functionalities include:</p> * <ul> * <li>Setting the target index for the search.</li> * <li>Limiting the number of results (size).</li> * <li>Filtering by tags, roles, fields, and languages.</li> * <li>Excluding specific words from the results.</li> * <li>Building the OpenSearch query and rescorer for the popular words search.</li>
Registered: Fri Sep 19 09:08:11 UTC 2025 - Last Modified: Thu Aug 07 02:41:28 UTC 2025 - 9.2K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/suggest/index/contents/document/ESSourceReader.java
* </p> * * <p> * The reader supports limiting the number of documents read based on a percentage of the total documents * or a fixed number. It also allows filtering documents based on their size, using the {@code limitOfDocumentSize} * parameter. * </p> * * <p> * The reader uses a queue to buffer documents read from Elasticsearch, and it retries failed requests
Registered: Fri Sep 19 09:08:11 UTC 2025 - Last Modified: Thu Aug 07 02:41:28 UTC 2025 - 11K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/Crawler.java
import jakarta.annotation.Resource; /** * The Crawler class is the main class for web crawling. It manages the crawling process, * including adding URLs to the queue, filtering URLs, managing crawler threads, * and handling the overall crawling lifecycle. * * <p>It implements the Runnable interface to be executed in a separate thread,
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 14K bytes - Viewed (0)