Results 11 - 20 of 26 for Cruces (0.1 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/rule/RuleManager.java
*/ package org.codelibs.fess.crawler.rule; import org.codelibs.fess.crawler.entity.ResponseData; /** * The RuleManager interface provides methods to manage rules for processing response data. * It allows adding, retrieving, and removing rules, as well as checking for their existence. */ public interface RuleManager { /** * Retrieves the rule associated with the given response data. *
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 2.1K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java
* <li>Extracting child URLs from the HTML content based on configured rules.</li> * <li>Handling redirect URLs specified in the response headers.</li> * </ol> * <p> * The class also provides methods for configuring features and properties of the * underlying DOM parser, as well as defining rules for extracting child URLs * from specific HTML tags and attributes. * </p> * * <p>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 28.5K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/util/TextUtil.java
/** * Utility class for text normalization and processing. * * This class provides methods to normalize text by reading characters from a provided Reader * and processing them according to specific rules. The main functionality is encapsulated * within the nested {@link TextNormalizeContext} class. * * <p>The text normalization process includes: * <ul>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 12K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/rule/Rule.java
import java.io.Serializable; import org.codelibs.fess.crawler.entity.ResponseData; import org.codelibs.fess.crawler.processor.ResponseProcessor; /** * The Rule interface defines the contract for implementing rules that can be applied to * response data in a web crawler. Implementations of this interface should provide logic * to determine if a given response data matches the rule, retrieve the rule's identifier,
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sat Mar 15 06:52:00 UTC 2025 - 1.7K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/filter/impl/UrlFilterImpl.java
/** * Implementation of the {@link UrlFilter} interface. * This class provides functionality to filter URLs based on include and exclude patterns. * It uses a {@link UrlFilterService} to manage the URL filtering rules. * The class supports caching of include and exclude patterns for scenarios where a session ID is not available. * It also provides methods to initialize the filter with a session ID, clear the filter,
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 9.2K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/service/impl/UrlFilterServiceImpl.java
import org.codelibs.fess.crawler.service.UrlFilterService; import jakarta.annotation.Resource; /** * Implementation of the {@link UrlFilterService} interface. * This class provides methods for managing URL filtering rules, * including adding include and exclude URL patterns, deleting patterns, * and retrieving lists of compiled URL patterns. It utilizes a * {@link MemoryDataHelper} to store and manage the URL patterns in memory. * */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 4.2K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/HcHttpClient.java
/** * Sets whether to use robots.txt disallow rules. * * @param useRobotsTxtDisallows True to use disallow rules, false otherwise */ public void setUseRobotsTxtDisallows(final boolean useRobotsTxtDisallows) { this.useRobotsTxtDisallows = useRobotsTxtDisallows; } /** * Sets whether to use robots.txt allow rules. *
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 52.2K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/transformer/TransformerTest.java
return null; } ResultData resultData = new ResultData(); resultData.setTransformerName(name); // Apply transformation rules try (InputStream is = responseData.getResponseBody()) { byte[] bytes = is.readAllBytes(); String content = new String(bytes);
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sat Sep 06 04:15:37 UTC 2025 - 28K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/XpathTransformer.java
import org.xml.sax.InputSource; /** * {@link XpathTransformer} is a class that transforms HTML content into XML format based on XPath expressions. * It extracts data from an HTML document by applying XPath rules defined in {@link #fieldRuleMap}. * The extracted data is then formatted into an XML structure and stored in the {@link ResultData}. * <p>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 13.1K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/exception/CrawlerSystemExceptionTest.java
assertNotNull(mainStackTrace); assertNotNull(causeStackTrace); assertTrue(mainStackTrace.length > 0); assertTrue(causeStackTrace.length > 0); // Stack traces should be different assertNotSame(mainStackTrace, causeStackTrace); } /** * Test printStackTrace functionality */ public void test_printStackTrace() {
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Wed Sep 03 14:42:53 UTC 2025 - 20K bytes - Viewed (0)