Results 1 - 6 of 6 for Expression (0.04 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlXpathExtractor.java
 * It uses XPath expressions to extract text content from HTML documents.
 * <p>
 * This class provides methods to configure the XPath expressions, parser features, and properties.
 * It also includes caching mechanism for XPathAPI instances to improve performance.
 * </p>
 * <p>
 * The extracted text is obtained from the nodes selected by the {@code targetNodePath} XPath expression.
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10.3K bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/Crawler.java
    @Override
    public void close() {
        clientFactory.close();
    }

    /**
     * Adds an include filter for URLs.
     * Only URLs matching this regular expression will be crawled.
     * @param regexp The regular expression for the include filter.
     */
    public void addIncludeFilter(final String regexp) {
        if (StringUtil.isNotBlank(regexp)) {
            urlFilter.addInclude(regexp);
        }
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 14K bytes
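The include-filter behaviour described in the Javadoc above can be sketched roughly as follows. This is a minimal illustration, not the fess-crawler code: `IncludeFilterSketch` and `accepts` are hypothetical names, and two points are assumptions (a filter must match the whole URL, and an empty filter list accepts everything). Only the blank-expression guard comes from the snippet itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a URL filter; not the fess-crawler implementation.
class IncludeFilterSketch {
    private final List<Pattern> includes = new ArrayList<>();

    // Mirrors the guard in the snippet: null or blank expressions are ignored.
    void addIncludeFilter(final String regexp) {
        if (regexp != null && !regexp.isBlank()) {
            includes.add(Pattern.compile(regexp));
        }
    }

    // Assumption: with no include filters every URL is accepted,
    // otherwise at least one filter must match the entire URL.
    boolean accepts(final String url) {
        if (includes.isEmpty()) {
            return true;
        }
        for (final Pattern pattern : includes) {
            if (pattern.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(final String[] args) {
        final IncludeFilterSketch filter = new IncludeFilterSketch();
        filter.addIncludeFilter("https://example\\.com/docs/.*");
        filter.addIncludeFilter("   "); // blank, so it is ignored
        System.out.println(filter.accepts("https://example.com/docs/index.html"));
        System.out.println(filter.accepts("https://example.com/blog/"));
    }
}
```

Whether the real filter uses full-match or partial-match semantics is not visible in the snippet, so check the `urlFilter` implementation before relying on either.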
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/XpathTransformer.java
    private static final Pattern SPACE_PATTERN = Pattern.compile("\\s+", Pattern.MULTILINE);

    /**
     * A map of field rules, where the key is the field name and the value is the XPath expression.
     */
    protected Map<String, String> fieldRuleMap = new LinkedHashMap<>();

    /** Flag to enable or disable trimming of whitespace characters. */
    protected boolean trimSpaceEnabled = true;
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 13.1K bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/XmlTransformer.java
    }

    /**
     * Retrieves a list of XPath nodes from the document.
     *
     * @param doc The XML document.
     * @param xpath The XPath expression.
     * @return A list of XPath nodes.
     * @throws XPathExpressionException if an XPath expression error occurs.
     */
    protected XPathNodes getNodeList(final Document doc, final String xpath) throws XPathExpressionException {
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 23.9K bytes
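The body of `getNodeList` is cut off in the result above. One plausible way to obtain an `XPathNodes` value with the standard JDK API (Java 9 or later) is `XPath.evaluateExpression`, sketched below. The class name `XPathNodesSketch`, the `parse` helper, and the sample XML are assumptions made for illustration. The actual `XmlTransformer` may build or cache its `XPath` instance differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathNodes;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative sketch only; not the XmlTransformer implementation.
class XPathNodesSketch {

    // Evaluates the expression against the document and returns the matching nodes.
    static XPathNodes getNodeList(final Document doc, final String xpath) throws XPathExpressionException {
        final XPath evaluator = XPathFactory.newInstance().newXPath();
        return evaluator.evaluateExpression(xpath, doc, XPathNodes.class);
    }

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(final String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(final String[] args) throws Exception {
        final Document doc = parse("<doc><title>A</title><title>B</title></doc>");
        // XPathNodes is Iterable<Node>, so it works in a for-each loop.
        for (final Node node : getNodeList(doc, "/doc/title")) {
            System.out.println(node.getTextContent());
        }
    }
}
```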
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java
 * <li><b>preloadSizeForCharset:</b> The number of bytes to read from the input
 * stream to determine the character set encoding.</li>
 * <li><b>invalidUrlPattern:</b> A regular expression pattern used to identify
 * invalid URLs.</li>
 * </ul>
 *
 * <p>
 * <b>Usage:</b>
 * </p>
 * <p>
 * The {@code transform} method is the main entry point for transforming an HTML
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 28.5K bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/RobotsTxt.java
            }
        }
        return null;
    }

    /**
     * Adds a directive to the robots.txt rules.
     * The user-agent pattern in the directive is converted to a regular expression pattern,
     * where '*' is replaced with '.*' for pattern matching, and stored case-insensitively.
     *
     * @param directive The directive to add to the robots.txt rules
     */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10K bytes
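The conversion described in that Javadoc (replace `*` with `.*`, match without regard to case) can be illustrated with a small sketch. The names `UserAgentPatternSketch` and `toPattern` are hypothetical. Using `Pattern.CASE_INSENSITIVE` is an assumption about how the case-insensitivity is achieved, since the real `RobotsTxt` class might instead lower-case the user-agent before storing it. The sketch also leaves other regex metacharacters unescaped, as a simplification.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; not the RobotsTxt implementation.
class UserAgentPatternSketch {

    // Turns a robots.txt user-agent value such as "Google*" into a regex.
    // Assumption: only '*' is treated specially; other metacharacters pass through as-is.
    static Pattern toPattern(final String userAgent) {
        final String regex = userAgent.replace("*", ".*");
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    public static void main(final String[] args) {
        final Pattern any = toPattern("*");
        final Pattern google = toPattern("Google*");
        System.out.println(any.matcher("AnyBot/1.0").matches());
        System.out.println(google.matcher("googlebot").matches());
        System.out.println(google.matcher("bingbot").matches());
    }
}
```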