Results 1 - 4 of 4 for selector (0.02 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlXpathExtractor.java
 * It also includes caching mechanism for XPathAPI instances to improve performance.
 * </p>
 * <p>
 * The extracted text is obtained from the nodes selected by the {@code targetNodePath} XPath expression.
 * The default value for {@code targetNodePath} is "//HTML/BODY | //@alt | //@title", which selects the body of the HTML document,
 * as well as the alt and title attributes.
 * </p>
 * <p>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10.3K bytes
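As a rough illustration of what a union expression such as "//HTML/BODY | //@alt | //@title" selects, the following sketch evaluates it with the JDK's built-in `javax.xml.xpath` API. It is not the code of `HtmlXpathExtractor`: the class `TargetNodePathSketch`, the method `selectText`, and the sample markup are hypothetical, and the input is assumed to be well-formed XML with upper-case element names so that it matches the expression as written.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of fess-crawler.
class TargetNodePathSketch {

    // Returns the trimmed text content of every node selected by the XPath expression.
    static List<String> selectText(String xml, String targetNodePath) throws Exception {
        // Assumption: the markup is well-formed XML; real-world HTML usually needs a lenient parser.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // "a | b | c" is an XPath union: the result holds every node matched by any branch.
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(targetNodePath, doc, XPathConstants.NODESET);

        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // For an element this is its descendant text; for an attribute node, its value.
            texts.add(nodes.item(i).getTextContent().trim());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<HTML><BODY><P>Hello</P><IMG alt=\"logo\" title=\"Home\"/></BODY></HTML>";
        System.out.println(selectText(xml, "//HTML/BODY | //@alt | //@title"));
    }
}
```

The union returns one node for the BODY element plus one node per matching attribute, so attribute text that never appears in the body's text content is still collected.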
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractor.java
protected class TikaDetectParser extends CompositeParser {
    private static final long serialVersionUID = 1L;

    /**
     * The type detector used by this parser to auto-detect the type of a
     * document.
     */
    private final Detector detector; // always set in the constructor

    /**
     * Creates an auto-detecting parser instance using the default Tika
     * configuration.
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 30.7K bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/ExtractorBuilder.java
 * for configuring the extraction parameters and handling the underlying complexities of content processing,
 * such as MIME type detection, extractor selection, and content length validation.
 * </p>
 *
 * <p>
 * Example usage:
 * </p>
 *
 * <pre>
 * {@code
 * try (InputStream in = new FileInputStream("example.pdf")) {
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10.1K bytes
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java
 * Extracts URLs from HTML tag attributes using XPath.
 *
 * @param url the base URL for resolving relative URLs
 * @param document the document to extract URLs from
 * @param xpath the XPath expression to select elements
 * @param attr the attribute name to extract URLs from
 * @param encoding the character encoding to use
 * @return a list of extracted URLs
 */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 28.5K bytes
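The Javadoc above describes selecting elements with an XPath expression, reading one attribute from each, and resolving the value against a base URL. A minimal sketch of that idea, using only JDK classes, might look like the following. It is not the `HtmlTransformer` implementation: `AttributeUrlSketch`, `parse`, and `extractUrls` are hypothetical names, the `encoding` parameter is left out, and the input is assumed to be well-formed XML.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of fess-crawler.
class AttributeUrlSketch {

    // Parses well-formed markup into a DOM document (an assumption; real HTML is rarely this tidy).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Selects elements with the XPath expression, reads the named attribute,
    // and resolves each non-empty value against the base URL.
    static List<String> extractUrls(String baseUrl, Document document, String xpath, String attr)
            throws Exception {
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, document, XPathConstants.NODESET);

        URI base = URI.create(baseUrl);
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (!(node instanceof Element)) {
                continue; // the expression may also match non-element nodes
            }
            String value = ((Element) node).getAttribute(attr);
            if (value.isEmpty()) {
                continue; // getAttribute returns "" when the attribute is absent
            }
            // URI.resolve applies RFC 2396-style relative resolution; a malformed value
            // would throw IllegalArgumentException, which this sketch does not handle.
            urls.add(base.resolve(value).toString());
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<HTML><BODY><A href=\"/a.html\"/><A href=\"b/c.html\"/><A/></BODY></HTML>");
        System.out.println(extractUrls("https://example.com/dir/index.html", doc, "//A", "href"));
    }
}
```

Passing the expression and the attribute name as separate arguments lets the same routine cover different tag and attribute pairs, for example "//A" with "href" or "//IMG" with "src".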