Search Options

Results per page
Sort
Preferred Languages
Advance

Results 1 - 4 of 4 for selector (0.02 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlXpathExtractor.java

     * It also includes caching mechanism for XPathAPI instances to improve performance.
     * </p>
     * <p>
     * The extracted text is obtained from the nodes selected by the {@code targetNodePath} XPath expression.
     * The default value for {@code targetNodePath} is "//HTML/BODY | //@alt | //@title", which selects the body of the HTML document,
     * as well as the alt and title attributes.
     * </p>
     * <p>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10.3K bytes
    - Viewed (0)
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractor.java

        protected class TikaDetectParser extends CompositeParser {
            private static final long serialVersionUID = 1L;
    
            /**
             * The type detector used by this parser to auto-detect the type of a
             * document.
             */
            private final Detector detector; // always set in the constructor
    
            /**
             * Creates an auto-detecting parser instance using the default Tika
             * configuration.
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 30.7K bytes
    - Viewed (0)
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/ExtractorBuilder.java

     * for configuring the extraction parameters and handling the underlying complexities of content processing,
     * such as MIME type detection, extractor selection, and content length validation.
     * </p>
     *
     * <p>
     * Example usage:
     * </p>
     *
     * <pre>
     * {@code
     * try (InputStream in = new FileInputStream("example.pdf")) {
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10.1K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

         * Extracts URLs from HTML tag attributes using XPath.
         *
         * @param url the base URL for resolving relative URLs
         * @param document the document to extract URLs from
         * @param xpath the XPath expression to select elements
         * @param attr the attribute name to extract URLs from
         * @param encoding the character encoding to use
         * @return a list of extracted URLs
         */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 28.5K bytes
    - Viewed (0)
Back to top