Results 1 - 4 of 4 for Regular (0.02 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/Crawler.java

         */
        @Override
        public void close() {
            clientFactory.close();
        }
    
        /**
         * Adds an include filter for URLs.
         * Only URLs matching this regular expression will be crawled.
         * @param regexp The regular expression for the include filter.
         */
        public void addIncludeFilter(final String regexp) {
            if (StringUtil.isNotBlank(regexp)) {
                urlFilter.addInclude(regexp);
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 14K bytes
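The excerpt above shows only that `addIncludeFilter` skips blank input and forwards the expression to `urlFilter.addInclude`. As a rough illustration of the idea, here is a minimal, self-contained sketch of an include filter. The class `IncludeFilterSketch` and its methods are hypothetical and are not the fess-crawler `UrlFilter` implementation. Two behaviours are assumptions: a URL must match an expression in full, and every URL is accepted when no include filter has been registered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not the fess-crawler UrlFilter.
final class IncludeFilterSketch {
    private final List<Pattern> includes = new ArrayList<>();

    // Mirrors the blank check in the excerpt: blank expressions are ignored.
    void addInclude(final String regexp) {
        if (regexp != null && !regexp.trim().isEmpty()) {
            includes.add(Pattern.compile(regexp));
        }
    }

    // Assumption: with no include filters registered, every URL is accepted.
    // Assumption: the whole URL must match (matches(), not find()).
    boolean accepts(final String url) {
        if (includes.isEmpty()) {
            return true;
        }
        for (final Pattern p : includes) {
            if (p.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

With `addInclude("https://example\\.com/docs/.*")` registered, this sketch accepts `https://example.com/docs/a.html` and rejects `https://example.com/blog/`.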
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlXpathExtractor.java

     * </p>
     * <p>
     * The encoding of the HTML document is automatically detected using a regular expression that matches the charset attribute in the meta tag.
     * </p>
     *
     */
    public class HtmlXpathExtractor extends AbstractXmlExtractor {
        /**
         * Regular expression pattern to match the charset attribute in the meta tag of HTML documents.
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10.3K bytes
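The Javadoc above says the encoding is detected with a regular expression that matches the charset attribute in the meta tag, but the excerpt is cut off before the pattern itself. The sketch below shows one way such a detection could look. The class `CharsetSniffSketch` and its pattern are assumptions made for illustration and are not the pattern used by `HtmlXpathExtractor`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the HtmlXpathExtractor implementation.
final class CharsetSniffSketch {
    // Assumed pattern: a <meta ...> tag containing charset=NAME, with or
    // without quotes. Covers both <meta charset="utf-8"> and the older
    // http-equiv form with content="text/html; charset=...".
    private static final Pattern META_CHARSET =
            Pattern.compile("<meta[^>]*charset\\s*=\\s*[\"']?([\\w\\-]+)", Pattern.CASE_INSENSITIVE);

    // Returns the first charset name found in a meta tag, if any.
    static Optional<String> detect(final String html) {
        final Matcher m = META_CHARSET.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

In practice the crawler would run this kind of check over only the leading bytes of the document, decoded with a fallback encoding, since the real encoding is not yet known at that point.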
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/RobotsTxt.java

                }
            }
            return null;
        }
    
        /**
         * Adds a directive to the robots.txt rules.
         * The user-agent pattern in the directive is converted to a regular expression pattern,
         * where '*' is replaced with '.*' for pattern matching, and stored case-insensitively.
         *
         * @param directive The directive to add to the robots.txt rules
         */
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10K bytes
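The Javadoc above describes converting a user-agent pattern into a regular expression by replacing '*' with '.*' and handling it case-insensitively. The method body is not part of the excerpt, so the following is only a sketch of that conversion. The class `UserAgentPatternSketch` is hypothetical. Quoting the literal parts with `Pattern.quote` and using the `CASE_INSENSITIVE` flag are assumptions of this sketch, not confirmed behaviour of `RobotsTxt`.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch, not the RobotsTxt implementation.
final class UserAgentPatternSketch {
    // Converts a robots.txt user-agent value into a regex:
    // each '*' becomes '.*'; everything else is treated as a literal.
    static Pattern toPattern(final String userAgent) {
        final String regex = Arrays.stream(userAgent.split("\\*", -1))
                .map(Pattern::quote) // assumption: escape regex metacharacters
                .collect(Collectors.joining(".*"));
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }
}
```

For example, `toPattern("Googlebot*")` matches `googlebot-image` in full, and `toPattern("*")` matches any user-agent string.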
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

     *       specified in the HTML content.</li>
     *   <li><b>preloadSizeForCharset:</b> The number of bytes to read from the input
     *       stream to determine the character set encoding.</li>
     *   <li><b>invalidUrlPattern:</b> A regular expression pattern used to identify
     *       invalid URLs.</li>
     * </ul>
     *
     * <p>
     * <b>Usage:</b>
     * </p>
     * <p>
     * The {@code transform} method is the main entry point for transforming an HTML
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 28.5K bytes