Results 11 - 15 of 15 for patternset (0.03 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/RobotsTxt.java

     * the most specific (longest) match is used.</p>
     *
     */
    public class RobotsTxt {
        private static final String ALL_BOTS = "*";
    
        /** Map of user agent patterns to their corresponding directives. */
        protected final Map<Pattern, Directive> directiveMap = new LinkedHashMap<>();
    
        /** List of sitemap URLs found in the robots.txt file. */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10K bytes
    - Viewed (0)
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/HcHttpClient.java

            this.cookieSpecRegistry = cookieSpecRegistry;
        }
    
        /**
         * Sets the cookie date patterns for parsing.
         *
         * @param cookieDatePatterns The cookie date patterns
         */
        public void setCookieDatePatterns(final String[] cookieDatePatterns) {
            this.cookieDatePatterns = cookieDatePatterns;
        }
    
        /**
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 52.2K bytes
    - Viewed (0)
  3. fess-crawler/src/test/java/org/codelibs/fess/crawler/rule/impl/AbstractRuleTest.java

            ConditionalAbstractRule conditionalRule = new ConditionalAbstractRule();
            conditionalRule.crawlerContainer = container;
            conditionalRule.setRuleId("conditionalRule");
    
            // Set patterns
            conditionalRule.setUrlPattern("https?://.*\\.example\\.com/.*");
            conditionalRule.setMimeTypePattern("text/.*");
    
            // Test matching
            ResponseData responseData1 = new ResponseData();
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Wed Sep 03 14:42:53 UTC 2025
    - 21.9K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/container/StandardCrawlerContainer.java

    /**
     * A container implementation that manages the lifecycle and dependency injection of components
     * in a crawler application. This container supports both singleton and prototype component
     * instantiation patterns.
     *
     * <p>The container provides mechanisms for:
     * <ul>
     *   <li>Registering and retrieving components by name</li>
     *   <li>Managing singleton instances with lifecycle hooks</li>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 14.3K bytes
    - Viewed (0)
  5. fess-crawler/src/main/resources/org/codelibs/fess/crawler/mime/tika-mimetypes.xml

      sources like Apache Nutch, Apache HTTP Server, the file(1) command, etc.
    
      Notes:
       * Tika supports a wider range of match types than Freedesktop does
       * Glob patterns must be unique, if there's a clash assign to the most
         popular format
       * The main mime type should be the canonical one, use aliases for any
         other widely used forms
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Mar 13 08:18:01 UTC 2025
    - 320.1K bytes
    - Viewed (1)