Results 1 - 10 of 45 for Directives (0.1 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/RobotsTxt.java

         */
        public int getCrawlDelay(final String userAgent) {
            final Directive directive = getMatchedDirective(userAgent);
            if (directive == null) {
                return 0;
            }
            return directive.getCrawlDelay();
        }
    
        /**
         * Returns the most specific directive matching the given user agent.
         * The method finds the longest matching user agent pattern in the directives,
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 18.5K bytes
    - Viewed (0)
  2. okhttp/src/commonJvmAndroid/kotlin/okhttp3/CacheControl.kt

    import okhttp3.internal.commonNoTransform
    import okhttp3.internal.commonOnlyIfCached
    import okhttp3.internal.commonParse
    import okhttp3.internal.commonToString
    
    /**
     * A Cache-Control header with cache directives from a server or client. These directives set policy
     * on what responses can be stored, and which requests can be satisfied by those stored responses.
     *
     * See [RFC 7234, 5.2](https://tools.ietf.org/html/rfc7234#section-5.2).
     */
    Registered: Fri Dec 26 11:42:13 UTC 2025
    - Last Modified: Fri Dec 27 13:39:56 UTC 2024
    - 10K bytes
    - Viewed (0)
  3. fess-crawler/src/test/resources/org/codelibs/fess/crawler/helper/robots_malformed.txt

    # Test file for malformed robots.txt parsing
    # This file contains various malformed directives that should be handled gracefully
    
    # Case 1: Directives before any User-agent (should be ignored)
    Disallow: /orphaned1/
    Allow: /orphaned2/
    
    # Case 2: Valid user-agent with various malformed directives
    User-agent: GoodBot
    Disallow: /admin/
    InvalidDirective: some-value
    unknown-field: test
    Disallow /missing-colon
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 14 12:52:01 UTC 2025
    - 2.6K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/RobotsTxtHelper.java

                                for (final Directive directive : currentDirectiveList) {
                                    directive.addAllow(value);
                                }
                            }
                            continue;
                        }
    
                        // Try to parse as Crawl-delay directive
                        value = getValue(CRAWL_DELAY_RECORD, line);
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 14 12:52:01 UTC 2025
    - 11.4K bytes
    - Viewed (0)
  5. fess-crawler/src/test/java/org/codelibs/fess/crawler/helper/RobotsTxtHelperTest.java

            assertNotNull(robotsTxt);
    
            // Test that orphaned directives (before any User-agent) are ignored
            // These should not affect any bot
            assertTrue(robotsTxt.allows("/orphaned1/", "AnyBot"));
            assertTrue(robotsTxt.allows("/orphaned2/", "AnyBot"));
    
            // Test GoodBot - should parse valid directives and ignore invalid ones
            assertNotNull(robotsTxt.getDirective("goodbot"));
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 20.6K bytes
    - Viewed (0)
  6. doc/asm.html

    own source code, making it easier to move the code from one location to another.
    </p>
    
    <h3 id="directives">Directives</h3>
    
    <p>
    The assembler uses various directives to bind text and data to symbol names.
    For example, here is a simple complete function definition. The <code>TEXT</code>
    directive declares the symbol <code>runtime·profileloop</code> and the instructions
    that follow form the body of the function.
    Registered: Tue Dec 30 11:13:12 UTC 2025
    - Last Modified: Fri Nov 14 19:09:46 UTC 2025
    - 36.5K bytes
    - Viewed (0)
  7. doc/godebug.md

    Only the work module's `go.mod` is consulted for `godebug` directives.
    Any directives in required dependency modules are ignored.
    It is an error to list a `godebug` with an unrecognized setting.
    (Toolchains older than Go 1.23 reject all `godebug` lines, since they do not
    understand `godebug` at all.) When a workspace is in use, `godebug`
    directives in `go.mod` files are ignored, and `go.work` will be consulted
    Registered: Tue Dec 30 11:13:12 UTC 2025
    - Last Modified: Wed Dec 03 00:18:09 UTC 2025
    - 24.7K bytes
    - Viewed (0)
  8. fess-crawler/src/test/java/org/codelibs/fess/crawler/entity/RobotsTxtTest.java

            Directive directive = new Directive("MyBot");
    
            assertNotNull(directive);
            assertEquals("MyBot", directive.getUserAgent());
            assertEquals(0, directive.getCrawlDelay());
        }
    
        public void test_directiveCrawlDelay() {
            // Test Directive crawl delay
            Directive directive = new Directive("MyBot");
    
            directive.setCrawlDelay(10);
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Nov 13 13:29:22 UTC 2025
    - 14.4K bytes
    - Viewed (0)
  9. okhttp/src/commonJvmAndroid/kotlin/okhttp3/Request.kt

       * key.
       */
      fun <T> tag(type: Class<out T>): T? = tag(type.kotlin)
    
      fun newBuilder(): Builder = Builder(this)
    
      /**
       * Returns the cache control directives for this response. This is never null, even if this
       * response contains no `Cache-Control` header.
       */
      @get:JvmName("cacheControl")
      val cacheControl: CacheControl
        get() {
    Registered: Fri Dec 26 11:42:13 UTC 2025
    - Last Modified: Thu Oct 30 13:46:58 UTC 2025
    - 14.7K bytes
    - Viewed (1)
  10. src/main/java/org/codelibs/fess/crawler/transformer/FessXpathTransformer.java

            }
        }
    
        /**
         * Processes robots meta tags in the HTML document.
         * Handles noindex, nofollow, and none directives.
         *
         * @param responseData the response data from crawling
         * @param resultData the result data to store processed information
         * @param document the parsed HTML document
         */
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Fri Dec 12 13:58:40 UTC 2025
    - 54.6K bytes
    - Viewed (0)