Results 21 - 30 of 69 for include (0.03 sec)

  1. fess-crawler/src/test/java/org/codelibs/fess/crawler/filter/UrlFilterTest.java

        }
    
        /**
         * Test combination of include and exclude patterns
         */
        public void test_match_includeAndExclude() {
            String sessionId = "test-session-008";
            urlFilter.init(sessionId);
    
            // Include only example.com domain
            urlFilter.addInclude("https://example.com/.*");
            // But exclude images and admin section
            urlFilter.addExclude(".*\\.(jpg|png|gif)$");
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Wed Sep 03 14:42:53 UTC 2025
    - 19K bytes
    - Viewed (0)
  2. fess-crawler-lasta/src/main/resources/crawler/encoding.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE components PUBLIC "-//DBFLUTE//DTD LastaDi 1.0//EN"
    	"http://dbflute.org/meta/lastadi10.dtd">
    <components namespace="fessCrawler">
    	<include path="crawler/container.xml" />
    
    	<component name="encodingHelper" class="org.codelibs.fess.crawler.helper.EncodingHelper">
    		<postConstruct name="addEncodingMapping">
    			<arg>"unicode"</arg>
    			<arg>"UTF-16LE"</arg>
    		</postConstruct>
    	</component>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Fri Jun 16 13:35:06 UTC 2017
    - 454 bytes
    - Viewed (0)
  3. fess-crawler-lasta/src/main/resources/crawler/robotstxt.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE components PUBLIC "-//DBFLUTE//DTD LastaDi 1.0//EN"
    	"http://dbflute.org/meta/lastadi10.dtd">
    <components namespace="fessCrawler">
    	<include path="crawler/container.xml" />
    
    	<component name="robotsTxtHelper" class="org.codelibs.fess.crawler.helper.RobotsTxtHelper"
    		instance="prototype">
    	</component>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Oct 11 02:16:55 UTC 2015
    - 367 bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/LogHelper.java

     */
    public interface LogHelper {
    
        /**
         * Logs a message with the specified log type and additional objects.
         *
         * @param key  the type of log message
         * @param objs additional objects to include in the log message
         */
        void log(LogType key, Object... objs);
    
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sat Mar 15 06:52:00 UTC 2025
    - 1.1K bytes
    - Viewed (0)
  5. LICENSE

              excluding those notices that do not pertain to any part of
              the Derivative Works; and
    
          (d) If the Work includes a "NOTICE" text file as part of its
              distribution, then any Derivative Works that You distribute must
              include a readable copy of the attribution notices contained
              within such NOTICE file, excluding those notices that do not
    Registered: Fri Sep 19 09:08:11 UTC 2025
    - Last Modified: Mon Jan 11 04:30:09 UTC 2021
    - 11.1K bytes
    - Viewed (0)
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/util/CharUtil.java

         */
        private CharUtil() {
        }
    
        /**
         * Checks if the given character is a valid URL character.
         *
         * Valid URL characters include:
         * - Lowercase letters (a-z)
         * - Uppercase letters (A-Z)
         * - Digits (0-9)
         * - Special characters: . - * _ : / + % = & ? # [ ] @ ~ ! $ ' ( ) , ;
         *
         * @param c the character to check
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 2.3K bytes
    - Viewed (1)
  7. LICENSE

              excluding those notices that do not pertain to any part of
              the Derivative Works; and
    
          (d) If the Work includes a "NOTICE" text file as part of its
              distribution, then any Derivative Works that You distribute must
              include a readable copy of the attribution notices contained
              within such NOTICE file, excluding those notices that do not
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Mon Jan 11 04:26:17 UTC 2021
    - 11.1K bytes
    - Viewed (0)
  8. fess-crawler-opensearch/src/main/java/org/codelibs/fess/crawler/entity/OpenSearchUrlFilter.java

         */
        private String id;
    
        /**
         * The session ID associated with this URL filter.
         */
        private String sessionId;
    
        /**
         * The type of filter (e.g., include, exclude).
         */
        private String filterType;
    
        /**
         * The URL pattern for this filter.
         */
        private String url;
    
        /**
         * Returns the ID.
         * @return The ID.
         */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 3.6K bytes
    - Viewed (0)
  9. fess-crawler-lasta/src/main/resources/crawler/transformer_basic.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE components PUBLIC "-//DBFLUTE//DTD LastaDi 1.0//EN"
    	"http://dbflute.org/meta/lastadi10.dtd">
    <components namespace="fessCrawler">
    	<include path="crawler/container.xml" />
    
    	<component name="binaryTransformer"
    		class="org.codelibs.fess.crawler.transformer.impl.BinaryTransformer"
    		instance="singleton">
    		<property name="name">"binaryTransformer"</property>
    	</component>
    
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Sep 30 21:21:24 UTC 2018
    - 3.3K bytes
    - Viewed (0)
  10. README.md

    // Set request interval (politeness)
    crawler.crawlerContext.setDefaultIntervalTime(1000); // 1 second
    ```
    
    ### URL Filtering
    
    ```java
    // Include patterns
    crawler.urlFilter.addInclude("https://example.com/.*");
    crawler.urlFilter.addInclude(".*\\.pdf$");
    
    // Exclude patterns
    crawler.urlFilter.addExclude(".*\\.js$");
    crawler.urlFilter.addExclude(".*login.*");
    ```
    
    ## Supported Protocols and Formats
    
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Aug 31 05:32:52 UTC 2025
    - 15.3K bytes
    - Viewed (0)
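The include and exclude snippets in results 1 and 10 can be tried out with a small self-contained sketch. `SimpleUrlFilter` below is a hypothetical stand-in written only for illustration; it is not the fess-crawler `UrlFilter` implementation. It assumes that a URL is accepted when it fully matches at least one include pattern (or when no include patterns are registered) and matches no exclude pattern. The real library may differ, so check its source before relying on this behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class; NOT the fess-crawler UrlFilter.
class SimpleUrlFilter {
    private final List<Pattern> includes = new ArrayList<>();
    private final List<Pattern> excludes = new ArrayList<>();

    void addInclude(String regex) {
        includes.add(Pattern.compile(regex));
    }

    void addExclude(String regex) {
        excludes.add(Pattern.compile(regex));
    }

    // Assumed semantics: accept if any include matches (or none are set)
    // and no exclude matches. Patterns must match the whole URL.
    boolean match(String url) {
        boolean included = includes.isEmpty();
        for (Pattern p : includes) {
            if (p.matcher(url).matches()) {
                included = true;
                break;
            }
        }
        if (!included) {
            return false;
        }
        for (Pattern p : excludes) {
            if (p.matcher(url).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleUrlFilter filter = new SimpleUrlFilter();
        // Same patterns as the README snippet in result 10
        filter.addInclude("https://example.com/.*");
        filter.addInclude(".*\\.pdf$");
        filter.addExclude(".*\\.js$");
        filter.addExclude(".*login.*");

        String[] urls = {
            "https://example.com/docs/index.html",
            "https://example.com/app.js",
            "https://other.org/paper.pdf",
            "https://example.com/login",
            "https://other.org/index.html"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + filter.match(url));
        }
    }
}
```

Under these assumed semantics, exclude patterns take precedence over include patterns, so `https://example.com/app.js` is rejected even though it matches the first include pattern.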