Results 11 - 19 of 19 for CrawlerContext (0.07 sec)

  1. fess-crawler/src/test/java/org/codelibs/fess/crawler/CrawlerThreadTest.java

            super.setUp();
    
            crawlerThread = new CrawlerThread();
            crawlerContext = new CrawlerContext();
            crawlerContext.sessionId = "test-session";
            crawlerContext.numOfThread = 1;
            crawlerContext.maxThreadCheckCount = 10;
            crawlerContext.maxDepth = 3;
            crawlerContext.maxAccessCount = 0;
    
            urlQueueService = mock(UrlQueueService.class);
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 18.3K bytes
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/processor/impl/DefaultResponseProcessor.java

         *
         * @param crawlerContext the crawler context
         * @return true if access count is within limit, false otherwise
         */
        protected boolean checkAccessCount(final CrawlerContext crawlerContext) {
            if (crawlerContext.getMaxAccessCount() > 0) {
                return crawlerContext.incrementAndGetAccessCount() <= crawlerContext.getMaxAccessCount();
            }
            return true;
        }
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 12.5K bytes
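The `checkAccessCount` excerpt above can be tried in isolation with a minimal sketch. The `Context` class here is a hypothetical stand-in for `CrawlerContext`: only the two methods visible in the excerpt are modeled, and backing the counter with an `AtomicLong` is an assumption, not something the search result shows.

```java
import java.util.concurrent.atomic.AtomicLong;

public class AccessLimitSketch {

    // Hypothetical stand-in for CrawlerContext (assumption: atomic counter, fixed limit).
    static class Context {
        private final AtomicLong accessCount = new AtomicLong();
        private final long maxAccessCount;

        Context(final long maxAccessCount) {
            this.maxAccessCount = maxAccessCount;
        }

        long getMaxAccessCount() {
            return maxAccessCount;
        }

        long incrementAndGetAccessCount() {
            return accessCount.incrementAndGet();
        }
    }

    // Same shape as the excerpt: a limit of 0 (or less) means "no limit",
    // and in that case the counter is not incremented at all.
    static boolean checkAccessCount(final Context context) {
        if (context.getMaxAccessCount() > 0) {
            return context.incrementAndGetAccessCount() <= context.getMaxAccessCount();
        }
        return true;
    }

    public static void main(final String[] args) {
        final Context limited = new Context(2);
        System.out.println(checkAccessCount(limited)); // true  (1st access, limit 2)
        System.out.println(checkAccessCount(limited)); // true  (2nd access, limit 2)
        System.out.println(checkAccessCount(limited)); // false (3rd access exceeds limit)

        final Context unlimited = new Context(0);
        System.out.println(checkAccessCount(unlimited)); // true (limit disabled)
    }
}
```

Because the increment and the comparison use the value returned by a single atomic call, concurrent callers in this sketch each see a distinct count, so at most `maxAccessCount` of them get `true`.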
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/util/CrawlingParameterUtil.java

         * Otherwise, the provided {@code crawlerContext} is set in the thread-local storage.
         *
         * @param crawlerContext the {@link CrawlerContext} to be set for the current thread, or {@code null} to remove the context.
         */
        public static void setCrawlerContext(final CrawlerContext crawlerContext) {
            if (crawlerContext == null) {
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Nov 22 13:28:22 UTC 2025
    - 6.4K bytes
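The Javadoc excerpt above describes a set-or-remove contract on thread-local storage. A minimal sketch of that pattern follows; `ContextHolderSketch`, `setContext`, and `getContext` are hypothetical names, and a plain `String` stands in for `CrawlerContext`. It is not the actual body of `CrawlingParameterUtil`, which the search result truncates.

```java
public class ContextHolderSketch {

    // One slot per thread; a String stands in for CrawlerContext (assumption).
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Passing null removes the value for the current thread; anything else is stored.
    static void setContext(final String context) {
        if (context == null) {
            CONTEXT.remove();
        } else {
            CONTEXT.set(context);
        }
    }

    static String getContext() {
        return CONTEXT.get();
    }

    public static void main(final String[] args) throws InterruptedException {
        setContext("session-a");

        // A different thread has its own (empty) slot.
        final String[] seenByOther = new String[1];
        final Thread other = new Thread(() -> seenByOther[0] = getContext());
        other.start();
        other.join();

        System.out.println(getContext());   // session-a
        System.out.println(seenByOther[0]); // null

        setContext(null);
        System.out.println(getContext());   // null
    }
}
```

Calling `remove()` rather than `set(null)` lets the slot be released, which matters when threads are pooled and outlive a single crawl.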
  4. README.md

    ```java
    // Set maximum number of URLs to crawl
    crawler.crawlerContext.setMaxAccessCount(1000);
    
    // Set number of crawler threads
    crawler.crawlerContext.setNumOfThread(10);
    
    // Set maximum crawl depth
    crawler.crawlerContext.setMaxDepth(3);
    
    // Set request interval (politeness)
    crawler.crawlerContext.setDefaultIntervalTime(1000); // 1 second
    ```
    
    ### URL Filtering
    
    ```java
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Aug 31 05:32:52 UTC 2025
    - 15.3K bytes
  5. fess-crawler/src/test/java/org/codelibs/fess/crawler/client/http/HcHttpClientTest.java

            final String url = "http://localhost:7070/hoge.html";
            try {
                final CrawlerContext crawlerContext = new CrawlerContext();
                final String sessionId = "id1";
                urlFilter.init(sessionId);
                crawlerContext.setUrlFilter(urlFilter);
                CrawlingParameterUtil.setCrawlerContext(crawlerContext);
                httpClient.init();
                httpClient.processRobotsTxt(url);
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Sep 06 04:15:37 UTC 2025
    - 11.7K bytes
  6. fess-crawler-opensearch/src/test/java/org/codelibs/fess/crawler/CrawlerTest.java

                assertNotSame(crawler1.crawlerContext, crawler2.crawlerContext);
    
                for (int i = 0; i < 100; i++) {
                    if (crawler1.crawlerContext.getStatus() == CrawlerStatus.RUNNING) {
                        break;
                    }
                    Thread.sleep(50);
                }
                assertEquals(CrawlerStatus.RUNNING, crawler1.crawlerContext.getStatus());
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Sep 06 04:15:37 UTC 2025
    - 7.7K bytes
  7. CLAUDE.md

    void cleanup(String sessionId)  // Clean up session
    void stop()                     // Stop gracefully
    ```
    
    **Key Fields**: `crawlerContext`, `urlFilter`, `intervalController`, `clientFactory`, `ruleManager`
    
    ### CrawlerContext (`CrawlerContext.java`)
    
    Execution context and configuration.
    
    **Important Fields**:
    ```java
    String sessionId                // Format: yyyyMMddHHmmssSSS
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 28 17:31:34 UTC 2025
    - 10.7K bytes
  8. src/main/java/org/codelibs/fess/crawler/FessCrawlerThread.java

                        log(logHelper, LogType.NOT_MODIFIED, crawlerContext, urlQueue);
    
                        responseData.setExecutionTime(systemHelper.getCurrentTimeAsLong() - startTime);
                        responseData.setParentUrl(urlQueue.getParentUrl());
                        responseData.setSessionId(crawlerContext.getSessionId());
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Thu Dec 11 09:47:03 UTC 2025
    - 19.5K bytes
  9. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/HcHttpClient.java

                // not support robots.txt
                return;
            }
    
            // crawler context
            final CrawlerContext crawlerContext = CrawlingParameterUtil.getCrawlerContext();
            if (crawlerContext == null) {
                // wrong state
                return;
            }
    
            final int idx = url.indexOf('/', url.indexOf("://") + 3);
            String hostUrl;
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 12:19:14 UTC 2025
    - 53.7K bytes