Results 111 - 120 of 275 for Crawling (0.25 sec)

  1. src/main/java/org/codelibs/fess/mylasta/direction/FessConfig.java

        /** The key of the configuration. e.g. 100 */
        String PAGE_CRAWLING_INFO_PARAM_MAX_FETCH_SIZE = "page.crawling.info.param.max.fetch.size";
    
        /** The key of the configuration. e.g. 1000 */
        String PAGE_CRAWLING_INFO_MAX_FETCH_SIZE = "page.crawling.info.max.fetch.size";
    
        /** The key of the configuration. e.g. 100 */
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Sat Dec 13 02:21:17 UTC 2025
    - 525.7K bytes
    - Viewed (2)
  2. fess-crawler/src/test/java/org/codelibs/fess/crawler/builder/RequestDataBuilderTest.java

            RequestData data = RequestDataBuilder.newRequestData().url(null).build();
    
            assertNull(data.getUrl());
        }
    
        public void test_realWorldUsageExample1() {
            // Real-world example: crawling a web page
            RequestData data = RequestDataBuilder.newRequestData().get().url("https://example.com/article/12345").weight(1.0f).build();
    
            assertNotNull(data);
            assertEquals(Method.GET, data.getMethod());
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 10.9K bytes
    - Viewed (0)
  3. fess-crawler/src/test/java/org/codelibs/fess/crawler/interval/impl/HostIntervalControllerTest.java

    import org.dbflute.utflute.core.PlainTestCase;
    
    /**
     * @author hayato
     *
     */
    public class HostIntervalControllerTest extends PlainTestCase {
    
        /**
         * Test that crawling intervals for the same host work correctly.
         */
        public void test_delayBeforeProcessing() {
            // Number of concurrent tasks
            final int numTasks = 100;
            // Interval in milliseconds
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 11.4K bytes
    - Viewed (0)
  4. docs/de/README.md

    ![Administration interface](https://fess.codelibs.org/_images/fess_admin_dashboard.png)
    
    In the administration interface, you can register crawl targets on the crawler configuration pages (Web, File, Data Store) and start the crawler manually on the [Scheduler page](https://fess.codelibs.org/15.3/admin/scheduler-guide.html).
    
    ## Migrating from another search provider
    
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Tue Nov 11 22:42:32 UTC 2025
    - 7.8K bytes
    - Viewed (0)
  5. ADDING_NEW_LANGUAGE.md

    2. **Browser header**: `Accept-Language` header
    3. **Fallback**: English (from `fess_label.properties` and `fess_message.properties`)
    
    ### Document Language Detection
    
    During crawling and indexing, Fess:
    
    1. Detects language from document content using Apache Tika
    2. Validates against `supported.languages` list
    3. Creates language-specific fields (e.g., `content_ja`, `title_en`, `content_sv`)
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Thu Nov 06 11:36:30 UTC 2025
    - 10.4K bytes
    - Viewed (1)
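
The validation and field-naming steps in the ADDING_NEW_LANGUAGE.md excerpt above can be illustrated with a minimal sketch. It is not Fess code: the class `LanguageFieldSketch` and the method `languageField` are hypothetical names, the detected language is passed in as a plain string instead of coming from Apache Tika, and falling back to the unsuffixed field for an unsupported language is an assumption, since the excerpt does not say what happens in that case.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of the Fess code base.
public class LanguageFieldSketch {

    /**
     * Builds a language-specific field name such as "content_ja".
     * Assumption: when the detected language is missing or not in the
     * supported list, the plain base field name is returned.
     */
    public static String languageField(String baseField, String detectedLang, Set<String> supported) {
        if (detectedLang == null) {
            return baseField;
        }
        String lang = detectedLang.toLowerCase(Locale.ROOT);
        return supported.contains(lang) ? baseField + "_" + lang : baseField;
    }

    public static void main(String[] args) {
        // Stand-in for the `supported.languages` list mentioned in the excerpt.
        Set<String> supported = Set.of("ja", "en", "sv");
        System.out.println(languageField("content", "ja", supported));
        System.out.println(languageField("title", "EN", supported));
        System.out.println(languageField("content", "xx", supported));
    }
}
```

Taking the detected language as an input keeps the sketch free of external dependencies; a real pipeline would obtain it from the detector before this step.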
  6. fess-crawler-opensearch/src/test/java/org/codelibs/fess/crawler/service/impl/OpenSearchUrlQueueServiceTest.java

            final String sessionId = "poll_session5";
            final int maxSize = 5;
            urlQueueService.setMaxCrawlingQueueSize(maxSize);
    
            // Insert more items than max crawling queue size
            final List<OpenSearchUrlQueue> urlQueueList = new ArrayList<>();
            for (int i = 1; i <= maxSize + 10; i++) {
                final OpenSearchUrlQueue urlQueue = new OpenSearchUrlQueue();
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Nov 20 08:40:57 UTC 2025
    - 14.3K bytes
    - Viewed (0)
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/RobotsTxtHelper.java

     * </ul>
     *
     * <p>References:</p>
     * <ul>
     * <li><a href="https://datatracker.ietf.org/doc/html/rfc9309">RFC 9309 - Robots Exclusion Protocol</a></li>
     * <li><a href="https://developers.google.com/search/docs/crawling-indexing/robots/robots_txt">
     * Google's robots.txt Specification</a></li>
     * </ul>
     *
     * @author bowez
     * @author shinsuke
     *
     */
    public class RobotsTxtHelper {
    
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 14 12:52:01 UTC 2025
    - 11.4K bytes
    - Viewed (0)
  8. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

    import org.w3c.dom.Node;
    import org.xml.sax.InputSource;
    
    import jakarta.annotation.Resource;
    
    /**
     * The {@code HtmlTransformer} class is responsible for transforming HTML responses
     * during the crawling process. It extracts data, identifies child URLs, and handles
     * character set encoding.
     * <p>
     * This class extends {@link AbstractTransformer} and utilizes various helper classes
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Nov 29 07:42:33 UTC 2025
    - 30.5K bytes
    - Viewed (0)
  9. build-logic/kotlin-dsl-shared-runtime/src/main/kotlin/org/gradle/kotlin/dsl/internal/sharedruntime/codegen/ApiTypeProvider.kt

         * Tests whether a method is a prime declaration or an override that changes the signature.
         *
         * There's no way to tell from the byte code that a method overrides the signature
         * of a parent declaration other than crawling up the type hierarchy.
         */
        private
        fun isSignificantDeclaration(methodNode: MethodNode): Boolean {
    
            if (methodNode.access.isSynthetic) return false
    
            if (!hasSuperType) return true
    Registered: Wed Dec 31 11:36:14 UTC 2025
    - Last Modified: Wed Mar 12 15:56:18 UTC 2025
    - 20.2K bytes
    - Viewed (0)
  10. fess-crawler/src/test/java/org/codelibs/fess/crawler/transformer/TransformerTest.java

            transformer.addTransformationRule("<[^>]+>", ""); // Remove HTML tags
            transformer.addTransformationRule("\\s+", " "); // Normalize whitespace
    
            // Simulate crawling response
            ResponseData responseData = new ResponseData();
            responseData.setUrl("http://example.com/page.html");
            responseData.setParentUrl("http://example.com/");
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Sep 06 04:15:37 UTC 2025
    - 28K bytes
    - Viewed (0)
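
The two regex rules in the TransformerTest.java excerpt above (remove HTML tags, then normalize whitespace) can be tried in isolation with the following sketch. It uses only `java.util.regex`. The class `TransformationRuleSketch` and its `addRule` and `apply` methods are hypothetical stand-ins, not the transformer API used in the test, and applying the rules in insertion order is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the fess-crawler transformer API.
public class TransformationRuleSketch {

    // One compiled pattern paired with its replacement text.
    private static final class Rule {
        final Pattern pattern;
        final String replacement;

        Rule(Pattern pattern, String replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Registers a rule; rules are applied in the order they were added (assumption). */
    public TransformationRuleSketch addRule(String regex, String replacement) {
        rules.add(new Rule(Pattern.compile(regex), replacement));
        return this;
    }

    /** Runs every rule over the input, feeding each result into the next rule. */
    public String apply(String input) {
        String result = input;
        for (Rule rule : rules) {
            result = rule.pattern.matcher(result).replaceAll(rule.replacement);
        }
        return result;
    }

    public static void main(String[] args) {
        TransformationRuleSketch sketch = new TransformationRuleSketch()
                .addRule("<[^>]+>", "")   // Remove HTML tags
                .addRule("\\s+", " ");    // Normalize whitespace
        System.out.println(sketch.apply("<p>Hello   <b>world</b></p>")); // prints "Hello world"
    }
}
```

Order matters here: stripping tags first can leave runs of whitespace behind, which the second rule then collapses into single spaces.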