Search Options

Results per page
Sort
Preferred Languages
Advance

Results 1 - 10 of 28 for featureset (0.82 sec)

  1. fess-crawler/src/test/resources/extractor/markdown/test.md

    title: Sample Markdown Document
    author: John Doe
    date: 2025-01-15
    tags:
      - crawler
      - extractor
      - markdown
    ---
    
    # Introduction
    
    This is a sample Markdown document for testing the MarkdownExtractor.
    
    ## Features
    
    The extractor should handle:
    
    - YAML front matter extraction
    - Heading structure
    - **Bold text** and *italic text*
    - Lists and other formatting
    
    ### Code Examples
    
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 03:46:53 UTC 2025
    - 767 bytes
    - Viewed (0)
  2. MIGRATION.md

    **Web Crawling**:
    - **Admin Path**: Crawler > Web
    - **Supports**: HTTP/HTTPS websites
    - **Features**: JavaScript rendering, authentication, custom headers
    
    **File Crawling**:
    - **Admin Path**: Crawler > File
    - **Supports**: SMB, FTP, local file systems
    - **Features**: Access control, file type filtering
    
    **Data Store Crawling**:
    - **Admin Path**: Crawler > Data Store
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Thu Nov 06 12:40:11 UTC 2025
    - 23.2K bytes
    - Viewed (0)
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlXpathExtractor.java

            }
    
            propertyMap.put(key, value);
        }
    
        /**
         * Gets the map of parser features.
         *
         * @return the feature map
         */
        public Map<String, String> getFeatureMap() {
            return featureMap;
        }
    
        /**
         * Sets the map of parser features.
         *
         * @param featureMap the feature map to set
         */
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Oct 04 08:47:19 UTC 2025
    - 10.4K bytes
    - Viewed (0)
  4. fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/MarkdownExtractorTest.java

            // Verify plain text extraction
            assertTrue(content.contains("Introduction"));
            assertTrue(content.contains("This is a sample Markdown document"));
            assertTrue(content.contains("Features"));
            assertTrue(content.contains("Code Examples"));
        }
    
        public void test_frontMatterExtraction() {
            final InputStream in = ResourceUtil.getResourceAsStream("extractor/markdown/test.md");
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 6.4K bytes
    - Viewed (0)
  5. README.md

    curl4j
    [![Java CI with Maven](https://github.com/codelibs/curl4j/actions/workflows/maven.yml/badge.svg)](https://github.com/codelibs/curl4j/actions/workflows/maven.yml)
    =====
    
    A simple cURL-like Java HTTP client.
    
    ## Features
    
    - Fluent API for building HTTP requests (GET, POST, PUT, DELETE, HEAD, OPTIONS, CONNECT, TRACE)
    - Support for query parameters, headers, body (String or stream), compression, SSL configuration, proxies, and timeouts
    Registered: Sat Dec 20 09:13:53 UTC 2025
    - Last Modified: Thu Nov 20 13:34:13 UTC 2025
    - 2.5K bytes
    - Viewed (0)
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

     * </ol>
     * <p>
     * The class also provides methods for configuring features and properties of the
     * underlying DOM parser, as well as defining rules for extracting child URLs
     * from specific HTML tags and attributes.
     * </p>
     *
     * <p>
     * <b>Configuration:</b>
     * </p>
     * <ul>
     *   <li><b>featureMap:</b> A map of features to be set on the DOM parser.</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Nov 29 07:42:33 UTC 2025
    - 30.5K bytes
    - Viewed (0)
  7. CLAUDE.md

    try (ResponseData responseData = client.execute(requestData)) {
        // Process
    }  // Temp files auto-deleted
    ```
    
    ---
    
    ## Best Practices for AI Assistants
    
    ### When Adding Features
    
    1. Read existing code first (use symbol overview tools)
    2. Follow existing patterns
    3. Add tests
    4. Handle resources properly (try-with-resources)
    5. Consider thread safety
    6. Update JavaDoc
    
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 28 17:31:34 UTC 2025
    - 10.7K bytes
    - Viewed (0)
  8. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlExtractor.java

            this.htmlTagPattern = htmlTagPattern;
        }
    
        /**
         * Gets the map of parser features.
         *
         * @return the feature map
         */
        public Map<String, String> getFeatureMap() {
            return featureMap;
        }
    
        /**
         * Sets the map of parser features.
         *
         * @param featureMap the feature map to set
         */
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Oct 04 08:47:19 UTC 2025
    - 9.3K bytes
    - Viewed (0)
  9. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/CrawlerClientFactory.java

     *
     * <p>This factory is typically initialized through dependency injection and can be
     * configured with initialization parameters that are passed to all registered clients.</p>
     *
     * <p>Features:</p>
     * <ul>
     *   <li>Pattern-based client mapping</li>
     *   <li>Ordered client registration</li>
     *   <li>Bulk client registration</li>
     *   <li>Automatic client initialization</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 7.3K bytes
    - Viewed (0)
  10. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/MarkdownExtractor.java

    /**
     * Extracts text content and metadata from Markdown files.
     * This extractor provides better structured data extraction compared to Tika's generic text extraction.
     *
     * <p>Features:
     * <ul>
     *   <li>YAML front matter metadata extraction</li>
     *   <li>Heading structure extraction</li>
     *   <li>Link URL extraction</li>
     *   <li>Code block content extraction</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 03:46:53 UTC 2025
    - 8.2K bytes
    - Viewed (0)
Back to top