Results 1 - 10 of 22 for featureset (3.46 sec)

  1. fess-crawler/src/test/resources/extractor/markdown/test.md

    title: Sample Markdown Document
    author: John Doe
    date: 2025-01-15
    tags:
      - crawler
      - extractor
      - markdown
    ---
    
    # Introduction
    
    This is a sample Markdown document for testing the MarkdownExtractor.
    
    ## Features
    
    The extractor should handle:
    
    - YAML front matter extraction
    - Heading structure
    - **Bold text** and *italic text*
    - Lists and other formatting
    
    ### Code Examples
    
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 03:46:53 UTC 2025
    - 767 bytes
    - Viewed (0)
  2. fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/MarkdownExtractorTest.java

            // Verify plain text extraction
            assertTrue(content.contains("Introduction"));
            assertTrue(content.contains("This is a sample Markdown document"));
            assertTrue(content.contains("Features"));
            assertTrue(content.contains("Code Examples"));
        }
    
        public void test_frontMatterExtraction() {
            final InputStream in = ResourceUtil.getResourceAsStream("extractor/markdown/test.md");
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 6.4K bytes
    - Viewed (0)
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

     * </ol>
     * <p>
     * The class also provides methods for configuring features and properties of the
     * underlying DOM parser, as well as defining rules for extracting child URLs
     * from specific HTML tags and attributes.
     * </p>
     *
     * <p>
     * <b>Configuration:</b>
     * </p>
     * <ul>
     *   <li><b>featureMap:</b> A map of features to be set on the DOM parser.</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Nov 29 07:42:33 UTC 2025
    - 30.5K bytes
    - Viewed (0)
  4. CLAUDE.md

    try (ResponseData responseData = client.execute(requestData)) {
        // Process
    }  // Temp files auto-deleted
    ```
    
    ---
    
    ## Best Practices for AI Assistants
    
    ### When Adding Features
    
    1. Read existing code first (use symbol overview tools)
    2. Follow existing patterns
    3. Add tests
    4. Handle resources properly (try-with-resources)
    5. Consider thread safety
    6. Update JavaDoc
    
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 28 17:31:34 UTC 2025
    - 10.7K bytes
    - Viewed (0)
  5. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/CrawlerClientFactory.java

     *
     * <p>This factory is typically initialized through dependency injection and can be
     * configured with initialization parameters that are passed to all registered clients.</p>
     *
     * <p>Features:</p>
     * <ul>
     *   <li>Pattern-based client mapping</li>
     *   <li>Ordered client registration</li>
     *   <li>Bulk client registration</li>
     *   <li>Automatic client initialization</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 7.3K bytes
    - Viewed (0)
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/MarkdownExtractor.java

    /**
     * Extracts text content and metadata from Markdown files.
     * This extractor provides better structured data extraction compared to Tika's generic text extraction.
     *
     * <p>Features:
     * <ul>
     *   <li>YAML front matter metadata extraction</li>
     *   <li>Heading structure extraction</li>
     *   <li>Link URL extraction</li>
     *   <li>Code block content extraction</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 03:46:53 UTC 2025
    - 8.2K bytes
    - Viewed (0)
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/CsvExtractor.java

    /**
     * Extracts text content and metadata from CSV files.
     * This extractor provides better structured data extraction compared to Tika's generic text extraction.
     *
     * <p>Features:
     * <ul>
     *   <li>Automatic delimiter detection (comma, tab, semicolon, pipe)</li>
     *   <li>Header row detection and extraction</li>
     *   <li>Column name to data value association</li>
     *   <li>Quoted field handling</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Dec 11 08:38:29 UTC 2025
    - 12.8K bytes
    - Viewed (0)
  8. CLAUDE.md

    mvn formatter:format license:format     # Format code and apply licenses
    mvn clean jacoco:prepare-agent test jacoco:report  # Generate coverage report
    ```
    
    ### Adding New Features
    
    1. Read related source files and tests
    2. Write implementation following existing patterns
    3. Add comprehensive tests
    4. Run `mvn formatter:format license:format test`
    5. Update JavaDoc for changed/new classes
    Registered: Sat Dec 20 13:04:59 UTC 2025
    - Last Modified: Mon Nov 24 03:40:05 UTC 2025
    - 8.9K bytes
    - Viewed (0)
  9. src/main/java/org/codelibs/fess/ds/DataStoreFactory.java

         * This method searches for 'fess_ds++.xml' configuration files within JAR files
         * in the data store plugin directory and extracts component class names.
         *
         * <p>The method uses secure XML parsing features to prevent XXE attacks and
         * other XML-based vulnerabilities. Component class names are extracted from
         * the 'class' attribute of 'component' elements in the XML files.</p>
         *
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Fri Nov 28 16:29:12 UTC 2025
    - 9K bytes
    - Viewed (0)
  10. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/PdfExtractor.java

     * to prevent hanging on problematic PDF files. It also extracts metadata from the PDF
     * document and includes it in the extraction result.
     *
     * <p>Features:
     * <ul>
     *   <li>Text extraction from PDF pages</li>
     *   <li>Embedded document extraction</li>
     *   <li>Annotation extraction (file attachments)</li>
     *   <li>Metadata extraction</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 12:19:14 UTC 2025
    - 12.8K bytes
    - Viewed (0)