Search Options

Results per page
Sort
Preferred Languages
Advance

Results 1 - 10 of 16 for structures (0.29 sec)

  1. src/main/java/org/codelibs/fess/util/GsaConfigParser.java

                throw new GsaConfigException("Failed to parse XML file.", e);
            }
        }
    
        /**
         * SAX event handler called at the beginning of document parsing.
         * Initializes internal data structures for processing the GSA configuration.
         *
         * @throws SAXException if a SAX error occurs during initialization
         */
        @Override
        public void startDocument() throws SAXException {
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Fri Nov 28 16:29:12 UTC 2025
    - 21.6K bytes
    - Viewed (0)
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/JsonExtractor.java

    /**
     * Extracts text content and metadata from JSON files.
     * This extractor provides better structured data extraction compared to Tika's generic text extraction.
     *
     * <p>Features:
     * <ul>
     *   <li>Structured text extraction with key-value pairs</li>
     *   <li>Top-level field extraction as metadata</li>
     *   <li>Nested structure flattening with configurable depth</li>
     *   <li>Array element extraction</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 03:46:53 UTC 2025
    - 9.7K bytes
    - Viewed (0)
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/MarkdownExtractor.java

    /**
     * Extracts text content and metadata from Markdown files.
     * This extractor provides better structured data extraction compared to Tika's generic text extraction.
     *
     * <p>Features:
     * <ul>
     *   <li>YAML front matter metadata extraction</li>
     *   <li>Heading structure extraction</li>
     *   <li>Link URL extraction</li>
     *   <li>Code block content extraction</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 03:46:53 UTC 2025
    - 8.2K bytes
    - Viewed (0)
  4. fess-crawler/src/test/resources/extractor/markdown/test.md

      - markdown
    ---
    
    # Introduction
    
    This is a sample Markdown document for testing the MarkdownExtractor.
    
    ## Features
    
    The extractor should handle:
    
    - YAML front matter extraction
    - Heading structure
    - **Bold text** and *italic text*
    - Lists and other formatting
    
    ### Code Examples
    
    Here is some inline `code` and a code block:
    
    ```java
    public class Example {
        public static void main(String[] args) {
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 03:46:53 UTC 2025
    - 767 bytes
    - Viewed (0)
  5. src/test/java/org/codelibs/fess/suggest/SuggesterResourceLoadingTest.java

                assertTrue("Content should be valid", content.length() > 0);
                assertTrue("Should not contain encoding errors", !content.contains("\uFFFD"));
    
                // Verify JSON structure is intact
                final int openBraces = content.length() - content.replace("{", "").length();
                final int closeBraces = content.length() - content.replace("}", "").length();
    Registered: Sat Dec 20 13:04:59 UTC 2025
    - Last Modified: Mon Nov 24 03:40:05 UTC 2025
    - 9.6K bytes
    - Viewed (0)
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/interval/impl/AbstractIntervalController.java

    import org.codelibs.fess.crawler.exception.CrawlerSystemException;
    import org.codelibs.fess.crawler.interval.IntervalController;
    
    /**
     * An abstract base class for implementing {@link IntervalController}.
     * Provides a common structure for handling delays at different stages of the crawling process.
     * It encapsulates the delay logic and exception handling, allowing subclasses to focus on
     * defining the specific delay behavior for each stage.
     *
     * <p>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Nov 20 08:58:39 UTC 2025
    - 4.8K bytes
    - Viewed (0)
  7. src/main/java/org/codelibs/fess/util/FacetResponse.java

    import com.google.common.io.BaseEncoding;
    
    /**
     * Response object for faceted search results containing query counts and field facets.
     * This class processes OpenSearch aggregations to provide structured facet information
     * for search result filtering and navigation.
     */
    public class FacetResponse {
        /**
         * Map containing query facet counts, where keys are decoded query strings
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Sun Nov 23 11:39:05 UTC 2025
    - 5.3K bytes
    - Viewed (0)
  8. CLAUDE.md

    ### Content Formats
    
    Office (Word, Excel, PowerPoint), PDF, Archives (ZIP, TAR, GZ), HTML, XML, JSON, Media (audio/video metadata), Images (EXIF/IPTC/XMP)
    
    ---
    
    ## Architecture
    
    ### Module Structure
    
    ```
    fess-crawler-parent/
    ├── fess-crawler/              # Core framework
    ├── fess-crawler-lasta/        # LastaFlute DI integration
    └── fess-crawler-opensearch/   # OpenSearch backend
    ```
    
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 28 17:31:34 UTC 2025
    - 10.7K bytes
    - Viewed (0)
  9. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/CsvExtractor.java

    import org.codelibs.fess.crawler.entity.ExtractData;
    import org.codelibs.fess.crawler.exception.ExtractException;
    
    /**
     * Extracts text content and metadata from CSV files.
     * This extractor provides better structured data extraction compared to Tika's generic text extraction.
     *
     * <p>Features:
     * <ul>
     *   <li>Automatic delimiter detection (comma, tab, semicolon, pipe)</li>
     *   <li>Header row detection and extraction</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Dec 11 08:38:29 UTC 2025
    - 12.8K bytes
    - Viewed (0)
  10. CLAUDE.md

    ## Technical Details
    
    - **Java Version**: 17+
    - **Dependencies**:
      - Apache Commons IO 2.19.0 (runtime)
      - JUnit 4.13.2 (test)
    - **Module Name**: `org.codelibs.curl4j`
    - **Package Structure**: `org.codelibs.curl.*`
    - **Build Tool**: Maven 3.x
    - **Test Coverage**: JaCoCo plugin enabled
    
    ## Code Style
    
    - Uses external Eclipse formatter configuration from CodeLibs
    Registered: Sat Dec 20 09:13:53 UTC 2025
    - Last Modified: Mon Nov 24 03:10:07 UTC 2025
    - 3.2K bytes
    - Viewed (0)
Back to top