Results 1 - 10 of 22 for featureset (3.46 sec)
fess-crawler/src/test/resources/extractor/markdown/test.md
title: Sample Markdown Document author: John Doe date: 2025-01-15 tags: - crawler - extractor - markdown --- # Introduction This is a sample Markdown document for testing the MarkdownExtractor. ## Features The extractor should handle: - YAML front matter extraction - Heading structure - **Bold text** and *italic text* - Lists and other formatting ### Code Examples
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Nov 23 03:46:53 UTC 2025 - 767 bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/MarkdownExtractorTest.java
// Verify plain text extraction assertTrue(content.contains("Introduction")); assertTrue(content.contains("This is a sample Markdown document")); assertTrue(content.contains("Features")); assertTrue(content.contains("Code Examples")); } public void test_frontMatterExtraction() { final InputStream in = ResourceUtil.getResourceAsStream("extractor/markdown/test.md");
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Mon Nov 24 03:59:47 UTC 2025 - 6.4K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java
* </ol> * <p> * The class also provides methods for configuring features and properties of the * underlying DOM parser, as well as defining rules for extracting child URLs * from specific HTML tags and attributes. * </p> * * <p> * <b>Configuration:</b> * </p> * <ul> * <li><b>featureMap:</b> A map of features to be set on the DOM parser.</li>
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Nov 29 07:42:33 UTC 2025 - 30.5K bytes - Viewed (0) -
CLAUDE.md
try (ResponseData responseData = client.execute(requestData)) { // Process } // Temp files auto-deleted ``` --- ## Best Practices for AI Assistants ### When Adding Features 1. Read existing code first (use symbol overview tools) 2. Follow existing patterns 3. Add tests 4. Handle resources properly (try-with-resources) 5. Consider thread safety 6. Update JavaDoc
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Fri Nov 28 17:31:34 UTC 2025 - 10.7K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/CrawlerClientFactory.java
* * <p>This factory is typically initialized through dependency injection and can be * configured with initialization parameters that are passed to all registered clients.</p> * * <p>Features:</p> * <ul> * <li>Pattern-based client mapping</li> * <li>Ordered client registration</li> * <li>Bulk client registration</li> * <li>Automatic client initialization</li>
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Mon Nov 24 03:59:47 UTC 2025 - 7.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/MarkdownExtractor.java
/** * Extracts text content and metadata from Markdown files. * This extractor provides better structured data extraction compared to Tika's generic text extraction. * * <p>Features: * <ul> * <li>YAML front matter metadata extraction</li> * <li>Heading structure extraction</li> * <li>Link URL extraction</li> * <li>Code block content extraction</li>
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Nov 23 03:46:53 UTC 2025 - 8.2K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/CsvExtractor.java
/** * Extracts text content and metadata from CSV files. * This extractor provides better structured data extraction compared to Tika's generic text extraction. * * <p>Features: * <ul> * <li>Automatic delimiter detection (comma, tab, semicolon, pipe)</li> * <li>Header row detection and extraction</li> * <li>Column name to data value association</li> * <li>Quoted field handling</li>
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Dec 11 08:38:29 UTC 2025 - 12.8K bytes - Viewed (0) -
CLAUDE.md
mvn formatter:format license:format # Format code and apply licenses mvn clean jacoco:prepare-agent test jacoco:report # Generate coverage report ``` ### Adding New Features 1. Read related source files and tests 2. Write implementation following existing patterns 3. Add comprehensive tests 4. Run `mvn formatter:format license:format test` 5. Update JavaDoc for changed/new classes
Registered: Sat Dec 20 13:04:59 UTC 2025 - Last Modified: Mon Nov 24 03:40:05 UTC 2025 - 8.9K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/ds/DataStoreFactory.java
* This method searches for 'fess_ds++.xml' configuration files within JAR files * in the data store plugin directory and extracts component class names. * * <p>The method uses secure XML parsing features to prevent XXE attacks and * other XML-based vulnerabilities. Component class names are extracted from * the 'class' attribute of 'component' elements in the XML files.</p> *
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Fri Nov 28 16:29:12 UTC 2025 - 9K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/PdfExtractor.java
* to prevent hanging on problematic PDF files. It also extracts metadata from the PDF * document and includes it in the extraction result. * * <p>Features: * <ul> * <li>Text extraction from PDF pages</li> * <li>Embedded document extraction</li> * <li>Annotation extraction (file attachments)</li> * <li>Metadata extraction</li>
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Nov 23 12:19:14 UTC 2025 - 12.8K bytes - Viewed (0)