Results 1 - 10 of 12 for fixtures (0.1 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/Constants.java
*/ public static final String FEATURE_EXTERNAL_GENERAL_ENTITIES = "http://xml.org/sax/features/external-general-entities"; /** * Feature for external parameter entities in XML. */ public static final String FEATURE_EXTERNAL_PARAMETER_ENTITIES = "http://xml.org/sax/features/external-parameter-entities"; static { DEFAULT_CHARSET = Charset.defaultCharset(); }
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 3.6K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlXpathExtractor.java
} propertyMap.put(key, value); } /** * Gets the map of parser features. * * @return the feature map */ public Map<String, String> getFeatureMap() { return featureMap; } /** * Sets the map of parser features. * * @param featureMap the feature map to set */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java
* </ol> * <p> * The class also provides methods for configuring features and properties of the * underlying DOM parser, as well as defining rules for extracting child URLs * from specific HTML tags and attributes. * </p> * * <p> * <b>Configuration:</b> * </p> * <ul> * <li><b>featureMap:</b> A map of features to be set on the DOM parser.</li>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 28.5K bytes - Viewed (0) -
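The featureMap entries in the results above map a feature URI to a "true" or "false" string. A minimal sketch of applying such a map to a parser follows. It does not reproduce HtmlTransformer's own code, which the snippet does not show; it assumes the JDK's DocumentBuilderFactory as a stand-in parser, and the names FeatureMapSketch and newFactory are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

public class FeatureMapSketch {

    // Hypothetical helper: applies each (feature URI -> "true"/"false") entry
    // to a JDK DocumentBuilderFactory. Assumption: the real crawler may use a
    // different parser and a different way of applying the map.
    public static DocumentBuilderFactory newFactory(Map<String, String> featureMap)
            throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        for (Map.Entry<String, String> entry : featureMap.entrySet()) {
            // setFeature throws ParserConfigurationException for unsupported features.
            factory.setFeature(entry.getKey(), Boolean.parseBoolean(entry.getValue()));
        }
        return factory;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> featureMap = new LinkedHashMap<>();
        // The two feature URIs listed in Constants.java above; disabling them
        // is a common way to avoid resolving external XML entities.
        featureMap.put("http://xml.org/sax/features/external-general-entities", "false");
        featureMap.put("http://xml.org/sax/features/external-parameter-entities", "false");

        DocumentBuilderFactory factory = newFactory(featureMap);
        System.out.println(
                factory.getFeature("http://xml.org/sax/features/external-general-entities"));
    }
}
```

Keeping the values as strings mirrors the Map<String, String> signature of getFeatureMap() in the snippets; the conversion to boolean happens only at the point where the parser is configured.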
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlExtractor.java
this.htmlTagPattern = htmlTagPattern; } /** * Gets the map of parser features. * * @return the feature map */ public Map<String, String> getFeatureMap() { return featureMap; } /** * Sets the map of parser features. * * @param featureMap the feature map to set */
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 9.3K bytes - Viewed (0) -
README.md
### Key Features - **Multi-Protocol Support**: HTTP/HTTPS, File System, FTP, SMB/CIFS, Cloud Storage (MinIO, S3) - **Comprehensive Content Extraction**: Office documents, PDFs, archives, images, audio/video files
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/CrawlerClientFactory.java
* * <p>This factory is typically initialized through dependency injection and can be * configured with initialization parameters that are passed to all registered clients.</p> * * <p>Features:</p> * <ul> * <li>Pattern-based client mapping</li> * <li>Ordered client registration</li> * <li>Bulk client registration</li> * <li>Automatic client initialization</li>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 7K bytes - Viewed (0) -
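The "pattern-based client mapping" and "ordered client registration" bullets above can be illustrated with a small sketch. This is not CrawlerClientFactory's code: the class PatternMappingSketch, its methods, and the first-match-wins rule are all assumptions made for illustration, and clients are represented by plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class PatternMappingSketch {

    // LinkedHashMap preserves insertion order, so patterns are tried
    // in the order they were registered.
    private final Map<Pattern, String> clients = new LinkedHashMap<>();

    // Registers a client name (hypothetical stand-in for a client object)
    // under a URL regular expression.
    public void addClient(String regex, String clientName) {
        clients.put(Pattern.compile(regex), clientName);
    }

    // Returns the first registered client whose pattern matches the whole URL,
    // or null when nothing matches. First-match-wins is an assumption here.
    public String getClient(String url) {
        for (Map.Entry<Pattern, String> entry : clients.entrySet()) {
            if (entry.getKey().matcher(url).matches()) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PatternMappingSketch factory = new PatternMappingSketch();
        factory.addClient("https?://.*", "http-client");
        factory.addClient("file:.*", "file-client");
        System.out.println(factory.getClient("https://example.com/index.html"));
        System.out.println(factory.getClient("file:/tmp/sample.txt"));
    }
}
```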
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/FaultTolerantClient.java
* retry counts and intervals between attempts. * * <p>The client supports a RequestListener interface to monitor the request lifecycle and handle * exceptions during retries.</p> * * <p>Key features:</p> * <ul> * <li>Configurable maximum retry attempts</li> * <li>Adjustable interval between retries</li> * <li>Exception tracking and aggregation</li> * <li>Request lifecycle monitoring through listener</li>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 7.8K bytes - Viewed (0) -
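The retry behavior described above (a maximum number of attempts, an interval between them, exception aggregation, and a listener) can be sketched generically. The following is not FaultTolerantClient's implementation; RetrySketch, its Listener interface, and callWithRetry are hypothetical names, and collecting failures as suppressed exceptions is one possible way to aggregate them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class RetrySketch {

    // Hypothetical listener, loosely modeled on the "request lifecycle
    // monitoring through listener" bullet; not the library's RequestListener.
    public interface Listener {
        void onException(int attempt, Exception e);
    }

    // Runs the task up to maxAttempts times, sleeping intervalMillis between
    // failed attempts. If every attempt fails, throws one exception that
    // carries all recorded failures as suppressed exceptions.
    public static <T> T callWithRetry(Callable<T> task, int maxAttempts,
            long intervalMillis, Listener listener) throws Exception {
        List<Exception> failures = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                failures.add(e);
                listener.onException(attempt, e);
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(intervalMillis); // wait before the next attempt
                }
            }
        }
        Exception aggregate = new Exception("failed after " + maxAttempts + " attempts");
        failures.forEach(aggregate::addSuppressed);
        throw aggregate;
    }
}
```

A caller would pass the actual request as the Callable, for example `RetrySketch.callWithRetry(() -> fetch(url), 5, 500L, (n, e) -> log(n, e))`, where `fetch` and `log` are placeholders for the caller's own code.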
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/RobotsTxt.java
* * <p>The robots.txt protocol is implemented according to the standard specification, * supporting pattern matching for user agents, path-based access control, and crawl delay settings.</p> * * <p>Key features:</p> * <ul> * <li>Supports multiple user-agent directives with pattern matching</li> * <li>Handles Allow and Disallow rules for path-based access control</li> * <li>Manages crawl delay settings per user agent</li>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/PdfExtractor.java
* to prevent hanging on problematic PDF files. It also extracts metadata from the PDF * document and includes it in the extraction result. * * <p>Features: * <ul> * <li>Text extraction from PDF pages</li> * <li>Embedded document extraction</li> * <li>Annotation extraction (file attachments)</li> * <li>Metadata extraction</li>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 12.7K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/CrawlerTest.java
@Override protected void setUp() throws Exception { super.setUp(); final Map<String, String> featureMap = newHashMap(); featureMap.put("http://xml.org/sax/features/namespaces", "false"); final Map<String, String> propertyMap = newHashMap(); final Map<String, String> childUrlRuleMap = newHashMap(); childUrlRuleMap.put("//A", "href");
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sat Sep 06 04:15:37 UTC 2025 - 19.1K bytes - Viewed (0)