Results 1 - 10 of 25 for Detection (0.04 sec)
src/main/java/org/codelibs/fess/helper/LanguageHelper.java
} return getSupportedLanguage(result.getLanguage()); } /** * Returns the text to be used for language detection. * * @param text The original text. * @return The text for language detection. */ protected String getDetectText(final String text) { final String result; if (text.length() <= maxTextLength) {
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Fri Nov 28 16:29:12 UTC 2025 - 6.9K bytes - Viewed (0) -
CLAUDE.md
``` ### Helpers **RobotsTxtHelper**: RFC 9309 parsing, user-agent matching, crawl-delay, sitemaps **SitemapsHelper**: Sitemap XML parsing, index handling **MimeTypeHelper**: MIME detection via Tika **EncodingHelper**: Charset detection with BOM **UrlConvertHelper**: URL normalization --- ## Development Workflow ### Build Commands ```bash mvn clean install # Build all
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Fri Nov 28 17:31:34 UTC 2025 - 10.7K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/CsvExtractor.java
* This extractor provides better structured data extraction compared to Tika's generic text extraction. * * <p>Features: * <ul> * <li>Automatic delimiter detection (comma, tab, semicolon, pipe)</li> * <li>Header row detection and extraction</li> * <li>Column name to data value association</li> * <li>Quoted field handling</li> * <li>Column names as metadata</li>
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Dec 11 08:38:29 UTC 2025 - 12.8K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/crawler/transformer/FessTransformer.java
u = StringUtil.EMPTY; } } return u; } /** * Decodes a URL as a name using appropriate character encoding. * Handles encoding detection from parent URLs and configuration settings. * * @param url the URL to decode * @param escapePlus whether to escape plus signs before decoding
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Dec 11 09:47:03 UTC 2025 - 14.1K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java
} /** * Gets the preload size for charset detection. * * @return the preload size in bytes */ public int getPreloadSizeForCharset() { return preloadSizeForCharset; } /** * Sets the preload size for charset detection. * * @param preloadSizeForCharset the preload size in bytes to set */
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Nov 29 07:42:33 UTC 2025 - 30.5K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/smb/SmbClient.java
* It also integrates with other Fess Crawler components, such as {@link ContentLengthHelper} and * {@link MimeTypeHelper}, to handle content length checks and MIME type detection. * </p> * * <p> * The class uses JCIFS properties to configure the SMB connection. * </p> * * <p> * Usage example: * </p> * * <pre> * {@code
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Dec 11 08:38:29 UTC 2025 - 23.4K bytes - Viewed (3) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/gcs/GcsClient.java
* * <p>Features: * <ul> * <li>Automatic initialization of GCS client</li> * <li>Support for HEAD and GET operations</li> * <li>Content length validation</li> * <li>MIME type detection</li> * <li>Handling of large files through temporary file storage</li> * <li>Object metadata retrieval</li> * <li>Directory listing capabilities</li> * </ul> *
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Dec 11 08:38:29 UTC 2025 - 17.5K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/helper/ProtocolHelper.java
} /** * Checks if the given URL is a file path protocol that requires directory and permission handling. * Used for incremental crawling directory detection and file permission processing. * * @param url the URL to check * @return true if the URL uses a file path protocol (smb, smb1, file, ftp, s3, gcs) */
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Fri Dec 12 13:58:40 UTC 2025 - 12.4K bytes - Viewed (1) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/s3/S3Client.java
* * <p>Features: * <ul> * <li>Automatic initialization of AWS S3 client</li> * <li>Support for HEAD and GET operations</li> * <li>Content length validation</li> * <li>MIME type detection</li> * <li>Handling of large files through temporary file storage</li> * <li>Object metadata and tags retrieval</li> * <li>Directory listing capabilities</li> * </ul> *
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Dec 11 08:38:29 UTC 2025 - 21.4K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/smb1/SmbClient.java
* It also integrates with other Fess Crawler components, such as {@link ContentLengthHelper} and * {@link MimeTypeHelper}, to handle content length checks and MIME type detection. * </p> * * <p> * The class uses JCIFS properties to configure the SMB connection. * </p> * * <p> * Usage example: * </p> * * <pre> * {@code
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Dec 11 08:38:29 UTC 2025 - 23.3K bytes - Viewed (0)