Search Options

Results per page
Sort
Preferred Languages
Advance

Results 1 - 10 of 25 for Detection (0.04 sec)

  1. src/main/java/org/codelibs/fess/helper/LanguageHelper.java

            }
            return getSupportedLanguage(result.getLanguage());
        }
    
        /**
         * Returns the text to be used for language detection.
         *
         * @param text The original text.
         * @return The text for language detection.
         */
        protected String getDetectText(final String text) {
            final String result;
            if (text.length() <= maxTextLength) {
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Fri Nov 28 16:29:12 UTC 2025
    - 6.9K bytes
    - Viewed (0)
  2. CLAUDE.md

    ```
    
    ### Helpers
    
    **RobotsTxtHelper**: RFC 9309 parsing, user-agent matching, crawl-delay, sitemaps
    **SitemapsHelper**: Sitemap XML parsing, index handling
    **MimeTypeHelper**: MIME detection via Tika
    **EncodingHelper**: Charset detection with BOM
    **UrlConvertHelper**: URL normalization
    
    ---
    
    ## Development Workflow
    
    ### Build Commands
    
    ```bash
    mvn clean install              # Build all
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 28 17:31:34 UTC 2025
    - 10.7K bytes
    - Viewed (0)
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/CsvExtractor.java

     * This extractor provides better structured data extraction compared to Tika's generic text extraction.
     *
     * <p>Features:
     * <ul>
     *   <li>Automatic delimiter detection (comma, tab, semicolon, pipe)</li>
     *   <li>Header row detection and extraction</li>
     *   <li>Column name to data value association</li>
     *   <li>Quoted field handling</li>
     *   <li>Column names as metadata</li>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Dec 11 08:38:29 UTC 2025
    - 12.8K bytes
    - Viewed (0)
  4. src/main/java/org/codelibs/fess/crawler/transformer/FessTransformer.java

                    u = StringUtil.EMPTY;
                }
            }
            return u;
        }
    
        /**
         * Decodes a URL as a name using appropriate character encoding.
         * Handles encoding detection from parent URLs and configuration settings.
         *
         * @param url the URL to decode
         * @param escapePlus whether to escape plus signs before decoding
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Thu Dec 11 09:47:03 UTC 2025
    - 14.1K bytes
    - Viewed (0)
  5. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

        }
    
        /**
         * Gets the preload size for charset detection.
         *
         * @return the preload size in bytes
         */
        public int getPreloadSizeForCharset() {
            return preloadSizeForCharset;
        }
    
        /**
         * Sets the preload size for charset detection.
         *
         * @param preloadSizeForCharset the preload size in bytes to set
         */
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Nov 29 07:42:33 UTC 2025
    - 30.5K bytes
    - Viewed (0)
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/smb/SmbClient.java

     * It also integrates with other Fess Crawler components, such as {@link ContentLengthHelper} and
     * {@link MimeTypeHelper}, to handle content length checks and MIME type detection.
     * </p>
     *
     * <p>
     * The class uses JCIFS properties to configure the SMB connection.
     * </p>
     *
     * <p>
     * Usage example:
     * </p>
     *
     * <pre>
     * {@code
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Dec 11 08:38:29 UTC 2025
    - 23.4K bytes
    - Viewed (3)
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/gcs/GcsClient.java

     *
     * <p>Features:
     * <ul>
     *   <li>Automatic initialization of GCS client</li>
     *   <li>Support for HEAD and GET operations</li>
     *   <li>Content length validation</li>
     *   <li>MIME type detection</li>
     *   <li>Handling of large files through temporary file storage</li>
     *   <li>Object metadata retrieval</li>
     *   <li>Directory listing capabilities</li>
     * </ul>
     *
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Dec 11 08:38:29 UTC 2025
    - 17.5K bytes
    - Viewed (0)
  8. src/main/java/org/codelibs/fess/helper/ProtocolHelper.java

        }
    
        /**
         * Checks if the given URL is a file path protocol that requires directory and permission handling.
         * Used for incremental crawling directory detection and file permission processing.
         *
         * @param url the URL to check
         * @return true if the URL uses a file path protocol (smb, smb1, file, ftp, s3, gcs)
         */
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Fri Dec 12 13:58:40 UTC 2025
    - 12.4K bytes
    - Viewed (1)
  9. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/s3/S3Client.java

     *
     * <p>Features:
     * <ul>
     *   <li>Automatic initialization of AWS S3 client</li>
     *   <li>Support for HEAD and GET operations</li>
     *   <li>Content length validation</li>
     *   <li>MIME type detection</li>
     *   <li>Handling of large files through temporary file storage</li>
     *   <li>Object metadata and tags retrieval</li>
     *   <li>Directory listing capabilities</li>
     * </ul>
     *
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Dec 11 08:38:29 UTC 2025
    - 21.4K bytes
    - Viewed (0)
  10. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/smb1/SmbClient.java

     * It also integrates with other Fess Crawler components, such as {@link ContentLengthHelper} and
     * {@link MimeTypeHelper}, to handle content length checks and MIME type detection.
     * </p>
     *
     * <p>
     * The class uses JCIFS properties to configure the SMB connection.
     * </p>
     *
     * <p>
     * Usage example:
     * </p>
     *
     * <pre>
     * {@code
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Dec 11 08:38:29 UTC 2025
    - 23.3K bytes
    - Viewed (0)
Back to top