Results 1 - 10 of 35 for extraction (0.07 sec)

  1. README.md

    ## Overview
    
    **Fess Crawler** is a powerful, flexible Java-based web crawling framework designed for enterprise-scale content extraction and processing. Built with a modular architecture, it supports multiple protocols (HTTP/HTTPS, File System, FTP, SMB, Cloud Storage) and provides extensive content extraction capabilities from various document formats.
    
    ### Key Features
    
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Aug 31 05:32:52 UTC 2025
    - 15.3K bytes
    - Viewed (0)
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractor.java

     *   <li>Handling resource names and content types</li>
     *   <li>Retrying extraction without resource name or content type if the initial attempt fails</li>
     *   <li>Extracting text from metadata if the main content extraction fails</li>
     *   <li>Reading content as plain text if all other methods fail</li>
     *   <li>Applying post-extraction filters</li>
     *   <li>Handling Tika exceptions, including zip bomb exceptions</li>
     * </ul>
     *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 30.7K bytes
    - Viewed (0)
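
    The fallback order listed in the TikaExtractor Javadoc above (retry without hints, then metadata, then plain text) can be pictured as a chain of attempts in which the first non-empty result wins. The following is a minimal sketch of that idea only, not TikaExtractor's actual code: the class `FallbackExtractionSketch`, the method `extractWithFallback`, and the three stand-in strategies are hypothetical, and no Tika API is used.

    ```java
    import java.util.Arrays;
    import java.util.List;
    import java.util.Optional;
    import java.util.function.Supplier;

    // Hypothetical illustration; names are assumptions, not part of fess-crawler.
    public class FallbackExtractionSketch {

        /** Runs each strategy in order and returns the first non-blank result. */
        public static Optional<String> extractWithFallback(final List<Supplier<String>> strategies) {
            for (final Supplier<String> strategy : strategies) {
                try {
                    final String text = strategy.get();
                    if (text != null && !text.trim().isEmpty()) {
                        return Optional.of(text);
                    }
                } catch (final RuntimeException e) {
                    // This attempt failed; fall through to the next strategy.
                }
            }
            return Optional.empty();
        }

        public static void main(final String[] args) {
            // Stand-ins for: parse with hints, retry without hints, metadata fallback.
            final Supplier<String> withHints = () -> {
                throw new IllegalStateException("parse with resource name failed");
            };
            final Supplier<String> withoutHints = () -> "";
            final Supplier<String> fromMetadata = () -> "title: report";

            final List<Supplier<String>> strategies = Arrays.asList(withHints, withoutHints, fromMetadata);
            System.out.println(extractWithFallback(strategies).orElse("(nothing extracted)"));
            // prints "title: report"
        }
    }
    ```

    Keeping the attempts in an ordered list makes the priority explicit and lets a new fallback be added without touching the loop.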
  3. src/test/java/jcifs/smb1/smb1/SmbFileTest.java

                // Test file name extraction
                assertEquals("file.txt", new SmbFile("smb1://server/share/file.txt").getName());
                // Test directory name extraction (should include trailing slash)
                assertEquals("dir/", new SmbFile("smb1://server/share/dir/").getName());
                // Test share name extraction
                assertEquals("share/", new SmbFile("smb1://server/share/").getName());
    Registered: Sun Sep 07 00:10:21 UTC 2025
    - Last Modified: Thu Aug 14 05:31:44 UTC 2025
    - 8.5K bytes
    - Viewed (0)
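
    The three assertions in the SmbFileTest snippet above all follow one rule: the name is the last path segment, and a trailing slash is kept for directories and shares. A minimal string-only sketch of that rule is shown here; `LastSegmentSketch.lastSegment` is a hypothetical helper that mirrors only these three cases and is not how jcifs implements `SmbFile.getName()`.

    ```java
    // Hypothetical illustration; not part of jcifs.
    public class LastSegmentSketch {

        /** Returns the last path segment of the URL, keeping a trailing slash if present. */
        public static String lastSegment(final String url) {
            // Skip a trailing slash when searching for the separator before the last segment.
            final int end = url.endsWith("/") ? url.length() - 1 : url.length();
            final int start = url.lastIndexOf('/', end - 1);
            return url.substring(start + 1);
        }

        public static void main(final String[] args) {
            System.out.println(lastSegment("smb1://server/share/file.txt")); // prints "file.txt"
            System.out.println(lastSegment("smb1://server/share/dir/"));     // prints "dir/"
            System.out.println(lastSegment("smb1://server/share/"));         // prints "share/"
        }
    }
    ```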
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/PasswordBasedExtractor.java

     *
     * <p>The extractor supports two types of password management:
     * <ul>
     *   <li>Static passwords configured via {@link #addPassword(String, String)}</li>
     *   <li>Dynamic passwords provided through extraction parameters</li>
     * </ul>
     *
     * <p>Passwords are matched against URLs or resource names using regular expression patterns.
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 5.1K bytes
    - Viewed (0)
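
    The PasswordBasedExtractor Javadoc above says that passwords are matched against URLs or resource names using regular expression patterns. One way to picture the static case is an ordered table of compiled patterns that is scanned until one matches. This is a sketch under assumptions: the class `PasswordLookupSketch`, the method `findPassword`, the `(regex, password)` argument order, and the use of a full match rather than a partial one are all hypothetical and not taken from the fess-crawler source.

    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.Optional;
    import java.util.regex.Pattern;

    // Hypothetical illustration; names and matching semantics are assumptions.
    public class PasswordLookupSketch {

        // Insertion order is preserved, so earlier registrations take priority.
        private final Map<Pattern, String> passwords = new LinkedHashMap<>();

        /** Registers a password for every URL or resource name matching the regex. */
        public void addPassword(final String regex, final String password) {
            passwords.put(Pattern.compile(regex), password);
        }

        /** Returns the password of the first pattern that fully matches the input. */
        public Optional<String> findPassword(final String urlOrResourceName) {
            for (final Map.Entry<Pattern, String> entry : passwords.entrySet()) {
                if (entry.getKey().matcher(urlOrResourceName).matches()) {
                    return Optional.of(entry.getValue());
                }
            }
            return Optional.empty();
        }

        public static void main(final String[] args) {
            final PasswordLookupSketch lookup = new PasswordLookupSketch();
            lookup.addPassword(".*\\.zip", "zip-secret");
            lookup.addPassword("https://example\\.com/private/.*", "site-secret");

            System.out.println(lookup.findPassword("https://example.com/private/report.pdf").orElse("(none)"));
            // prints "site-secret"
            System.out.println(lookup.findPassword("notes.txt").orElse("(none)"));
            // prints "(none)"
        }
    }
    ```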
  5. src/main/java/org/codelibs/fess/crawler/transformer/FessXpathTransformer.java

            }
            return new URL(currentUrl);
        }
    
        /**
         * Gets child URL extraction rules from configuration.
         *
         * @param responseData the response data from crawling
         * @param resultData the result data
         * @return stream of tag-attribute pairs for URL extraction
         */
        @Override
    Registered: Thu Sep 04 12:52:25 UTC 2025
    - Last Modified: Thu Aug 07 03:06:29 UTC 2025
    - 54.4K bytes
    - Viewed (0)
  6. src/main/java/org/codelibs/fess/helper/DocumentHelper.java

    import org.codelibs.fess.crawler.exception.CrawlerSystemException;
    import org.codelibs.fess.crawler.exception.CrawlingAccessException;
    import org.codelibs.fess.crawler.extractor.Extractor;
    import org.codelibs.fess.crawler.extractor.impl.TikaExtractor;
    import org.codelibs.fess.crawler.processor.ResponseProcessor;
    import org.codelibs.fess.crawler.processor.impl.DefaultResponseProcessor;
    import org.codelibs.fess.crawler.rule.Rule;
    Registered: Thu Sep 04 12:52:25 UTC 2025
    - Last Modified: Thu Aug 07 03:06:29 UTC 2025
    - 17.2K bytes
    - Viewed (0)
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/ApiExtractor.java

         *
         * @param in the input stream to extract text from
         * @param params additional parameters
         * @return the extracted data
         * @throws ExtractException if extraction fails
         */
        @Override
        public ExtractData getText(final InputStream in, final Map<String, String> params) {
            if (logger.isDebugEnabled()) {
                logger.debug("Accessing {}", url);
            }
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 12.2K bytes
    - Viewed (0)
  8. src/main/java/org/codelibs/fess/crawler/transformer/AbstractFessFileTransformer.java

        /**
         * Get the extracted data.
         * @param extractor The extractor.
         * @param in The input stream.
         * @param params The parameters.
         * @return The extracted data.
         */
        protected ExtractData getExtractData(final Extractor extractor, final InputStream in, final Map<String, String> params) {
            try {
                return extractor.getText(in, params);
            } catch (final RuntimeException e) {
    Registered: Thu Sep 04 12:52:25 UTC 2025
    - Last Modified: Thu Aug 07 03:06:29 UTC 2025
    - 25.6K bytes
    - Viewed (0)
  9. src/test/java/jcifs/smb1/dcerpc/DcerpcMessageTest.java

    import org.junit.jupiter.api.Test;
    
    import jcifs.smb1.dcerpc.ndr.NdrBuffer;
    import jcifs.smb1.dcerpc.ndr.NdrException;
    
    /**
     * Unit tests for {@link DcerpcMessage}. The tests exercise flag handling,
     * result extraction, header encoding/decoding, and the round-trip of an
     * encode/decode operation.
     */
    public class DcerpcMessageTest {
    
        /**
         * A trivial concrete subclass used for testing. It simply writes a
    Registered: Sun Sep 07 00:10:21 UTC 2025
    - Last Modified: Thu Aug 14 07:14:38 UTC 2025
    - 7K bytes
    - Viewed (0)
  10. okhttp/src/commonJvmAndroid/kotlin/okhttp3/internal/platform/Platform.kt

     *
     * Supported on Android 5.0+.
     *
     * Supported on OpenJDK 8 via the JettyALPN-boot library or Conscrypt.
     *
     * Supported on OpenJDK 9+ via SSLParameters and SSLSocket features.
     *
     * ### Trust Manager Extraction
     *
     * Supported on Android 2.3+ and OpenJDK 7+. There are no public APIs to recover the trust
     * manager that was used to create an [SSLSocketFactory].
     *
     * Not supported by choice on JDK9+ due to access checks.
     *
    Registered: Fri Sep 05 11:42:10 UTC 2025
    - Last Modified: Mon Jul 28 07:33:49 UTC 2025
    - 8.1K bytes
    - Viewed (0)