Results 1 - 10 of 35 for extraction (0.07 sec)
README.md
## Overview **Fess Crawler** is a powerful, flexible Java-based web crawling framework designed for enterprise-scale content extraction and processing. Built with a modular architecture, it supports multiple protocols (HTTP/HTTPS, File System, FTP, SMB, Cloud Storage) and provides extensive content extraction capabilities from various document formats. ### Key Features
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractor.java
* <li>Handling resource names and content types</li> * <li>Retrying extraction without resource name or content type if the initial attempt fails</li> * <li>Extracting text from metadata if the main content extraction fails</li> * <li>Reading content as plain text if all other methods fail</li> * <li>Applying post-extraction filters</li> * <li>Handling Tika exceptions, including zip bomb exceptions</li> * </ul> *
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 30.7K bytes - Viewed (0) -
src/test/java/jcifs/smb1/smb1/SmbFileTest.java
// Test file name extraction assertEquals("file.txt", new SmbFile("smb1://server/share/file.txt").getName()); // Test directory name extraction (should include trailing slash) assertEquals("dir/", new SmbFile("smb1://server/share/dir/").getName()); // Test share name extraction assertEquals("share/", new SmbFile("smb1://server/share/").getName());
Registered: Sun Sep 07 00:10:21 UTC 2025 - Last Modified: Thu Aug 14 05:31:44 UTC 2025 - 8.5K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/PasswordBasedExtractor.java
* * <p>The extractor supports two types of password management: * <ul> * <li>Static passwords configured via {@link #addPassword(String, String)}</li> * <li>Dynamic passwords provided through extraction parameters</li> * </ul> * * <p>Passwords are matched against URLs or resource names using regular expression patterns.
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 5.1K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/crawler/transformer/FessXpathTransformer.java
} return new URL(currentUrl); } /** * Gets child URL extraction rules from configuration. * * @param responseData the response data from crawling * @param resultData the result data * @return stream of tag-attribute pairs for URL extraction */ @Override
Registered: Thu Sep 04 12:52:25 UTC 2025 - Last Modified: Thu Aug 07 03:06:29 UTC 2025 - 54.4K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/helper/DocumentHelper.java
import org.codelibs.fess.crawler.exception.CrawlerSystemException; import org.codelibs.fess.crawler.exception.CrawlingAccessException; import org.codelibs.fess.crawler.extractor.Extractor; import org.codelibs.fess.crawler.extractor.impl.TikaExtractor; import org.codelibs.fess.crawler.processor.ResponseProcessor; import org.codelibs.fess.crawler.processor.impl.DefaultResponseProcessor; import org.codelibs.fess.crawler.rule.Rule;
Registered: Thu Sep 04 12:52:25 UTC 2025 - Last Modified: Thu Aug 07 03:06:29 UTC 2025 - 17.2K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/ApiExtractor.java
* * @param in the input stream to extract text from * @param params additional parameters * @return the extracted data * @throws ExtractException if extraction fails */ @Override public ExtractData getText(final InputStream in, final Map<String, String> params) { if (logger.isDebugEnabled()) { logger.debug("Accessing {}", url); }
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 12.2K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/crawler/transformer/AbstractFessFileTransformer.java
/** * Get the extracted data. * @param extractor The extractor. * @param in The input stream. * @param params The parameters. * @return The extracted data. */ protected ExtractData getExtractData(final Extractor extractor, final InputStream in, final Map<String, String> params) { try { return extractor.getText(in, params); } catch (final RuntimeException e) {
Registered: Thu Sep 04 12:52:25 UTC 2025 - Last Modified: Thu Aug 07 03:06:29 UTC 2025 - 25.6K bytes - Viewed (0) -
src/test/java/jcifs/smb1/dcerpc/DcerpcMessageTest.java
import org.junit.jupiter.api.Test; import jcifs.smb1.dcerpc.ndr.NdrBuffer; import jcifs.smb1.dcerpc.ndr.NdrException; /** * Unit tests for {@link DcerpcMessage}. The tests exercise flag handling, * result extraction, header encoding/decoding, and the round-trip of an * encode/decode operation. */ public class DcerpcMessageTest { /** * A trivial concrete subclass used for testing. It simply writes a
Registered: Sun Sep 07 00:10:21 UTC 2025 - Last Modified: Thu Aug 14 07:14:38 UTC 2025 - 7K bytes - Viewed (0) -
okhttp/src/commonJvmAndroid/kotlin/okhttp3/internal/platform/Platform.kt
* * Supported on Android 5.0+. * * Supported on OpenJDK 8 via the JettyALPN-boot library or Conscrypt. * * Supported on OpenJDK 9+ via SSLParameters and SSLSocket features. * * ### Trust Manager Extraction * * Supported on Android 2.3+ and OpenJDK 7+. There are no public APIs to recover the trust * manager that was used to create an [SSLSocketFactory]. * * Not supported by choice on JDK9+ due to access checks. *
Registered: Fri Sep 05 11:42:10 UTC 2025 - Last Modified: Mon Jul 28 07:33:49 UTC 2025 - 8.1K bytes - Viewed (0)