Results 1 - 10 of 36 for extractors (0.1 seconds)
CLAUDE.md
**ResponseProcessor**: `DefaultResponseProcessor`, `SitemapsResponseProcessor`, `NullResponseProcessor`
**Transformer**: `HtmlTransformer`, `XmlTransformer`, `FileTransformer`, etc.
**Extractor**: Weight-based selection (tries in descending weight order)
### Key Extractors
`TikaExtractor` (1000+ formats), `PdfExtractor`, `MsWordExtractor`, `MsExcelExtractor`, `MsPowerPointExtractor`, `ZipExtractor`, `HtmlExtractor`, etc.
**Registration**:
Created: Sat Dec 20 11:21:39 GMT 2025 - Last Modified: Fri Nov 28 17:31:34 GMT 2025 - 10.7K bytes - Click Count (0) -
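The CLAUDE.md snippet above mentions weight-based extractor selection that tries candidates in descending weight order. The following is a minimal sketch of that idea only, not the fess-crawler implementation: `WeightedSelectionSketch`, `Candidate`, and `firstSuccess` are hypothetical names, and it assumes that a candidate signals "cannot handle this input" by throwing a `RuntimeException`, which the snippet does not confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical illustration; none of these names come from fess-crawler.
class WeightedSelectionSketch {

    /** Stand-in for an extractor: a name, a weight, and an extraction function that may fail. */
    static final class Candidate {
        final String name;
        final int weight;
        final Function<String, String> extract;

        Candidate(String name, int weight, Function<String, String> extract) {
            this.name = name;
            this.weight = weight;
            this.extract = extract;
        }
    }

    /**
     * Tries candidates from highest to lowest weight and returns the first
     * successful result, prefixed with the name of the candidate that produced it.
     */
    static Optional<String> firstSuccess(List<Candidate> candidates, String input) {
        List<Candidate> ordered = new ArrayList<>(candidates);
        // Sort a copy so the caller's list is untouched; higher weight comes first.
        ordered.sort((a, b) -> Integer.compare(b.weight, a.weight));
        for (Candidate c : ordered) {
            try {
                return Optional.of(c.name + ":" + c.extract.apply(input));
            } catch (RuntimeException e) {
                // Assumption: a failure means "cannot handle"; move on to the next weight.
            }
        }
        return Optional.empty();
    }
}
```

With candidates weighted 10, 5, and 1, where the weight-10 candidate throws, `firstSuccess` returns the result of the weight-5 candidate; an empty list yields `Optional.empty()`.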
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/ZipExtractor.java
final Extractor extractor = extractorFactory.getExtractor(mimeType); if (extractor != null) { try { final Map<String, String> map = new HashMap<>(); map.put(ExtractData.RESOURCE_NAME_KEY, filename); buf.append(extractor.getText(new IgnoreCloseInputStream(ais), map).getContent());
Created: Sat Dec 20 11:21:39 GMT 2025 - Last Modified: Thu Dec 11 08:38:29 GMT 2025 - 4.8K bytes - Click Count (0) -
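The ZipExtractor snippet above hands each archive entry to a delegate through an `IgnoreCloseInputStream`, presumably so that the delegate cannot close the shared archive stream. The sketch below illustrates that pattern using only the JDK's `java.util.zip` classes rather than the library's own stream types; `ArchiveTextSketch`, `NonClosingInputStream`, `readAndClose`, and `zipOf` are hypothetical names introduced for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration built on java.util.zip; not the fess-crawler code.
class ArchiveTextSketch {

    /** Wrapper whose close() does nothing, so a delegate cannot close the archive stream. */
    static final class NonClosingInputStream extends FilterInputStream {
        NonClosingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // Intentionally empty: the archive stream must stay open for the next entry.
        }
    }

    /** Stand-in for a per-entry extractor that closes the stream it is given. */
    static String readAndClose(InputStream in) throws IOException {
        try (InputStream s = in) {
            return new String(s.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Concatenates "name=content" lines for every non-directory entry in a ZIP stream. */
    static String extractAll(InputStream archive) {
        StringBuilder buf = new StringBuilder();
        try (ZipInputStream zis = new ZipInputStream(archive)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                String text = readAndClose(new NonClosingInputStream(zis));
                buf.append(entry.getName()).append('=').append(text).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toString();
    }

    /** Builds an in-memory ZIP from {name, content} pairs, for demonstration only. */
    static byte[] zipOf(String[][] entries) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (String[] e : entries) {
                zos.putNextEntry(new ZipEntry(e[0]));
                zos.write(e[1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }
}
```

Without the non-closing wrapper, the first call to `readAndClose` would close the `ZipInputStream` itself and the next `getNextEntry()` call would fail.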
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TarExtractor.java
final Extractor extractor = extractorFactory.getExtractor(mimeType); if (extractor != null) { try { final Map<String, String> map = new HashMap<>(); map.put(ExtractData.RESOURCE_NAME_KEY, filename); buf.append(extractor.getText(new IgnoreCloseInputStream(ais), map).getContent());
Created: Sat Dec 20 11:21:39 GMT 2025 - Last Modified: Thu Dec 11 08:38:29 GMT 2025 - 5.1K bytes - Click Count (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TextExtractor.java
*/ package org.codelibs.fess.crawler.extractor.impl; import java.io.InputStream; import java.util.Map; import org.codelibs.core.io.InputStreamUtil; import org.codelibs.fess.crawler.Constants; import org.codelibs.fess.crawler.entity.ExtractData; import org.codelibs.fess.crawler.exception.ExtractException; /** * Extracts text content from an input stream as plain text. */
Created: Sat Dec 20 11:21:39 GMT 2025 - Last Modified: Thu Dec 11 08:38:29 GMT 2025 - 2K bytes - Click Count (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/CsvExtractor.java
import org.codelibs.fess.crawler.Constants; import org.codelibs.fess.crawler.entity.ExtractData; import org.codelibs.fess.crawler.exception.ExtractException; /** * Extracts text content and metadata from CSV files. * This extractor provides better structured data extraction compared to Tika's generic text extraction. * * <p>Features: * <ul> * <li>Automatic delimiter detection (comma, tab, semicolon, pipe)</li>
Created: Sat Dec 20 11:21:39 GMT 2025 - Last Modified: Thu Dec 11 08:38:29 GMT 2025 - 12.8K bytes - Click Count (0) -
src/main/java/org/codelibs/fess/crawler/transformer/FessFileTransformer.java
throw new FessSystemException("Could not find extractorFactory."); } final Extractor extractor = extractorFactory.getExtractor(responseData.getMimeType()); if (logger.isDebugEnabled()) { logger.debug("url={}, extractor={}", responseData.getUrl(), extractor); } return extractor; }
Created: Sat Dec 20 09:19:18 GMT 2025 - Last Modified: Fri Nov 28 16:29:12 GMT 2025 - 3.5K bytes - Click Count (0) -
src/main/java/org/codelibs/fess/crawler/transformer/AbstractFessFileTransformer.java
/** * Get the extracted data. * @param extractor The extractor. * @param in The input stream. * @param params The parameters. * @return The extracted data. */ protected ExtractData getExtractData(final Extractor extractor, final InputStream in, final Map<String, String> params) { try { return extractor.getText(in, params); } catch (final RuntimeException e) {
Created: Sat Dec 20 09:19:18 GMT 2025 - Last Modified: Fri Nov 28 16:29:12 GMT 2025 - 25.7K bytes - Click Count (0) -
src/main/java/org/codelibs/fess/crawler/transformer/FessStandardTransformer.java
} /** * Gets the appropriate extractor for the given response data. * Selects an extractor based on the MIME type or falls back to the Tika extractor. * * @param responseData the response data containing the document to extract * @return the extractor instance for processing the document * @throws FessSystemException if no suitable extractor can be found */ @Override
Created: Sat Dec 20 09:19:18 GMT 2025 - Last Modified: Fri Nov 28 16:29:12 GMT 2025 - 3.8K bytes - Click Count (0) -
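The Javadoc in the FessStandardTransformer snippet describes a three-step lookup: select by MIME type, fall back to a default extractor, and throw if nothing is available. A minimal sketch of that control flow follows, with extractors reduced to plain names; `MimeLookupSketch`, `register`, and `select` are hypothetical, and `IllegalStateException` stands in for `FessSystemException`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of lookup-with-fallback; not the Fess implementation.
class MimeLookupSketch {
    private final Map<String, String> byMimeType = new HashMap<>();
    private final String fallback; // may be null when no default is configured

    MimeLookupSketch(String fallback) {
        this.fallback = fallback;
    }

    /** Associates a MIME type with the name of the extractor that should handle it. */
    void register(String mimeType, String extractorName) {
        byMimeType.put(mimeType, extractorName);
    }

    /** Returns the registered extractor name, else the fallback, else throws. */
    String select(String mimeType) {
        String found = byMimeType.get(mimeType);
        if (found != null) {
            return found;
        }
        if (fallback != null) {
            return fallback;
        }
        // Stand-in for the FessSystemException mentioned in the Javadoc.
        throw new IllegalStateException("No extractor available for " + mimeType);
    }
}
```

For instance, after `register("text/csv", "csvExtractor")` on an instance created with fallback `"tikaExtractor"`, `select("text/csv")` returns `"csvExtractor"` and `select("application/pdf")` returns `"tikaExtractor"`.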
src/main/java/org/codelibs/fess/helper/DocumentHelper.java
import org.codelibs.fess.crawler.exception.CrawlerSystemException; import org.codelibs.fess.crawler.exception.CrawlingAccessException; import org.codelibs.fess.crawler.extractor.Extractor; import org.codelibs.fess.crawler.extractor.impl.TikaExtractor; import org.codelibs.fess.crawler.processor.ResponseProcessor; import org.codelibs.fess.crawler.processor.impl.DefaultResponseProcessor; import org.codelibs.fess.crawler.rule.Rule;
Created: Sat Dec 20 09:19:18 GMT 2025 - Last Modified: Fri Nov 28 16:29:12 GMT 2025 - 17.4K bytes - Click Count (0) -
fastapi/security/api_key.py
""" API key authentication using a query parameter. This defines the name of the query parameter that should be provided in the request with the API key and integrates that into the OpenAPI documentation. It extracts the key value sent in the query parameter automatically and provides it as the dependency result. But it doesn't define how to send that API key to the client. ## Usage
Created: Sun Dec 28 07:19:09 GMT 2025 - Last Modified: Wed Dec 17 21:25:59 GMT 2025 - 9.6K bytes - Click Count (1)