Results 1 - 10 of 68 for Pdf (0.05 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/PdfExtractor.java
/**
 * PdfExtractor extracts text content from PDF files using Apache PDFBox.
 * It supports password-protected PDFs and can extract embedded documents and annotations.
 *
 * <p>The extractor runs text extraction in a separate thread with a configurable timeout
 * to prevent hanging on problematic PDF files. It also extracts metadata from the PDF
 * document and includes it in the extraction result.
 *
 * <p>Features:
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Nov 23 12:19:14 UTC 2025 - 12.8K bytes - Viewed (0) -
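The Javadoc above mentions running text extraction in a separate thread with a configurable timeout. The following is a minimal sketch of that general pattern using only `java.util.concurrent`; it is not taken from `PdfExtractor`, and the class name `TimedExtraction` and the method `runWithTimeout` are hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; not the actual PdfExtractor implementation.
public class TimedExtraction {

    // Runs the task on a worker thread and waits at most timeoutMillis for its result.
    public static String runWithTimeout(final Callable<String> task, final long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        final ExecutorService executor = Executors.newSingleThreadExecutor();
        final Future<String> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (final TimeoutException e) {
            // Ask the worker to stop; it only stops promptly if it reacts to interruption.
            future.cancel(true);
            throw e;
        } finally {
            // Release the worker thread whether the task finished, failed, or timed out.
            executor.shutdownNow();
        }
    }
}
```

A caller would wrap the slow extraction step in the `Callable` and treat `TimeoutException` as an extraction failure. Note that cancellation relies on the task responding to interruption, so a task stuck in non-interruptible work may keep running in the background.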
fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/PdfExtractorTest.java
url = "http://test.com/hoge1.pdf"; resourceName = null; params.put(ExtractData.URL, url); params.put(ExtractData.RESOURCE_NAME_KEY, resourceName); assertNull(pdfExtractor.getPassword(params)); url = "http://test.com/hoge1.pdf"; resourceName = "hoge2.pdf"; params.put(ExtractData.URL, url);
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Mar 15 06:52:00 UTC 2025 - 7.6K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractorTest.java
url = "http://test.com/hoge1.pdf"; resourceName = null; assertNull(tikaExtractor.getPassword(createParams(url, resourceName))); url = "http://test.com/hoge1.pdf"; resourceName = "hoge2.pdf"; assertNull(tikaExtractor.getPassword(createParams(url, resourceName))); url = null; resourceName = "hoge2.pdf";
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 30.6K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/entity/ExtractDataTest.java
data.putValue(ExtractData.RESOURCE_NAME_KEY, "test.pdf"); data.putValue(ExtractData.URL, "https://example.com/test.pdf"); data.putValues(ExtractData.FILE_PASSWORDS, new String[] { "pass1", "pass2" }); assertEquals("test.pdf", data.getValues(ExtractData.RESOURCE_NAME_KEY)[0]); assertEquals("https://example.com/test.pdf", data.getValues(ExtractData.URL)[0]);
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Mon Nov 24 03:59:47 UTC 2025 - 9.9K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/helper/impl/MimeTypeHelperImplTest.java
"hoge.pptx"); assertContentType("image/jpeg", null, "hoge.jpg"); assertContentType("image/gif", null, "hoge.gif"); assertContentType("application/pdf", "extractor/test.pdf", "hoge.pdf"); assertContentType("application/gzip", "extractor/gz/test.tar.gz", "hoge.tar.gz"); assertContentType("application/zip", "extractor/zip/test.zip", "hoge.zip");
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Mar 15 06:52:00 UTC 2025 - 11.6K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/FilenameExtractorEnhancedTest.java
final Map<String, String> params = new HashMap<>(); params.put(ExtractData.RESOURCE_NAME_KEY, "test-document.pdf"); final ExtractData result = filenameExtractor.getText(in, params); assertNotNull(result); assertEquals("test-document.pdf", result.getContent()); } /** * Test extraction with null parameters map. */
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Mon Nov 24 03:59:47 UTC 2025 - 7K bytes - Viewed (0) -
src/test/java/jcifs/smb1/util/MimeMapTest.java
void testCaseInsensitiveExtensions() throws IOException { assertEquals("application/pdf", mimeMap.getMimeType("PDF")); assertEquals("application/pdf", mimeMap.getMimeType("Pdf")); assertEquals("application/pdf", mimeMap.getMimeType("pDf")); assertEquals("text/html", mimeMap.getMimeType("HTML")); assertEquals("text/html", mimeMap.getMimeType("HtMl"));
Registered: Sat Dec 20 13:44:44 UTC 2025 - Last Modified: Thu Aug 14 05:31:44 UTC 2025 - 9.1K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/helper/RobotsTxtHelperTest.java
} // Test WildcardBot - wildcard patterns // Disallow: /*.pdf$ - should block .pdf files but not .pdf with query params assertFalse(robotsTxt.allows("/document.pdf", "WildcardBot")); assertFalse(robotsTxt.allows("/files/report.pdf", "WildcardBot")); assertTrue(robotsTxt.allows("/document.pdf?download=true", "WildcardBot")); // $ means exact end
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Mon Nov 24 03:59:47 UTC 2025 - 20.6K bytes - Viewed (0) -
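The test snippet above relies on robots.txt wildcard semantics: `*` matches any run of characters and a trailing `$` anchors the pattern to the end of the path. One way to sketch that matching rule is to translate the pattern into a regular expression. This is an illustrative assumption about how such matching can be done, not the logic used by the `RobotsTxt` class in fess-crawler; the class `RobotsPatternSketch` and its methods are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of robots.txt wildcard matching; not the fess-crawler code.
public class RobotsPatternSketch {

    // Converts a robots.txt path pattern into a regex that must match the whole path.
    public static Pattern toRegex(final String rule) {
        final boolean anchored = rule.endsWith("$");
        final String body = anchored ? rule.substring(0, rule.length() - 1) : rule;
        final StringBuilder regex = new StringBuilder();
        // Split on '*' and keep empty segments so leading and trailing wildcards survive.
        final String[] parts = body.split("\\*", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*"); // each '*' matches any run of characters
            }
            regex.append(Pattern.quote(parts[i])); // literal segments are escaped
        }
        if (!anchored) {
            regex.append(".*"); // without '$' the rule is a prefix match
        }
        return Pattern.compile(regex.toString());
    }

    // Returns true when the path is covered by the given Disallow pattern.
    public static boolean isDisallowed(final String path, final String rule) {
        return toRegex(rule).matcher(path).matches();
    }
}
```

Under this sketch, `isDisallowed("/document.pdf", "/*.pdf$")` is true, while `isDisallowed("/document.pdf?download=true", "/*.pdf$")` is false because the path no longer ends in `.pdf`.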
src/main/resources/fess_thumbnail.xml
<property name="commandList"> ["${path}/generate-thumbnail", "pdf", "${url}", "${outputFile}"] </property> <property name="generatorList"> ["${path}/generate-thumbnail"] </property> <postConstruct name="addCondition"> <arg>"mimetype"</arg> <arg>"application/pdf" </arg> </postConstruct> <postConstruct name="register"></postConstruct> </component>
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Dec 04 08:02:36 UTC 2025 - 6K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/ExtractorBuilder.java
 * </p>
 *
 * <p>
 * Example usage:
 * </p>
 *
 * <pre>
 * {@code
 * try (InputStream in = new FileInputStream("example.pdf")) {
 *     ExtractData extractData = new ExtractorBuilder(crawlerContainer, in, new HashMap<>())
 *         .mimeType("application/pdf")
 *         .filename("example.pdf")
 *         .maxContentLength(1024 * 1024)
 *         .extract();
 *
 *     String content = extractData.getContent();
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10.1K bytes - Viewed (0)