Results 1 - 9 of 9 for msoffice (0.58 sec)

  1. fess-crawler/src/test/java/org/codelibs/fess/crawler/helper/impl/MimeTypeHelperImplTest.java

            assertContentType("application/vnd.ms-powerpoint", "extractor/msoffice/test.ppt", "h&oge.ppt");
            assertContentType("application/vnd.ms-powerpoint", "extractor/msoffice/test.ppt", "h?oge.ppt");
            assertContentType("application/vnd.ms-powerpoint", "extractor/msoffice/test.ppt", "******@****.***");
            assertContentType("application/vnd.ms-powerpoint", "extractor/msoffice/test.ppt", "h:oge.ppt");
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Mar 15 06:52:00 UTC 2025
    - 11.6K bytes
    - Viewed (0)
  2. fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/ExtractorResourceManagementTest.java

            final AtomicBoolean streamClosed = new AtomicBoolean(false);
    
            try (final InputStream originalStream = ResourceUtil.getResourceAsStream("extractor/msoffice/test.doc")) {
                final InputStream trackableStream = createTrackableInputStream(originalStream, streamClosed);
                final ExtractData result = extractor.getText(trackableStream, null);
    
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Mon Nov 24 03:59:47 UTC 2025
    - 10.4K bytes
    - Viewed (0)
  3. fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractorTest.java

            logger.info(content);
            assertTrue(content.contains("テスト"));
        }
    
        public void test_getTika_msword() {
            final InputStream in = ResourceUtil.getResourceAsStream("extractor/msoffice/test.doc");
            final ExtractData extractData = tikaExtractor.getText(in, null);
            final String content = extractData.getContent();
            CloseableUtil.closeQuietly(in);
            logger.info(content);
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 30.6K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/JodExtractor.java

    import org.codelibs.fess.crawler.exception.CrawlerSystemException;
    import org.codelibs.fess.crawler.exception.ExtractException;
    import org.codelibs.fess.crawler.extractor.Extractor;
    import org.jodconverter.core.office.OfficeException;
    import org.jodconverter.core.office.OfficeManager;
    import org.jodconverter.local.LocalConverter;
    
    import jakarta.annotation.PostConstruct;
    import jakarta.annotation.PreDestroy;
    
    /**
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Nov 23 12:19:14 UTC 2025
    - 10.4K bytes
    - Viewed (0)
  5. src/main/resources/fess_label_en.properties

    labels.user_pager=Pager
    labels.pager=Pager
    labels.user_street=Street
    labels.street=Street
    labels.user_postalCode=Postal Code
    labels.postalCode=Postal Code
    labels.user_physicalDeliveryOfficeName=Office
    labels.physicalDeliveryOfficeName=Office
    labels.user_destinationIndicator=Destination Indicator
    labels.destinationIndicator=Destination Indicator
    labels.user_internationaliSDNNumber=International ISDN Number
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Sat Dec 13 02:21:17 UTC 2025
    - 44K bytes
    - Viewed (0)
  6. README.md

    - **FTP**: FTP server crawling with authentication
    - **SMB/CIFS**: Windows network shares
    - **Storage**: Cloud storage systems (MinIO, S3-compatible)
    
    ### Content Formats
    
    #### Office Documents
    - Microsoft Office (Word, Excel, PowerPoint)
    - OpenOffice/LibreOffice documents
    - RTF, WordPerfect
    
    #### PDFs and Images
    - PDF documents (text and metadata extraction)
    - Images (JPEG, PNG, GIF, TIFF, BMP)
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sun Aug 31 05:32:52 UTC 2025
    - 15.3K bytes
    - Viewed (0)
  7. fess-crawler/pom.xml

    			<artifactId>tika-parser-microsoft-module</artifactId>
    			<version>${tika.version}</version>
    		</dependency>
    		<dependency>
    			<groupId>org.apache.tika</groupId>
    			<artifactId>tika-parser-miscoffice-module</artifactId>
    			<version>${tika.version}</version>
    		</dependency>
    		<dependency>
    			<groupId>org.apache.tika</groupId>
    			<artifactId>tika-parser-news-module</artifactId>
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Sat Dec 20 06:34:36 UTC 2025
    - 12.1K bytes
    - Viewed (0)
  8. src/main/resources/fess_label.properties

    labels.user_pager=Pager
    labels.pager=Pager
    labels.user_street=Street
    labels.street=Street
    labels.user_postalCode=Postal Code
    labels.postalCode=Postal Code
    labels.user_physicalDeliveryOfficeName=Office
    labels.physicalDeliveryOfficeName=Office
    labels.user_destinationIndicator=Destination Indicator
    labels.destinationIndicator=Destination Indicator
    labels.user_internationaliSDNNumber=International ISDN Number
    Registered: Sat Dec 20 09:19:18 UTC 2025
    - Last Modified: Sat Dec 13 02:21:17 UTC 2025
    - 44K bytes
    - Viewed (0)
  9. CLAUDE.md

    - **File**: Local/network file systems
    - **FTP**: With authentication
    - **SMB/CIFS**: Windows shares (SMB1/SMB2+)
    - **Storage**: MinIO/S3-compatible
    
    ### Content Formats
    
    Office (Word, Excel, PowerPoint), PDF, Archives (ZIP, TAR, GZ), HTML, XML, JSON, Media (audio/video metadata), Images (EXIF/IPTC/XMP)
    
    ---
    
    ## Architecture
    
    ### Module Structure
    
    ```
    fess-crawler-parent/
    Registered: Sat Dec 20 11:21:39 UTC 2025
    - Last Modified: Fri Nov 28 17:31:34 UTC 2025
    - 10.7K bytes
    - Viewed (0)