Results 1 - 10 of 13 for tika (0.02 sec)

  1. fess-crawler/src/main/resources/org/codelibs/fess/crawler/mime/tika-mimetypes.xml

      <!--  an OLE2 (application/x-tika-msoffice) container. -->
      <!--  They are logically subclasses of (application/x-tika-ooxml),
            but their containers are literally subclasses
            of (application/x-tika-msoffice) -->
      <mime-type type="application/x-tika-ooxml-protected">
        <sub-class-of type="application/x-tika-msoffice"/>
        <_comment>Password Protected OOXML File</_comment>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Mar 13 08:18:01 UTC 2025
    - 320.1K bytes
    - Viewed (1)
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/ExtractData.java

    import java.util.Map;
    import java.util.Set;
    
    import org.apache.tika.metadata.ClimateForcast;
    import org.apache.tika.metadata.CreativeCommons;
    import org.apache.tika.metadata.Geographic;
    import org.apache.tika.metadata.HttpHeaders;
    import org.apache.tika.metadata.Message;
    import org.apache.tika.metadata.TIFF;
    import org.apache.tika.metadata.TikaCoreProperties;
    import org.apache.tika.metadata.TikaMimeKeys;
    
    /**
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sat Sep 06 04:15:37 UTC 2025
    - 3.8K bytes
    - Viewed (0)
  3. fess-crawler/pom.xml

    			<artifactId>tika-parser-html-module</artifactId>
    			<version>${tika.version}</version>
    		</dependency>
    		<dependency>
    			<groupId>org.apache.tika</groupId>
    			<artifactId>tika-parser-image-module</artifactId>
    			<version>${tika.version}</version>
    		</dependency>
    		<dependency>
    			<groupId>org.apache.tika</groupId>
    			<artifactId>tika-parser-mail-module</artifactId>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sat Sep 06 04:15:37 UTC 2025
    - 11.3K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/impl/MimeTypeHelperImpl.java

    import java.io.BufferedInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.HashMap;
    import java.util.Map;
    
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.mime.MediaType;
    import org.apache.tika.mime.MimeTypes;
    import org.apache.tika.mime.MimeTypesFactory;
    import org.codelibs.core.lang.StringUtil;
    import org.codelibs.fess.crawler.entity.ExtractData;
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 6.5K bytes
    - Viewed (0)
  5. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractor.java

    import org.apache.logging.log4j.Logger;
    import org.apache.tika.config.TikaConfig;
    import org.apache.tika.detect.Detector;
    import org.apache.tika.exception.TikaException;
    import org.apache.tika.extractor.EmbeddedDocumentExtractor;
    import org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor;
    import org.apache.tika.io.TemporaryResources;
    import org.apache.tika.io.TikaInputStream;
    import org.apache.tika.metadata.Metadata;
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 30.7K bytes
    - Viewed (0)
  6. fess-crawler-lasta/src/main/resources/crawler/extractor.xml

    				"application/x-texinfo",
    				"application/x-tika-msoffice",
    				"application/x-tika-msoffice-embedded",
    				"application/x-tika-msoffice-embedded;format=ole10_native",
    				"application/x-tika-msoffice-embedded;format=comp_obj",
    				"application/x-tika-msworks-spreadsheet",
    				"application/x-tika-ooxml",
    				"application/x-tika-ooxml-protected",
    				"application/x-tika-staroffice",
    				"application/x-uc2-compressed",
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sat Aug 01 21:40:30 UTC 2020
    - 49K bytes
    - Viewed (0)
  7. README.md

    ## Technology Stack
    
    - **Java**: 21+ (requires Java 21 or higher)
    - **Build System**: Maven 3.x
    - **DI Container**: LastaFlute DI
    - **HTTP Client**: Apache HttpComponents
    - **Content Extraction**: Apache Tika, Apache POI, PDFBox
    - **Testing**: JUnit 4, UTFlute, Testcontainers
    - **Storage Backends**: OpenSearch, Memory-based
    
    ## Quick Start
    
    ### Prerequisites
    
    - Java 21 or higher
    - Maven 3.6 or higher
    
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Aug 31 05:32:52 UTC 2025
    - 15.3K bytes
    - Viewed (0)
  8. fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractorTest.java

            final String content = extractData.getContent();
            CloseableUtil.closeQuietly(in);
            logger.info(content);
            assertTrue(content.contains("テスト"));
        }
    
        // TODO tika needs to support pdfbox 2.0
        //    public void test_getTika_pdf() {
        //        final InputStream in = ResourceUtil
        //                .getResourceAsStream("extractor/test.pdf");
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 30.6K bytes
    - Viewed (0)
  9. docs/id/docs/tutorial/path-params.md

    Path "parameters" or "variables" are declared using Python format string syntax:
    
    {* ../../docs_src/path_params/tutorial001.py hl[6:7] *}
    
    The value of the path parameter `item_id` is passed to the function as the argument `item_id`:
    
    If you run this example and go to <a href="http://127.0.0.1:8000/items/foo" class="external-link" target="_blank">http://127.0.0.1:8000/items/foo</a>, you will see the response:
    
    ```JSON
    {"item_id":"foo"}
    ```
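
The mapping from a `{item_id}` placeholder to a function argument can be sketched with the standard library alone. This is a minimal illustration, not the framework's actual implementation: the `route` decorator, the `dispatch` helper, and the `_ROUTES` registry are hypothetical names introduced here, and the sketch assumes each placeholder matches exactly one path segment.

```python
import re
from typing import Any, Callable, List, Optional, Pattern, Tuple

# Hypothetical registry of (compiled pattern, handler) pairs.
_ROUTES: List[Tuple[Pattern[str], Callable[..., Any]]] = []


def route(template: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Register a handler for a template such as "/items/{item_id}"."""
    # Turn each {name} placeholder into a named group matching one path segment.
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    compiled = re.compile(f"^{pattern}$")

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        _ROUTES.append((compiled, func))
        return func

    return decorator


def dispatch(path: str) -> Optional[Any]:
    """Call the first matching handler, passing placeholders as keyword arguments."""
    for compiled, func in _ROUTES:
        match = compiled.match(path)
        if match:
            return func(**match.groupdict())
    return None  # no route matched


@route("/items/{item_id}")
def read_item(item_id: str) -> dict:
    # The placeholder value arrives under the same name as the argument.
    return {"item_id": item_id}


print(dispatch("/items/foo"))
```

Because the placeholder name becomes the keyword argument name, renaming one without the other would raise a `TypeError` at dispatch time in this sketch.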
    
    Registered: Sun Sep 07 07:19:17 UTC 2025
    - Last Modified: Sun Aug 31 10:29:01 UTC 2025
    - 8.8K bytes
    - Viewed (0)
  10. docs/id/docs/index.md

    * <a href="https://jinja.palletsprojects.com" target="_blank"><code>jinja2</code></a> - Required if you want to use the default template configuration.
    Registered: Sun Sep 07 07:19:17 UTC 2025
    - Last Modified: Sun Aug 31 10:49:48 UTC 2025
    - 20.5K bytes
    - Viewed (0)