- Sort Score
- Result 10 results
- Languages All
Results 31 - 40 of 45 for tiklab (0.27 sec)
-
README.md
Registered: Sun Dec 28 07:19:09 UTC 2025 - Last Modified: Thu Dec 25 11:01:37 UTC 2025 - 26.4K bytes - Viewed (0) -
CLAUDE.md
- **Build**: Maven 3.x - **License**: Apache 2.0 - **DI**: LastaFlute DI - **Repo**: https://github.com/codelibs/fess-crawler ### Tech Stack - **HTTP**: Apache HttpComponents 4.5+ - **Extraction**: Apache Tika 3.0+, POI 5.3+, PDFBox 3.0+ - **Testing**: JUnit 4, UTFlute, Mockito 5.7.0 - **Storage**: In-memory (default), OpenSearch (optional) ### Protocols - **HTTP/HTTPS**: Full crawling, cookies, auth, robots.txt
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Fri Nov 28 17:31:34 UTC 2025 - 10.7K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/crawler/transformer/FessStandardTransformer.java
public Logger getLogger() { return logger; } /** * Gets the appropriate extractor for the given response data. * Selects an extractor based on the MIME type or falls back to the Tika extractor. * * @param responseData the response data containing the document to extract * @return the extractor instance for processing the document
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Fri Nov 28 16:29:12 UTC 2025 - 3.8K bytes - Viewed (0) -
docs/ja/docs/project-generation.md
* PostgreSQLデータベースのための**PGAdmin**。(PHPMyAdminとMySQLを使用できるように簡単に変更可能) * Celeryジョブ監視のための**Flower**。 * **Traefik**を使用してフロントエンドとバックエンド間をロードバランシング。同一ドメインに配置しパスで区切る、ただし、異なるコンテナで処理。 * Traefik統合。Let's Encrypt **HTTPS**証明書の自動生成を含む。 * GitLab **CI** (継続的インテグレーション)。フロントエンドおよびバックエンドテストを含む。 ## フルスタック FastAPI Couchbase
Registered: Sun Dec 28 07:19:09 UTC 2025 - Last Modified: Mon Jul 29 23:35:07 UTC 2024 - 7.1K bytes - Viewed (0) -
ADDING_NEW_LANGUAGE.md
3. **Fallback**: English (from `fess_label.properties` and `fess_message.properties`) ### Document Language Detection During crawling and indexing, Fess: 1. Detects language from document content using Apache Tika 2. Validates against `supported.languages` list 3. Creates language-specific fields (e.g., `content_ja`, `title_en`, `content_sv`) 4. Applies language-specific analyzers for better search results
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Nov 06 11:36:30 UTC 2025 - 10.4K bytes - Viewed (1) -
README.md
## Technology Stack - **Java**: 21+ (requires Java 21 or higher) - **Build System**: Maven 3.x - **DI Container**: LastaFlute DI - **HTTP Client**: Apache HttpComponents - **Content Extraction**: Apache Tika, Apache POI, PDFBox - **Testing**: JUnit 4, UTFlute, Testcontainers - **Storage Backends**: OpenSearch, Memory-based ## Quick Start ### Prerequisites - Java 21 or higher - Maven 3.6 or higher
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/job/CrawlJob.java
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Fri Nov 28 16:29:12 UTC 2025 - 19.6K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractorTest.java
final String content = extractData.getContent(); CloseableUtil.closeQuietly(in); logger.info(content); assertTrue(content.contains("テスト")); } // TODO tika needs to support pdfbox 2.0 // public void test_getTika_pdf() { // final InputStream in = ResourceUtil // .getResourceAsStream("extractor/test.pdf");
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 30.6K bytes - Viewed (0) -
okhttp/src/jvmTest/resources/okhttp3/internal/publicsuffix/public_suffix_list.dat
gsj.bz // GitHub, Inc. // Submitted by Patrick Toomey <******@****.***> githubusercontent.com githubpreview.dev github.io // GitLab, Inc. // Submitted by Alex Hanselka <alex@gitlab.com> gitlab.io // Gitplac.si - https://gitplac.si // Submitted by Aljaž Starc <******@****.***> gitapp.si gitpage.si // Glitch, Inc : https://glitch.com
Registered: Fri Dec 26 11:42:13 UTC 2025 - Last Modified: Fri Dec 27 13:39:56 UTC 2024 - 309.7K bytes - Viewed (1) -
src/main/resources/fess_config.properties
# Type of hot thread monitoring (e.g., cpu). crawler.hotthread.type=cpu # Metadata fields to exclude from document content. crawler.metadata.content.excludes=resourceName,X-Parsed-By,Content-Encoding.*,Content-Type.*,X-TIKA.*,X-FESS.* # Mapping for document metadata names. crawler.metadata.name.mapping=\ title=title:string\n\ Title=title:string\n\ dc:title=title:string\n\ # html
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Dec 11 09:47:03 UTC 2025 - 54.8K bytes - Viewed (0)