Results 1 - 10 of 34 for Extraction (0.11 seconds)
README.md
## Overview **Fess Crawler** is a powerful, flexible Java-based web crawling framework designed for enterprise-scale content extraction and processing. Built with a modular architecture, it supports multiple protocols (HTTP/HTTPS, File System, FTP, SMB, Cloud Storage) and provides extensive content extraction capabilities from various document formats. ### Key Features
Created: Sun Apr 12 03:50:13 GMT 2026 - Last Modified: Sun Aug 31 05:32:52 GMT 2025 - 15.3K bytes
src/test/java/jcifs/smb1/smb1/SmbFileTest.java
// Test file name extraction
assertEquals("file.txt", new SmbFile("smb1://server/share/file.txt").getName());
// Test directory name extraction (should include trailing slash)
assertEquals("dir/", new SmbFile("smb1://server/share/dir/").getName());
// Test share name extraction
assertEquals("share/", new SmbFile("smb1://server/share/").getName());
Created: Sun Apr 05 00:10:12 GMT 2026 - Last Modified: Thu Aug 14 05:31:44 GMT 2025 - 8.5K bytes
internal/s3select/jstream/README.md
`jstream` is a streaming JSON parser and value extraction library for Go. Unlike most JSON parsers, `jstream` is document position- and depth-aware -- this enables the extraction of values at a specified depth, eliminating the overhead of allocating encompassing arrays or objects; e.g: Using the below example document:
Created: Sun Apr 05 19:28:12 GMT 2026 - Last Modified: Mon Sep 23 19:35:41 GMT 2024 - 3.2K bytes
CLAUDE.md
**Fess Crawler** is a Java-based web crawling framework for enterprise content extraction. ### Essential Info - **Language**: Java 21+ - **Build**: Maven 3.x - **License**: Apache 2.0 - **DI**: LastaFlute DI - **Repo**: https://github.com/codelibs/fess-crawler ### Tech Stack - **HTTP**: Apache HttpComponents 4.5+ and 5.x (switchable) - **Extraction**: Apache Tika, POI, PDFBox
Created: Sun Apr 12 03:50:13 GMT 2026 - Last Modified: Thu Mar 12 03:39:20 GMT 2026 - 8.1K bytes
src/main/java/org/codelibs/fess/crawler/transformer/FessXpathTransformer.java
} return new URL(currentUrl); } /** * Gets child URL extraction rules from configuration. * * @param responseData the response data from crawling * @param resultData the result data * @return stream of tag-attribute pairs for URL extraction */ @Override
Created: Tue Mar 31 13:07:34 GMT 2026 - Last Modified: Thu Mar 12 01:46:45 GMT 2026 - 55.3K bytes
src/main/java/org/codelibs/fess/crawler/transformer/FessFileTransformer.java
import jakarta.annotation.PostConstruct; /** * File transformer implementation for the Fess search engine. * This transformer handles file-based document transformation and content extraction * using the Fess file transformation process with support for various file types. * * <p>It extends AbstractFessFileTransformer to provide specialized file processing
Created: Tue Mar 31 13:07:34 GMT 2026 - Last Modified: Fri Nov 28 16:29:12 GMT 2025 - 3.5K bytes
src/main/java/org/codelibs/fess/crawler/transformer/FessStandardTransformer.java
import org.codelibs.fess.util.ComponentUtil; import jakarta.annotation.PostConstruct; /** * Standard transformer implementation for the Fess search engine. * This transformer handles document transformation and content extraction using * the standard Fess file transformation process with support for various content types. * * <p>It extends AbstractFessFileTransformer to provide file-specific transformation
Created: Tue Mar 31 13:07:34 GMT 2026 - Last Modified: Fri Nov 28 16:29:12 GMT 2025 - 3.8K bytes
.teamcity/scripts/CheckWrapper.java
private static final Pattern ALLOWED_WRAPPER_VERSION = Pattern.compile("^[0-9.]+(-(rc|milestone|m)-[0-9]+)?$");
// Keep the same extraction semantics as the old sed:
// sed 's/.*gradle-\(.*\)-[a-z]*\.[a-z]*/\1/'
private static final Pattern WRAPPER_VERSION_EXTRACT = Pattern.compile(".*gradle-(.*)-[a-z]*\\.[a-z]*");
Created: Wed Apr 01 11:36:16 GMT 2026 - Last Modified: Tue Jan 20 03:53:25 GMT 2026 - 6.4K bytes
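The two patterns in this snippet suggest a two-step check: pull the version out of a wrapper distribution URL, then validate its shape. The following is a minimal sketch of how that could work, not the actual body of `CheckWrapper.java`. The class name `WrapperVersionSketch`, the helper methods `extractVersion` and `isAllowed`, the use of `Matcher.matches()`, and the sample URLs are all assumptions made for illustration; only the two regular expressions are taken from the snippet.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the indexed repository.
public class WrapperVersionSketch {

    // Both patterns are copied verbatim from the search snippet.
    private static final Pattern ALLOWED_WRAPPER_VERSION =
            Pattern.compile("^[0-9.]+(-(rc|milestone|m)-[0-9]+)?$");
    private static final Pattern WRAPPER_VERSION_EXTRACT =
            Pattern.compile(".*gradle-(.*)-[a-z]*\\.[a-z]*");

    // Assumed helper: returns the text captured between "gradle-" and the
    // trailing "-<type>.<ext>" suffix, or empty if the input does not match.
    static Optional<String> extractVersion(String distributionUrl) {
        Matcher m = WRAPPER_VERSION_EXTRACT.matcher(distributionUrl);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(m.group(1));
    }

    // Assumed helper: true only for plain versions or rc/milestone/m builds.
    static boolean isAllowed(String version) {
        return ALLOWED_WRAPPER_VERSION.matcher(version).matches();
    }

    public static void main(String[] args) {
        // Sample inputs are illustrative, not taken from the repository.
        String[] urls = {
            "https://services.gradle.org/distributions/gradle-8.5-bin.zip",
            "https://services.gradle.org/distributions/gradle-8.5-rc-1-bin.zip",
            "https://services.gradle.org/distributions/gradle-8.5-beta-1-bin.zip"
        };
        for (String url : urls) {
            Optional<String> version = extractVersion(url);
            String verdict = version.map(v -> v + " allowed=" + isAllowed(v))
                                    .orElse("no version found");
            System.out.println(url + " -> " + verdict);
        }
    }
}
```

Because the capture group is greedy, it extends to the last hyphen that is followed by a `<letters>.<letters>` suffix, so a qualifier such as `-rc-1` stays inside the extracted version and is then judged by the allow-list pattern.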
impl/maven-cli/src/test/java/org/apache/maven/cling/invoker/mvnup/goals/GAVUtilsTest.java
/** * Tests Artifact extraction, computation, and parent resolution functionality. */ @DisplayName("GAVUtils") class GAVUtilsTest { @BeforeEach void setUp() {} private UpgradeContext createMockContext() { return TestUtils.createMockContext(); } @Nested @DisplayName("Artifact Extraction") class GAVExtractionTests { @Test
Created: Sun Apr 05 03:35:12 GMT 2026 - Last Modified: Tue Nov 18 18:03:26 GMT 2025 - 17.3K bytes
src/main/java/org/codelibs/fess/helper/ThemeHelper.java
import org.codelibs.fess.helper.PluginHelper.ArtifactType; import org.codelibs.fess.util.ResourceUtil; /** * Helper class for managing theme installation and uninstallation. * Handles the extraction and deployment of theme files from JAR artifacts. */ public class ThemeHelper { private static final Logger logger = LogManager.getLogger(ThemeHelper.class); /** * Default constructor for ThemeHelper.
Created: Tue Mar 31 13:07:34 GMT 2026 - Last Modified: Fri Nov 28 16:29:12 GMT 2025 - 7.1K bytes