Results 111 - 120 of 275 for Crawling (0.25 sec)
src/main/java/org/codelibs/fess/mylasta/direction/FessConfig.java
/** The key of the configuration. e.g. 100 */ String PAGE_CRAWLING_INFO_PARAM_MAX_FETCH_SIZE = "page.crawling.info.param.max.fetch.size"; /** The key of the configuration. e.g. 1000 */ String PAGE_CRAWLING_INFO_MAX_FETCH_SIZE = "page.crawling.info.max.fetch.size"; /** The key of the configuration. e.g. 100 */
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Sat Dec 13 02:21:17 UTC 2025 - 525.7K bytes - Viewed (2) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/builder/RequestDataBuilderTest.java
RequestData data = RequestDataBuilder.newRequestData().url(null).build(); assertNull(data.getUrl()); } public void test_realWorldUsageExample1() { // Real-world example: crawling a web page RequestData data = RequestDataBuilder.newRequestData().get().url("https://example.com/article/12345").weight(1.0f).build(); assertNotNull(data); assertEquals(Method.GET, data.getMethod());
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Mon Nov 24 03:59:47 UTC 2025 - 10.9K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/interval/impl/HostIntervalControllerTest.java
import org.dbflute.utflute.core.PlainTestCase; /** * @author hayato * */ public class HostIntervalControllerTest extends PlainTestCase { /** * Test that crawling intervals for the same host work correctly. */ public void test_delayBeforeProcessing() { // Number of concurrent tasks final int numTasks = 100; // Interval in milliseconds
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Mon Nov 24 03:59:47 UTC 2025 - 11.4K bytes - Viewed (0) -
docs/de/README.md
 You can register crawling targets on the crawler configuration pages of the administration interface (Web, File, Data Store) and start the crawler manually on the [Scheduler page](https://fess.codelibs.org/15.3/admin/scheduler-guide.html). ## Migration from Another Search Provider
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Tue Nov 11 22:42:32 UTC 2025 - 7.8K bytes - Viewed (0) -
ADDING_NEW_LANGUAGE.md
2. **Browser header**: `Accept-Language` header 3. **Fallback**: English (from `fess_label.properties` and `fess_message.properties`) ### Document Language Detection During crawling and indexing, Fess: 1. Detects language from document content using Apache Tika 2. Validates against `supported.languages` list 3. Creates language-specific fields (e.g., `content_ja`, `title_en`, `content_sv`)
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Nov 06 11:36:30 UTC 2025 - 10.4K bytes - Viewed (1) -
fess-crawler-opensearch/src/test/java/org/codelibs/fess/crawler/service/impl/OpenSearchUrlQueueServiceTest.java
final String sessionId = "poll_session5"; final int maxSize = 5; urlQueueService.setMaxCrawlingQueueSize(maxSize); // Insert more items than max crawling queue size final List<OpenSearchUrlQueue> urlQueueList = new ArrayList<>(); for (int i = 1; i <= maxSize + 10; i++) { final OpenSearchUrlQueue urlQueue = new OpenSearchUrlQueue();
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Nov 20 08:40:57 UTC 2025 - 14.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/RobotsTxtHelper.java
* </ul> * * <p>References:</p> * <ul> * <li><a href="https://datatracker.ietf.org/doc/html/rfc9309">RFC 9309 - Robots Exclusion Protocol</a></li> * <li><a href="https://developers.google.com/search/docs/crawling-indexing/robots/robots_txt"> * Google's robots.txt Specification</a></li> * </ul> * * @author bowez * @author shinsuke * */ public class RobotsTxtHelper {
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Fri Nov 14 12:52:01 UTC 2025 - 11.4K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java
import org.w3c.dom.Node; import org.xml.sax.InputSource; import jakarta.annotation.Resource; /** * The {@code HtmlTransformer} class is responsible for transforming HTML responses * during the crawling process. It extracts data, identifies child URLs, and handles * character set encoding. * <p> * This class extends {@link AbstractTransformer} and utilizes various helper classes
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Nov 29 07:42:33 UTC 2025 - 30.5K bytes - Viewed (0) -
build-logic/kotlin-dsl-shared-runtime/src/main/kotlin/org/gradle/kotlin/dsl/internal/sharedruntime/codegen/ApiTypeProvider.kt
* Test if a method is a prime declaration or an overrides that change the signature. * * There's no way to tell from the byte code that a method overrides the signature * of a parent declaration other than crawling up the type hierarchy. */ private fun isSignificantDeclaration(methodNode: MethodNode): Boolean { if (methodNode.access.isSynthetic) return false if (!hasSuperType) return true
Registered: Wed Dec 31 11:36:14 UTC 2025 - Last Modified: Wed Mar 12 15:56:18 UTC 2025 - 20.2K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/transformer/TransformerTest.java
transformer.addTransformationRule("<[^>]+>", ""); // Remove HTML tags transformer.addTransformationRule("\\s+", " "); // Normalize whitespace // Simulate crawling response ResponseData responseData = new ResponseData(); responseData.setUrl("http://example.com/page.html"); responseData.setParentUrl("http://example.com/");
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sat Sep 06 04:15:37 UTC 2025 - 28K bytes - Viewed (0)