Results 1 - 9 of 9 for sites (0.01 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapFile.java
* crawler may only retrieve Sitemaps that were modified since a certain * date. This incremental Sitemap fetching mechanism allows for the rapid * discovery of new URLs on very large sites. */ private String lastmod; /** * Creates a new SitemapFile instance. */ public SitemapFile() { // Default constructor } /* * (non-Javadoc)
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 4.4K bytes - Viewed (1) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapUrl.java
* the same site, so you can use this tag to increase the likelihood that * your most important pages are present in a search index. * * Also, please note that assigning a high priority to all of the URLs on * your site is not likely to help you. Since the priority is relative, it * is only used to select between URLs on your site. */ private String priority;
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 6.5K bytes - Viewed (0) -
README.md
```java
// Create multiple crawler instances
Crawler crawler1 = container.getComponent("crawler");
crawler1.setSessionId("session1");
crawler1.addUrl("https://site1.com");

Crawler crawler2 = container.getComponent("crawler");
crawler2.setSessionId("session2");
crawler2.addUrl("https://site2.com");

// Execute concurrently
crawler1.setBackground(true);
crawler2.setBackground(true);
String sessionId1 = crawler1.execute();
```
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
README.md
- Verify analyzer configurations **Performance Issues** - Increase thread pool size in SuggesterBuilder - Optimize batch sizes for indexing operations - Review OpenSearch cluster performance **Memory Usage** - Configure appropriate JVM heap settings - Monitor index sizes and optimize mappings - Use streaming for large data imports ### Debug Logging Enable debug logging for detailed troubleshooting:
Registered: Fri Sep 19 09:08:11 UTC 2025 - Last Modified: Sun Aug 31 03:31:14 UTC 2025 - 12.1K bytes - Viewed (1) -
fess-crawler/src/test/resources/ajax/index.html
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Oct 11 02:16:55 UTC 2015 - 1.5K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/filter/UrlFilterTest.java
urlFilter.init(sessionId); // Create multiple threads adding patterns Thread thread1 = new Thread(() -> { for (int i = 0; i < 100; i++) { urlFilter.addInclude("https://site" + i + ".com/.*"); } }); Thread thread2 = new Thread(() -> { for (int i = 0; i < 100; i++) { urlFilter.addExclude(".*\\.type" + i + "$"); }
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Wed Sep 03 14:42:53 UTC 2025 - 19K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/util/TextUtil.java
* </ul> * * <p>The {@link TextNormalizeContext} class provides a fluent API to configure the text * normalization process, including setting initial buffer capacity, maximum term sizes, * duplicate term removal, and custom space characters. * * <p>Example usage: * <pre>{@code * Reader reader = new StringReader("Example text to normalize."); * String normalizedText = TextUtil.normalizeText(reader)
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 12K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/TikaExtractor.java
* <ul> * <li>Output encoding</li> * <li>Maximum compression ratio and uncompression size</li> * <li>Initial buffer size</li> * <li>Memory size for temporary file storage</li> * <li>Maximum term sizes for alphanumeric and symbolic terms</li> * <li>Custom Tika configuration</li> * <li>Tesseract OCR configuration for image-based documents</li> * <li>PDF Parser configuration for PDF documents</li> * </ul> *
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 30.7K bytes - Viewed (0) -
fess-crawler/src/test/java/org/codelibs/fess/crawler/CrawlerContextTest.java
// Simulate crawling progress for (int i = 0; i < 100; i++) { crawlerContext.incrementAndGetAccessCount(); crawlerContext.getRobotsTxtUrlSet().add("http://site" + i + ".com/robots.txt"); } // Add sitemaps crawlerContext.addSitemaps(new String[] { "http://example.com/sitemap.xml" }); // Complete crawling
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sat Sep 06 04:15:37 UTC 2025 - 25.6K bytes - Viewed (0)