Results 11 - 20 of 90 for crawl (0.02 sec)
MIGRATION.md
- `GET /backup/export` - Export configurations - `PUT /documents/bulk` - Bulk document import - `GET /webconfig` - List web crawl configs - `POST /webconfig` - Create web crawl config - `PUT /webconfig/{id}` - Update web crawl config - `DELETE /webconfig/{id}` - Delete web crawl config - (Similar CRUD for `fileconfig`, `dataconfig`, `labeltype`, etc.) **Example - List All Web Configs**: ```bash
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Nov 06 12:40:11 UTC 2025 - 23.2K bytes - Viewed (0) -
build-logic-commons/code-quality-rules/src/main/resources/checkstyle/checkstyle-api.xml
~ See the License for the specific language governing permissions and ~ limitations under the License. --> <!DOCTYPE module PUBLIC "-//Puppy Crawl//DTD Check Configuration 1.2//EN" "http://www.puppycrawl.com/dtds/configuration_1_2.dtd"> <module name="Checker"> <module name="SuppressionFilter"> <property name="file" value="${config_loc}/suppressions.xml"/>
Registered: Wed Dec 31 11:36:14 UTC 2025 - Last Modified: Thu Nov 17 23:20:14 UTC 2022 - 1.6K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/app/web/admin/fileconfig/CreateForm.java
@Size(max = 200) public String name; /** The description of the file configuration (maximum 1000 characters). */ @Size(max = 1000) public String description; /** The file paths to crawl (required, must be valid file URIs). */ @Required @UriType(protocolType = ProtocolType.FILE) @CustomSize(maxKey = "form.admin.max.input.size") public String paths;
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Jul 17 08:28:31 UTC 2025 - 5.6K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/helper/DataIndexHelper.java
* specified in the configIdList parameter. * * @param sessionId unique identifier for this crawling session * @param configIdList list of data configuration IDs to crawl */ public void crawl(final String sessionId, final List<String> configIdList) { final List<DataConfig> configList = ComponentUtil.getCrawlingConfigHelper().getDataConfigListByIds(configIdList);
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Fri Nov 28 16:29:12 UTC 2025 - 19K bytes - Viewed (0) -
CLAUDE.md
extractorFactory.addExtractor("text/html", htmlExtractor, 2); // Weight 2 extractorFactory.addExtractor("text/html", tikaExtractor, 1); // Fallback ``` ### Helpers **RobotsTxtHelper**: RFC 9309 parsing, user-agent matching, crawl-delay, sitemaps **SitemapsHelper**: Sitemap XML parsing, index handling **MimeTypeHelper**: MIME detection via Tika **EncodingHelper**: Charset detection with BOM **UrlConvertHelper**: URL normalization ---
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Fri Nov 28 17:31:34 UTC 2025 - 10.7K bytes - Viewed (0) -
README.md
</components> ``` ### Crawler Context Configuration ```java // Set maximum number of URLs to crawl crawler.crawlerContext.setMaxAccessCount(1000); // Set number of crawler threads crawler.crawlerContext.setNumOfThread(10); // Set maximum crawl depth crawler.crawlerContext.setMaxDepth(3); // Set request interval (politeness)
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapUrl.java
* command. Even though search engine crawlers may consider this information * when making decisions, they may crawl pages marked "hourly" less * frequently than that, and they may crawl pages marked "yearly" more * frequently than that. Crawlers may periodically crawl pages marked * "never" so that they can handle unexpected changes to those pages. */ private String changefreq; /**
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Thu Nov 13 13:34:36 UTC 2025 - 9.1K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/app/web/admin/dataconfig/EditForm.java
* This form extends CreateForm to include fields necessary for updating existing data config entries, * including tracking information for optimistic locking and audit trails. * Data configs define how to crawl and extract data from databases, CSV files, and other data sources. * */ public class EditForm extends CreateForm { /** * Creates a new EditForm instance. */ public EditForm() {
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Jul 17 08:28:31 UTC 2025 - 2.3K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/app/web/api/admin/webconfig/SearchBody.java
/** * Default constructor. */ public SearchBody() { super(); } /** Name of the web crawling configuration */ public String name; /** URLs to crawl */ public String urls; /** Description of the web crawling configuration */ public String description;
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Thu Jul 17 08:28:31 UTC 2025 - 1.2K bytes - Viewed (0) -
src/main/java/org/codelibs/fess/ds/callback/FileListIndexUpdateCallbackImpl.java
/** * Constructs a new crawl request. * * @param url the URL to crawl * @param depth the depth of this URL in the crawling hierarchy */ CrawlRequest(final String url, final int depth) { this.url = url; this.depth = depth; } /** * Gets the URL of this crawl request. *
Registered: Sat Dec 20 09:19:18 UTC 2025 - Last Modified: Fri Nov 28 16:29:12 UTC 2025 - 29.7K bytes - Viewed (3)