Results 1 - 8 of 8 for validation (0.04 sec)
fess-crawler/src/main/java/org/codelibs/fess/crawler/rule/impl/SitemapsRule.java
* The rule checks if the URL matches the defined regex pattern and then validates the content as a sitemap. * If any exception occurs during the sitemap validation, it logs the error and returns false. * */ public class SitemapsRule extends RegexRule { /** * Serial version UID for serialization. */ private static final long serialVersionUID = 1L;
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 2.6K bytes - Viewed (0) -
README.md
- **Extractors**: Content extraction from various formats - **Transformers**: Data transformation and enrichment - **Filters**: URL filtering with regex patterns - **Rules**: Content processing rules and validation ## Building and Testing ### Build Commands ```bash # Build all modules mvn clean install # Build without tests mvn clean install -DskipTests # Build specific module
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Aug 31 05:32:52 UTC 2025 - 15.3K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/storage/StorageClient.java
* * <p>Features: * <ul> * <li>Automatic initialization of MinIO client</li> * <li>Support for HEAD and GET operations</li> * <li>Content length validation</li> * <li>MIME type detection</li> * <li>Handling of large files through temporary file storage</li> * <li>Object metadata and tags retrieval</li> * <li>Directory listing capabilities</li> * </ul> *
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 17.9K bytes - Viewed (2) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/ExtractorBuilder.java
* for configuring the extraction parameters and handling the underlying complexities of content processing, * such as MIME type detection, extractor selection, and content length validation. * </p> * * <p> * Example usage: * </p> * * <pre> * {@code * try (InputStream in = new FileInputStream("example.pdf")) {
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 10.1K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/XmlTransformer.java
} /** * Returns the validating. * * @return the validating */ public boolean isValidating() { return validating; } /** * Sets the validating. * * @param validating the validating to set */ public void setValidating(final boolean validating) { this.validating = validating; } /**
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 23.9K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java
logger.warn("Could not get urls: (" + xpath + ", " + attr + ")", e); } return urlList; } /** * Adds a child URL to the URL list after processing and validation. * * @param urlList the list to add the URL to * @param url the base URL for resolving relative URLs * @param attrValue the attribute value containing the URL
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 28.5K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/client/http/HcHttpClient.java
* <li>COOKIES_PROPERTY: Cookie settings.</li> * <li>AUTH_SCHEME_PROVIDERS_PROPERTY: Authentication scheme providers.</li> * <li>IGNORE_SSL_CERTIFICATE_PROPERTY: Ignore SSL certificate validation.</li> * <li>DEFAULT_MAX_CONNECTION_PER_ROUTE_PROPERTY: Default maximum connections per route.</li> * <li>MAX_TOTAL_CONNECTION_PROPERTY: Maximum total connections.</li>
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Thu Aug 07 02:55:08 UTC 2025 - 52.2K bytes - Viewed (0) -
fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/SitemapsHelper.java
import org.xml.sax.Attributes; import org.xml.sax.SAXNotRecognizedException; import org.xml.sax.SAXNotSupportedException; import org.xml.sax.helpers.DefaultHandler; /** * Helper class for parsing and validating sitemaps. * It supports XML sitemaps, XML sitemap indexes, and text sitemaps, * and can handle GZIP compressed sitemaps. * The class provides methods to check if an input stream is a valid sitemap,
Registered: Sun Sep 21 03:50:09 UTC 2025 - Last Modified: Sun Jul 06 02:13:03 UTC 2025 - 14.7K bytes - Viewed (0)