Results 21 - 27 of 27 for domain (0.04 sec)

  1. fess-crawler/src/test/java/org/codelibs/fess/crawler/exception/CrawlerSystemExceptionTest.java

         */
        public void test_stackTraceWithCause() {
            Exception cause = new IllegalArgumentException("Cause exception");
            CrawlerSystemException exception = new CrawlerSystemException("Main exception", cause);
    
            StackTraceElement[] mainStackTrace = exception.getStackTrace();
            StackTraceElement[] causeStackTrace = cause.getStackTrace();
    
            assertNotNull(mainStackTrace);
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Wed Sep 03 14:42:53 UTC 2025
    - 20K bytes
    - Viewed (0)
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/Crawler.java

    import org.codelibs.fess.crawler.service.DataService;
    import org.codelibs.fess.crawler.service.UrlQueueService;
    
    import jakarta.annotation.Resource;
    
    /**
     * The Crawler class is the main class for web crawling. It manages the crawling process,
     * including adding URLs to the queue, filtering URLs, managing crawler threads,
     * and handling the overall crawling lifecycle.
     *
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 14K bytes
    - Viewed (0)
  3. LICENSE

          editorial revisions, annotations, elaborations, or other modifications
          represent, as a whole, an original work of authorship. For the purposes
          of this License, Derivative Works shall not include works that remain
          separable from, or merely link (or bind by name) to the interfaces of,
          the Work and Derivative Works thereof.
    
          "Contribution" shall mean any work of authorship, including
    - Registered: Fri Sep 19 09:08:11 UTC 2025
    - Last Modified: Mon Jan 11 04:30:09 UTC 2021
    - 11.1K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/util/TextUtil.java

    /**
     * Utility class for text normalization and processing.
     *
     * This class provides methods to normalize text by reading characters from a provided Reader
     * and processing them according to specific rules. The main functionality is encapsulated
     * within the nested {@link TextNormalizeContext} class.
     *
     * <p>The text normalization process includes:
     * <ul>
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 12K bytes
    - Viewed (0)
  5. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/FileTransformer.java

     * {@link org.codelibs.fess.crawler.exception.CrawlerSystemException} in case of errors.
     * </p>
     *
     * <p>
     * The {@link #storeData(ResponseData, ResultData)} method is the main entry point for storing
     * the content of a crawled resource. The {@link #getData(AccessResultData)} method retrieves
     * the stored file path as a File object.
     * </p>
     */
    public class FileTransformer extends HtmlTransformer {
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 11.7K bytes
    - Viewed (0)
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

     *   <li><b>invalidUrlPattern:</b> A regular expression pattern used to identify
     *       invalid URLs.</li>
     * </ul>
     *
     * <p>
     * <b>Usage:</b>
     * </p>
     * <p>
     * The {@code transform} method is the main entry point for transforming an HTML
     * response. It takes a {@link ResponseData} object as input and returns a
     * {@link ResultData} object containing the extracted data and child URLs.
     * </p>
     */
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 28.5K bytes
    - Viewed (0)
  7. fess-crawler/src/main/resources/org/codelibs/fess/crawler/mime/tika-mimetypes.xml

      Notes:
       * Tika supports a wider range of match types than Freedesktop does
       * Glob patterns must be unique, if there's a clash assign to the most
         popular format
       * The main mime type should be the canonical one, use aliases for any
         other widely used forms
       * Where there's a hierarchy in the types, list it via a parent
       * Highly specific magic matches get a high priority
    - Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Mar 13 08:18:01 UTC 2025
    - 320.1K bytes
    - Viewed (1)