Search Options

Results per page
Sort
Preferred Languages
Advance

Results 1 - 5 of 5 for crawl (0.05 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/helper/RobotsTxtHelper.java

    import org.codelibs.fess.crawler.exception.RobotsTxtException;
    
    /**
     * Robots.txt Specifications:
     * <ul>
     * <li><a href=
     * "https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt"
     * >https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
     * </a></li>
     * </ul>
     *
     * @author bowez
     * @author shinsuke
     *
     */
    public class RobotsTxtHelper {
    
    Registered: Sun Nov 10 03:50:12 UTC 2024
    - Last Modified: Sat Oct 12 01:40:57 UTC 2024
    - 6.1K bytes
    - Viewed (0)
  2. src/main/java/org/codelibs/fess/exec/Crawler.java

                    webFsCrawlerThread = new Thread((Runnable) () -> {
                        // crawl web
                        writeTimeToSessionInfo(crawlingInfoHelper, Constants.WEB_FS_CRAWLER_START_TIME);
                        webFsIndexHelper.crawl(options.sessionId, webConfigIdList, fileConfigIdList);
                        writeTimeToSessionInfo(crawlingInfoHelper, Constants.WEB_FS_CRAWLER_END_TIME);
    Registered: Thu Oct 31 13:40:30 UTC 2024
    - Last Modified: Fri Oct 11 21:20:39 UTC 2024
    - 24K bytes
    - Viewed (0)
  3. src/main/java/org/codelibs/fess/helper/WebFsIndexHelper.java

        protected int crawlerPriority = Thread.NORM_PRIORITY;
    
        protected final List<Crawler> crawlerList = Collections.synchronizedList(new ArrayList<>());
    
        public void crawl(final String sessionId, final List<String> webConfigIdList, final List<String> fileConfigIdList) {
            final boolean runAll = webConfigIdList == null && fileConfigIdList == null;
            final List<WebConfig> webConfigList;
    Registered: Thu Oct 31 13:40:30 UTC 2024
    - Last Modified: Fri Oct 11 21:11:58 UTC 2024
    - 22.6K bytes
    - Viewed (0)
  4. README.md

    Fess also contains a Crawler, which can crawl documents on a [web server](https://fess.codelibs.org/14.17/admin/webconfig-guide.html), [file system](https://fess.codelibs.org/14.17/admin/fileconfig-guide.html), or [Data Store](https://fess.codelibs.org/14.17/admin/dataconfig-guide.html) (such...
    Registered: Thu Oct 31 13:40:30 UTC 2024
    - Last Modified: Sat Oct 12 07:19:47 UTC 2024
    - 7.3K bytes
    - Viewed (0)
  5. guava/src/com/google/common/util/concurrent/AbstractScheduledService.java

     *     toCrawl = readStartingUris();
     *   }
     *
     *   protected void runOneIteration() throws Exception {
     *     Uri uri = toCrawl.remove();
     *     Collection<Uri> newUris = crawl(uri);
     *     visited.add(uri);
     *     for (Uri newUri : newUris) {
     *       if (!visited.contains(newUri)) { toCrawl.add(newUri); }
     *     }
     *   }
     *
     *   protected void shutDown() throws Exception {
    Registered: Fri Nov 01 12:43:10 UTC 2024
    - Last Modified: Fri Oct 25 16:22:21 UTC 2024
    - 27.8K bytes
    - Viewed (0)
Back to top