Search Options

Results per page
Sort
Preferred Languages
Advance

Results 11 - 19 of 19 for define (0.03 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/filter/impl/UrlFilterImpl.java

     * The class supports caching of include and exclude patterns for scenarios where a session ID is not available.
     * It also provides methods to initialize the filter with a session ID, clear the filter,
     * match a URL against the defined patterns, and process a URL to add include or exclude patterns based on predefined filtering patterns.
     *
     */
    /**
     * This class is an implementation of a URL filter.
     */
    public class UrlFilterImpl implements UrlFilter {
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 9.2K bytes
    - Viewed (0)
  2. fess-crawler/src/test/resources/extractor/eml/sample4.eml

     Fahrt nach Baruth / Glashütte
    für ausländische Studierende
    Das Team für Betreuung ausländischer Studierender führt 
    am
    Mittwoch, dem 22.06.2016
    eine Exkursion nach Baruth / Glashütte durch.
    Der Präsident
    Abteilung I -
    Studierendenservice
    Akademisches Auslandsamt
    Betreuung Internationaler Studierender
    Sekretariat ID 42
    Straße des 17. Juni 135
    10623 Berlin
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jan 07 09:15:11 UTC 2018
    - 681K bytes
    - Viewed (0)
  3. LICENSE

       TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
    
       1. Definitions.
    
          "License" shall mean the terms and conditions for use, reproduction,
          and distribution as defined by Sections 1 through 9 of this document.
    
          "Licensor" shall mean the copyright owner or entity authorized by
          the copyright owner that is granting the License.
    
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Mon Jan 11 04:26:17 UTC 2021
    - 11.1K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/SitemapUrl.java

     */
    package org.codelibs.fess.crawler.entity;
    
    import org.codelibs.core.lang.StringUtil;
    
    /**
     * Represents a URL entry within a sitemap.
     *
     * <p>
     * This class encapsulates the properties of a URL as defined in the sitemap XML format,
     * including its location, last modification date, change frequency, and priority.
     * It implements the {@link Sitemap} interface.
     * </p>
     *
     * <p>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 6.5K bytes
    - Viewed (0)
  5. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/AbstractCrawlerClient.java

    /**
     * Abstract base class for CrawlerClient implementations.
     * Provides common functionality for handling initialization parameters,
     * content length checks, and default method implementations.
     * It defines the basic structure and configuration options for crawler clients.
     */
    public abstract class AbstractCrawlerClient implements CrawlerClient {
    
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sat Sep 06 04:15:37 UTC 2025
    - 9.7K bytes
    - Viewed (10)
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/XpathTransformer.java

    import org.xml.sax.InputSource;
    
    /**
     * {@link XpathTransformer} is a class that transforms HTML content into XML format based on XPath expressions.
     * It extracts data from an HTML document by applying XPath rules defined in {@link #fieldRuleMap}.
     * The extracted data is then formatted into an XML structure and stored in the {@link ResultData}.
     * <p>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 13.1K bytes
    - Viewed (0)
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/RobotsTxt.java

    import java.util.List;
    import java.util.Map;
    import java.util.regex.Pattern;
    
    import org.codelibs.core.lang.StringUtil;
    
    /**
     * Represents a robots.txt file parser and handler.
     * This class manages the rules defined in a robots.txt file, including user agent directives,
     * allowed/disallowed paths, crawl delays, and sitemap URLs.
     *
     * <p>The robots.txt protocol is implemented according to the standard specification,
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10K bytes
    - Viewed (0)
  8. fess-crawler/src/main/resources/org/codelibs/fess/crawler/mime/tika-mimetypes.xml

      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
    -->
    <!--
      Description: This xml file defines the valid mime types used by Tika.
      The mime type data within this file is based on information from various
      sources like Apache Nutch, Apache HTTP Server, the file(1) command, etc.
    
      Notes:
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Mar 13 08:18:01 UTC 2025
    - 320.1K bytes
    - Viewed (2)
  9. fess-crawler/src/test/resources/extractor/eml/sample2.eml

    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, minimum-scale=1.0, maximum-scale=1.0, user-scalable=0" />
    <meta name="apple-mobile-web-app-capable" content="yes" />
    <style type="text/css">
    
    @media only screen and (max-width: 420px) {
    a[class="article-headline"] {
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sat Jan 16 07:50:35 UTC 2016
    - 91.6K bytes
    - Viewed (0)
Back to top