Results 1 - 10 of 78 for specifies (0.03 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/RobotsTxt.java

        /**
         * Gets the crawl delay value for the specified user agent from robots.txt.
         * The crawl delay specifies the time (in seconds) to wait between successive requests.
         *
         * @param userAgent The user agent string to match against robots.txt directives
         * @return The crawl delay value in seconds. Returns 0 if no matching directive is found
         *         or no crawl delay is specified for the matching directive.
         */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10K bytes
    - Viewed (0)
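
The crawl-delay lookup described in result 1 could be sketched roughly as follows. This is a minimal illustration, not the RobotsTxt class from fess-crawler: the class name `CrawlDelaySketch`, the method `addDirective`, the substring-based user-agent matching, and the fallback to a `*` directive are all assumptions. Only the documented contract is taken from the Javadoc above: the value is in seconds, and 0 is returned when nothing matches.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; names and matching rules are assumptions, not the fess-crawler API.
final class CrawlDelaySketch {
    // Lower-cased user-agent token from robots.txt -> Crawl-delay in seconds.
    private final Map<String, Integer> delays = new LinkedHashMap<>();

    // Registers the crawl delay declared for one User-agent group.
    void addDirective(final String agentToken, final int crawlDelaySeconds) {
        delays.put(agentToken.toLowerCase(Locale.ROOT), crawlDelaySeconds);
    }

    // Returns the delay of the first non-wildcard directive whose token occurs in
    // userAgent; otherwise the "*" directive's delay (assumed fallback); otherwise 0.
    int getCrawlDelay(final String userAgent) {
        final String ua = userAgent == null ? "" : userAgent.toLowerCase(Locale.ROOT);
        for (final Map.Entry<String, Integer> e : delays.entrySet()) {
            if (!"*".equals(e.getKey()) && ua.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return delays.getOrDefault("*", 0);
    }
}
```

A caller would typically multiply the returned value by 1000 before sleeping between requests, since the value is expressed in seconds.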
  2. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlXpathExtractor.java

     *
     */
    public class HtmlXpathExtractor extends AbstractXmlExtractor {
        /**
         * Regular expression pattern to match the charset attribute in the meta tag of HTML documents.
         * The pattern captures the charset value specified in the content attribute of the meta tag.
         * Example: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
         */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10.3K bytes
    - Viewed (0)
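
The charset capture described in result 2 can be illustrated with a small regular expression. The excerpt does not show the actual pattern used by HtmlXpathExtractor, so the pattern below, the class name `MetaCharsetSketch`, and the method `findCharset` are assumptions made for this sketch; it only aims to handle the example meta tag quoted in the Javadoc.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real pattern in HtmlXpathExtractor is not shown in the excerpt.
final class MetaCharsetSketch {
    // Finds a <meta> tag whose content attribute contains "charset=..." and
    // captures the charset name in group 1. Matching is case-insensitive.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*content\\s*=\\s*[\"'][^\"'>]*charset\\s*=\\s*([\\w.:-]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns the declared charset name, or null if no such meta tag is present.
    static String findCharset(final String html) {
        final Matcher m = META_CHARSET.matcher(html);
        return m.find() ? m.group(1) : null;
    }
}
```

Applied to the example tag from the Javadoc, `findCharset` returns `UTF-8`; for a document without such a tag it returns null.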
  3. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/JodExtractor.java

        }
    
        /**
         * Gets the extractor for the specified file extension.
         *
         * @param ext the file extension
         * @return the extractor for the extension, or null if not found
         */
        private Extractor getExtractor(final String ext) {
            return extractorMap.get(ext);
        }
    
        /**
         * Gets the output extension for the specified input extension.
         *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10.3K bytes
    - Viewed (0)
  4. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/ResponseData.java

            responseBodyFile = responseBody;
            isTemporaryFile = isTemporary;
        }
    
        /**
         * Gets the character set of the response.
         *
         * @return the character set, or null if not specified
         */
        public String getCharSet() {
            return charSet;
        }
    
        /**
         * Sets the character set of the response.
         *
         * @param charSet the character set to set
         */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 11.6K bytes
    - Viewed (0)
  5. fess-crawler-opensearch/src/main/java/org/codelibs/fess/crawler/service/impl/OpenSearchUrlQueueService.java

            }
        }
    
        /**
         * Deletes all URL queue entries for the specified session.
         *
         * @param sessionId The session ID.
         */
        @Override
        public void delete(final String sessionId) {
            deleteBySessionId(sessionId);
        }
    
        /**
         * Offers multiple URL queue entries for the specified session.
         * Only URLs that don't already exist will be added.
         *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 17K bytes
    - Viewed (1)
  6. src/main/java/org/codelibs/fess/suggest/util/SuggestUtil.java

        /**
         * Extracts a list of TermQuery objects from the given Query object that match the specified fields.
         *
         * @param query the Query object to extract TermQuery objects from
         * @param fields an array of field names to match against the TermQuery objects
         * @return a list of TermQuery objects that match the specified fields, or an empty list if no matches are found
         */
    Registered: Fri Sep 19 09:08:11 UTC 2025
    - Last Modified: Mon Sep 01 13:33:03 UTC 2025
    - 17.4K bytes
    - Viewed (0)
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

     *   <li>Extracting child URLs from the HTML content based on configured rules.</li>
     *   <li>Handling redirect URLs specified in the response headers.</li>
     * </ol>
     * <p>
     * The class also provides methods for configuring features and properties of the
     * underlying DOM parser, as well as defining rules for extracting child URLs
     * from specific HTML tags and attributes.
     * </p>
     *
     * <p>
     * <b>Configuration:</b>
     * </p>
     * <ul>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 28.5K bytes
    - Viewed (0)
  8. fess-crawler/src/main/java/org/codelibs/fess/crawler/util/TextUtil.java

     * and processing them according to specific rules. The main functionality is encapsulated
     * within the nested {@link TextNormalizeContext} class.
     *
     * <p>The text normalization process includes:
     * <ul>
     *   <li>Treating ISO control characters and specified space characters as spaces.</li>
     *   <li>Appending alphanumeric characters (0-9, A-Z, a-z) to the buffer.</li>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 12K bytes
    - Viewed (0)
  9. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/storage/StorageClient.java

                }
            }
            throw new CrawlingAccessException("Invalid path: " + path);
        }
    
        /**
         * Retrieves response data for the specified URI.
         * @param uri the URI to retrieve data for
         * @param includeContent whether to include the actual content in the response
         * @return the response data containing metadata and optionally content
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 17.9K bytes
    - Viewed (2)
  10. fess-crawler/src/main/java/org/codelibs/fess/net/protocol/storage/Handler.java

            private String objectName;
            /** Cached object statistics response */
            private StatObjectResponse statObject;
    
            /**
             * Constructs a new StorageURLConnection for the specified URL.
             * This constructor parses the URL to extract bucket and object names.
             *
             * @param url The storage URL to connect to
             */
            protected StorageURLConnection(final URL url) {
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10.5K bytes
    - Viewed (0)