Results 1 - 10 of 49 for from (0.01 sec)

  1. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/PdfExtractor.java

    /**
     * PdfExtractor extracts text content from PDF files using Apache PDFBox.
     * It supports password-protected PDFs and can extract embedded documents and annotations.
     *
     * <p>The extractor runs text extraction in a separate thread with a configurable timeout
     * to prevent hanging on problematic PDF files. It also extracts metadata from the PDF
     * document and includes it in the extraction result.
     *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 12.7K bytes
    - Viewed (0)
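
The Javadoc in result 1 describes running text extraction in a separate thread with a configurable timeout. A minimal sketch of that general pattern, using only `java.util.concurrent`, might look like the following. The class `TimeoutExtractionSketch` and the method `extractWithTimeout` are hypothetical names introduced for illustration; this is not the PdfExtractor implementation, and the PDFBox call is replaced by an arbitrary `Callable`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of fess-crawler.
final class TimeoutExtractionSketch {

    /**
     * Runs the given extraction task on a separate daemon thread and waits
     * at most timeoutMillis for its result.
     */
    static String extractWithTimeout(Callable<String> task, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "extract-sketch");
            t.setDaemon(true); // do not keep the JVM alive for a stuck task
            return t;
        });
        Future<String> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            throw e;
        } catch (ExecutionException e) {
            // Surface the task's own failure rather than the wrapper.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A fast task completes normally.
        System.out.println(extractWithTimeout(() -> "extracted text", 1000L));
        // A slow task is abandoned once the timeout elapses.
        try {
            extractWithTimeout(() -> {
                Thread.sleep(5000L);
                return "too late";
            }, 100L);
        } catch (TimeoutException e) {
            System.out.println("extraction timed out");
        }
    }
}
```

Cancelling with `future.cancel(true)` only interrupts the worker; a task that ignores interruption keeps running, which is why the sketch also marks the thread as a daemon.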
  2. src/main/java/org/codelibs/fess/suggest/settings/ArraySettings.java

     * <li>{@link #addToArrayIndex(String, String, String, Map)}: Adds a map to the array index.</li>
     * <li>{@link #deleteKeyFromArray(String, String, String)}: Deletes all entries associated with the specified key from the array index.</li>
     * <li>{@link #deleteFromArray(String, String, String)}: Deletes a specific entry from the array index based on the ID.</li>
    Registered: Fri Sep 19 09:08:11 UTC 2025
    - Last Modified: Thu Aug 07 02:41:28 UTC 2025
    - 15.6K bytes
    - Viewed (0)
  3. fess-crawler-opensearch/src/main/java/org/codelibs/fess/crawler/service/impl/AbstractCrawlerService.java

                        builder.addSort(sortBuilder);
                    }
                }
                if (from != null) {
                    builder.setFrom(from);
                }
                if (size != null) {
                    builder.setSize(size);
                }
            });
        }
    
        /**
         * Retrieves a list of documents from the OpenSearch index using a custom search request builder.
         *
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 34.2K bytes
    - Viewed (0)
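
The snippet in result 3 applies `from` and `size` to the search request only when they are non-null. The effect of those two optional parameters can be sketched over an in-memory list as below. `PagingSketch` and `page` are hypothetical names, and treating a null `size` as "no limit" is an assumption made for this sketch; a real search engine applies its own default page size instead.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of fess-crawler-opensearch.
final class PagingSketch {

    /**
     * Returns the window of hits starting at offset 'from' with at most
     * 'size' elements. A null 'from' means offset 0; a null 'size' means
     * "no limit" (an assumption of this sketch).
     */
    static <T> List<T> page(List<T> hits, Integer from, Integer size) {
        int start = from != null ? Math.min(Math.max(from, 0), hits.size()) : 0;
        int end = hits.size();
        if (size != null) {
            // Use long arithmetic so a very large size cannot overflow.
            end = (int) Math.min((long) start + Math.max(size, 0), (long) hits.size());
        }
        return hits.subList(start, end);
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(page(hits, 1, 2));       // prints [b, c]
        System.out.println(page(hits, null, null)); // prints [a, b, c, d, e]
    }
}
```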
  4. src/main/java/org/codelibs/fess/suggest/util/SuggestUtil.java

        }
    
        /**
         * Extracts keywords from the given query string based on the specified fields.
         *
         * @param q the query string to parse and extract keywords from
         * @param fields the fields to consider when extracting keywords
         * @return a list of unique keywords extracted from the query string
         */
    Registered: Fri Sep 19 09:08:11 UTC 2025
    - Last Modified: Mon Sep 01 13:33:03 UTC 2025
    - 17.4K bytes
    - Viewed (1)
  5. fess-crawler/src/main/java/org/codelibs/fess/crawler/entity/ResponseData.java

        /**
         * Gets the set of child URLs discovered from this response.
         *
         * @return the set of child URLs
         */
        public Set<RequestData> getChildUrlSet() {
            return childUrlSet;
        }
    
        /**
         * Creates a RequestData object from this response's URL and method.
         *
         * @return a new RequestData object with the URL and method from this response
         */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 11.6K bytes
    - Viewed (0)
  6. fess-crawler/src/main/java/org/codelibs/fess/crawler/extractor/impl/HtmlXpathExtractor.java

     * It uses XPath expressions to extract text content from HTML documents.
     * <p>
     * This class provides methods to configure the XPath expressions, parser features, and properties.
     * It also includes a caching mechanism for XPathAPI instances to improve performance.
     * </p>
     * <p>
     * The extracted text is obtained from the nodes selected by the {@code targetNodePath} XPath expression.
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10.3K bytes
    - Viewed (0)
  7. fess-crawler/src/main/java/org/codelibs/fess/crawler/transformer/impl/HtmlTransformer.java

            }
            return null;
        }
    
        /**
         * Extracts URLs from HTML tag attributes using XPath.
         *
         * @param url the base URL for resolving relative URLs
         * @param document the document to extract URLs from
         * @param xpath the XPath expression to select elements
         * @param attr the attribute name to extract URLs from
         * @param encoding the character encoding to use
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 28.5K bytes
    - Viewed (0)
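
The Javadoc in result 7 describes extracting URLs from tag attributes with an XPath expression and resolving them against a base URL. A rough sketch of that idea using only the JDK's `javax.xml.xpath` and `java.net.URI` follows. `UrlExtractionSketch` and `extractUrls` are hypothetical names, the input is assumed to be well-formed XHTML without namespaces, and the `encoding` parameter from the Javadoc is omitted; the actual HtmlTransformer code is not shown in the snippet and may differ substantially.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of fess-crawler.
final class UrlExtractionSketch {

    /**
     * Selects elements with the given XPath expression, reads the named
     * attribute from each, and resolves its value against baseUrl.
     * Assumes the markup is well-formed XML without namespaces.
     */
    static List<String> extractUrls(String baseUrl, String xhtml, String xpath, String attr) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, document, XPathConstants.NODESET);
        URI base = URI.create(baseUrl);
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (nodes.item(i) instanceof Element) {
                String value = ((Element) nodes.item(i)).getAttribute(attr);
                if (!value.isEmpty()) { // skip elements without the attribute
                    urls.add(base.resolve(value).toString());
                }
            }
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><body><a href=\"b.html\">b</a><a href=\"/c\">c</a></body></html>";
        System.out.println(extractUrls("http://example.com/dir/page.html", xhtml, "//a", "href"));
    }
}
```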
  8. fess-crawler/src/main/java/org/codelibs/fess/crawler/CrawlerThread.java

     * </p>
     *
     * <p>
     * The crawling process involves the following steps:
     * </p>
     * <ol>
     *   <li>Fetching a URL from the queue using {@link UrlQueueService#poll(String)}.</li>
     *   <li>Checking if the URL is valid using {@link #isValid(UrlQueue)}.</li>
     *   <li>Accessing the content using a {@link CrawlerClient} obtained from {@link CrawlerClientFactory}.</li>
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Thu Aug 07 02:55:08 UTC 2025
    - 20.4K bytes
    - Viewed (0)
  9. fess-crawler/src/main/java/org/codelibs/fess/net/protocol/storage/Handler.java

         * This class handles the authentication, connection management, and data retrieval
         * from storage buckets and objects.
         *
         * <p>
         * The connection extracts bucket and object names from the URL and uses environment
         * variables for authentication and endpoint configuration.
         * </p>
         */
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 10.5K bytes
    - Viewed (0)
  10. fess-crawler/src/main/java/org/codelibs/fess/crawler/client/ftp/FtpClient.java

            return processRequest(uri, true);
        }
    
        /**
         * Processes an FTP request to retrieve data from the specified URI.
         * This method handles the complete FTP request lifecycle including timeout management,
         * connection setup, and data retrieval.
         *
         * @param uri The URI to retrieve data from
         * @param includeContent Whether to include the actual content in the response
    Registered: Sun Sep 21 03:50:09 UTC 2025
    - Last Modified: Sun Jul 06 02:13:03 UTC 2025
    - 39.5K bytes
    - Viewed (0)