Results 1 - 10 of 17 for extraction (0.08 seconds)

  1. CLAUDE.md

    **Fess Crawler** is a Java-based web crawling framework for enterprise content extraction.
    
    ### Essential Info
    
    - **Language**: Java 21+
    - **Build**: Maven 3.x
    - **License**: Apache 2.0
    - **DI**: LastaFlute DI
    - **Repo**: https://github.com/codelibs/fess-crawler
    
    ### Tech Stack
    
    - **HTTP**: Apache HttpComponents 4.5+ and 5.x (switchable)
    - **Extraction**: Apache Tika, POI, PDFBox
    - Created: Sun Apr 12 03:50:13 GMT 2026
    - Last Modified: Thu Mar 12 03:39:20 GMT 2026
    - 8.1K bytes
  2. src/main/java/org/codelibs/fess/crawler/transformer/FessXpathTransformer.java

            }
            return new URL(currentUrl);
        }
    
        /**
         * Gets child URL extraction rules from configuration.
         *
         * @param responseData the response data from crawling
         * @param resultData the result data
         * @return stream of tag-attribute pairs for URL extraction
         */
        @Override
    - Created: Tue Mar 31 13:07:34 GMT 2026
    - Last Modified: Thu Mar 12 01:46:45 GMT 2026
    - 55.3K bytes
  3. src/main/java/org/codelibs/fess/helper/DocumentHelper.java

    import org.codelibs.fess.crawler.exception.CrawlerSystemException;
    import org.codelibs.fess.crawler.exception.CrawlingAccessException;
    import org.codelibs.fess.crawler.extractor.Extractor;
    import org.codelibs.fess.crawler.extractor.impl.TikaExtractor;
    import org.codelibs.fess.crawler.processor.ResponseProcessor;
    import org.codelibs.fess.crawler.processor.impl.DefaultResponseProcessor;
    import org.codelibs.fess.crawler.rule.Rule;
    - Created: Tue Mar 31 13:07:34 GMT 2026
    - Last Modified: Mon Mar 30 14:27:04 GMT 2026
    - 17.4K bytes
  4. android/guava/src/com/google/common/collect/FluentIterable.java

     *
     * <ul>
     *   <li>chaining methods which return a new {@code FluentIterable} based in some way on the
     *       contents of the current one (for example {@link #transform})
     *   <li>element extraction methods which facilitate the retrieval of certain elements (for example
     *       {@link #last})
     *   <li>query methods which answer questions about the {@code FluentIterable}'s contents (for
     *       example {@link #anyMatch})
    - Created: Fri Apr 03 12:43:13 GMT 2026
    - Last Modified: Thu Apr 02 14:49:41 GMT 2026
    - 34.7K bytes
  5. guava/src/com/google/common/collect/FluentIterable.java

     *
     * <ul>
     *   <li>chaining methods which return a new {@code FluentIterable} based in some way on the
     *       contents of the current one (for example {@link #transform})
     *   <li>element extraction methods which facilitate the retrieval of certain elements (for example
     *       {@link #last})
     *   <li>query methods which answer questions about the {@code FluentIterable}'s contents (for
     *       example {@link #anyMatch})
    - Created: Fri Apr 03 12:43:13 GMT 2026
    - Last Modified: Thu Apr 02 14:49:41 GMT 2026
    - 34.7K bytes
  6. src/main/java/org/codelibs/fess/llm/AbstractLlmClient.java

                }
                return extractJsonStringFallback(json, key);
            }
            return "";
        }
    
        /**
         * Fallback regex-based extraction for string values.
         *
         * @param json the JSON response
         * @param key the key to extract
         * @return the extracted string value
         */
    - Created: Tue Mar 31 13:07:34 GMT 2026
    - Last Modified: Sat Mar 21 06:04:58 GMT 2026
    - 72K bytes
  7. docs/en/docs/release-notes.md

    * Fix broken link in docs about OAuth 2.0 with scopes. PR [#275](https://github.com/tiangolo/fastapi/pull/275) by [@dmontagu](https://github.com/dmontagu).
    
    * Refactor param extraction using Pydantic `Field`:
        * Large refactor, improvement, and simplification of param extraction from *path operations*.
    - Created: Sun Apr 05 07:19:11 GMT 2026
    - Last Modified: Fri Apr 03 12:07:04 GMT 2026
    - 631K bytes
  8. .teamcity/test-buckets.json

          {
            "subprojects": [
              "concurrent",
              "daemon-protocol",
              "daemon-server-worker",
              "functional",
              "internal-instrumentation-api",
              "java-api-extractor",
              "java-compiler-plugin",
              "javadoc",
              "kotlin-dsl-integ-tests",
              "kotlin-dsl-plugins",
              "normalization-java"
            ],
            "parallelizationMethod": {
    - Created: Wed Apr 01 11:36:16 GMT 2026
    - Last Modified: Mon Mar 23 18:38:15 GMT 2026
    - 118.6K bytes
  9. src/main/java/org/codelibs/fess/util/ComponentUtil.java

         */
        public static IntervalControlHelper getIntervalControlHelper() {
            return getComponent(INTERVAL_CONTROL_HELPER);
        }
    
        /**
         * Gets the extractor factory component.
         * @return The extractor factory.
         */
        public static ExtractorFactory getExtractorFactory() {
            return getComponent(EXTRACTOR_FACTORY);
        }
    
        /**
         * Gets a job executor by name.
    - Created: Tue Mar 31 13:07:34 GMT 2026
    - Last Modified: Sat Mar 28 06:59:19 GMT 2026
    - 30.9K bytes
  10. .teamcity/subprojects.json

        "path": "platforms/jvm/jacoco-workers",
        "unitTests": false,
        "functionalTests": false,
        "crossVersionTests": false
      },
      {
        "name": "java-api-extractor",
        "path": "platforms/core-configuration/java-api-extractor",
        "unitTests": true,
        "functionalTests": false,
        "crossVersionTests": false
      },
      {
        "name": "java-compiler-plugin",
    - Created: Wed Apr 01 11:36:16 GMT 2026
    - Last Modified: Fri Mar 27 15:03:00 GMT 2026
    - 42K bytes