Results 1 - 2 of 2 for htmlExtractor (0.7 sec)
CLAUDE.md
### Key Extractors

`TikaExtractor` (1000+ formats), `PdfExtractor`, `MsWordExtractor`, `MsExcelExtractor`, `MsPowerPointExtractor`, `ZipExtractor`, `HtmlExtractor`, etc.

**Registration**:

```java
extractorFactory.addExtractor("text/html", htmlExtractor, 2); // Weight 2
extractorFactory.addExtractor("text/html", tikaExtractor, 1); // Fallback
```

### Helpers
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Fri Nov 28 17:31:34 UTC 2025 - 10.7K bytes
fess-crawler-lasta/src/main/resources/crawler/extractor.xml
class="org.codelibs.fess.crawler.extractor.impl.LhaExtractor" />
<component name="textExtractor" class="org.codelibs.fess.crawler.extractor.impl.TextExtractor" />
<component name="htmlExtractor" class="org.codelibs.fess.crawler.extractor.impl.HtmlExtractor">
    <property name="featureMap">
        <component class="java.util.LinkedHashMap">
            <postConstruct name="put">
                <arg>"http://xml.org/sax/features/namespaces"</arg>
Registered: Sat Dec 20 11:21:39 UTC 2025 - Last Modified: Sun Nov 23 03:46:53 UTC 2025 - 50.1K bytes