canonical - Code Search

src/test/java/org/codelibs/fess/crawler/transformer/FessXpathTransformerTest.java

        data = "<html><head><link rel=\"canonical\" href=\"http://example.com/\"></head><body>aaa</body></html>";
        document = getDocument(data);
        value = transformer.getCanonicalUrl(responseData, document);
        assertEquals("http://example.com/", value);

        data = "<html><head><link rel=\"canonical\" href=\"http://example1.com/\"><link rel=\"canonical\" href=\"http://example2.com/\"></head><body>aaa</body></html>";

Java

- Registered: Mon May 06 08:04:11 GMT 2024

- Last Modified: Thu Feb 22 01:37:57 GMT 2024

- 38.6K bytes

- Viewed (0)

github.com/codelibs/fess

src/main/java/org/codelibs/fess/crawler/transformer/FessXpathTransformer.java

                logger.debug("Invalid Canonical Url(https->http): {} -> {}", url, canonicalUrl);
            }
            return false;
        }
        return true;
    }

    protected void putAdditionalData(final Map<String, Object> dataMap, final ResponseData responseData, final Document document) {
        // canonical
        final String canonicalUrl = getCanonicalUrl(responseData, document);

Java

- Registered: Mon May 06 08:04:11 GMT 2024

- Last Modified: Thu Feb 22 01:37:57 GMT 2024

- 41.9K bytes

- Viewed (0)

github.com/codelibs/jcifs

src/main/java/jcifs/smb/SmbResourceLocatorImpl.java

import jcifs.SmbResourceLocator;
import jcifs.internal.util.StringUtil;
import jcifs.netbios.NbtAddress;
import jcifs.netbios.UniAddress;


/**
 * 
 * 
 * This mainly tracks two locations:
 * - canonical URL path: path component of the URL: this is used to reconstruct URLs to resources and is not adjusted by
 * DFS referrals. (E.g. a resource with a DFS root's parent will still point to the DFS root not the share it's actually

Java

- Registered: Sun May 05 00:10:10 GMT 2024

- Last Modified: Sat Jul 20 08:24:53 GMT 2019

- 23.9K bytes

- Viewed (0)

github.com/codelibs/fess

src/main/resources/fess_config.properties

# html
crawler.document.html.content.xpath=//BODY
crawler.document.html.lang.xpath=//HTML/@lang
crawler.document.html.digest.xpath=//META[@name='description']/@content
crawler.document.html.canonical.xpath=//LINK[@rel='canonical'][1]/@href
crawler.document.html.pruned.tags=noscript,script,style,header,footer,aside,nav,a[rel=nofollow]
crawler.document.html.max.digest.length=120
crawler.document.html.default.lang=

Properties

- Registered: Mon May 06 08:04:11 GMT 2024

- Last Modified: Thu Apr 11 02:34:53 GMT 2024

- 30.6K bytes

- Viewed (1)
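
The `crawler.document.html.canonical.xpath` entry above and the `getCanonicalUrl` assertions in `FessXpathTransformerTest` both involve picking the first `rel="canonical"` link out of a page. The following is a minimal sketch of that XPath lookup using only the JDK's `javax.xml.xpath` API. It is not Fess's implementation. The class name `CanonicalXpathSketch` and the method `firstCanonical` are hypothetical. The sketch assumes well-formed XHTML input, so the `link` tags are self-closed, and it uses a lower-case `link` in the expression because the JDK XML parser keeps element names as written.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Fess: evaluates a canonical-link XPath with the JDK only.
class CanonicalXpathSketch {

    // Lower-case variant of the configured expression (assumption: the input is
    // well-formed XHTML parsed by the JDK XML parser, which preserves element case).
    static final String CANONICAL_XPATH = "//link[@rel='canonical'][1]/@href";

    static String firstCanonical(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // evaluate(String, Object) converts the result to a string: the value of the
        // first matching node in document order, or "" when nothing matches.
        return xpath.evaluate(CANONICAL_XPATH, doc);
    }

    public static void main(String[] args) throws Exception {
        String one = "<html><head><link rel=\"canonical\" href=\"http://example.com/\"/>"
                + "</head><body>aaa</body></html>";
        String two = "<html><head><link rel=\"canonical\" href=\"http://example1.com/\"/>"
                + "<link rel=\"canonical\" href=\"http://example2.com/\"/>"
                + "</head><body>aaa</body></html>";
        System.out.println(firstCanonical(one)); // http://example.com/
        System.out.println(firstCanonical(two)); // http://example1.com/
    }
}
```

With two canonical links under the same `head`, the `[1]` predicate keeps only the first one, so the second `href` is never returned. A page without any canonical link yields an empty string in this sketch. How Fess itself handles that case is not shown in the excerpts above.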
