Hi there administrators!
My blog in the webring uses links of the form `index.php?post=20220612001238`. However, it looks to me like the crawler doesn't like query arguments in the URL.
Could there be a way for sites to hint that they use `?`- or `#`-separated URLs? As far as I know, MDWiki is quite popular and uses `#!` to specify page links, so this would let us index more pages for those sites as well.
I can see where `#` could pose some problems with title links... I'd suggest allowing a `<meta>` tag, or some other tag in the page head, to hint to the crawler that certain link formats are allowed. If an `href` matches the "allowed link format" regex, the link would be preserved?
Thanks!
Referenced code: lieu/crawler/crawler.go, line 51 in b0ad7dc