Chat Transcript Crawling No Longer Possible #25
Comments
I've noticed that, for some time now, searching for Imgur no longer returns URLs from the chat log, which is a pity. I suspect that Stack Overflow's chat indexing now skips certain parts to keep the index small. Perhaps it's worth reporting this as a problem on SO? Personally, I think it's a pity that searching for URLs is no longer possible. Is the reason perhaps similar to why keyword searches no longer work?
I investigated this with our chat devs and we have determined that we never supported tag searches in chat specifically. This bot worked by happenstance and the search query that it reads has now been polluted with mentions of the tags that aren't actually tags. If you'd like a tag search feature in chat, I'd suggest posting a meta feature request.
If the response is "no" to this I will be removing chat support in the next release. We can always cherry-pick the code out from the last commit and re-add it later if they decide on supporting tag searching.
You can always make a crawler that navigates through transcript pages...
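A rough idea of what such a crawler could look like is sketched below. This is only an illustration, not the project's code: `fetch_page` is a hypothetical callback that returns the message texts of one transcript page (and an empty list when there are no more pages), the tag names are the ones listed in the issue, and the 10-second delay is just the throttle figure mentioned in this thread.

```python
import re
import time
from typing import Callable, List

# Request tags as listed in the issue report.
REQUEST_TAGS = (
    "cv-pls", "delv-pls", "flag-pls", "reopen-pls",
    "ro-pls", "rov-pls", "review-pls", "rv-pls",
)
TAG_PATTERN = re.compile(r"\b(?:" + "|".join(map(re.escape, REQUEST_TAGS)) + r")\b")


def crawl_transcript(
    fetch_page: Callable[[int], List[str]],  # hypothetical page fetcher
    wanted: int = 250,
    max_pages: int = 50,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Walk transcript pages in order and keep messages that mention a request tag."""
    found: List[str] = []
    for page_number in range(1, max_pages + 1):
        if page_number > 1:
            sleep(delay_seconds)  # wait between requests to respect the throttle
        messages = fetch_page(page_number)
        if not messages:
            break  # assumed convention: an empty page means the transcript ended
        found.extend(m for m in messages if TAG_PATTERN.search(m))
        if len(found) >= wanted:
            break
    return found[:wanted]
```

Passing `sleep` in as a parameter makes it possible to try the loop without actually waiting. The filter is deliberately naive: it also matches messages that merely talk about a tag, which is the same pollution problem described above.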
@WesNetmo With the time limit imposed on querying each transcript page, and very little control over filtering out junk messages, crawling the pages can take a looong time. The current dev version supports setting the number of entries to find and process, which defaults to 250. Say that with the current search abilities we get 10 legit results per page: it would still take 25 page requests, each with its ~10 sec throttle, and this has to be done every 15 min by default. Each cache update would then take over 4 min. Previously we only needed to access 3-4 pages.
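The arithmetic behind that estimate can be checked with a few lines of Python. The function name and parameters are made up for illustration; the default values are the figures quoted in the comment above (250 entries, about 10 legit results per page, roughly 10 seconds of throttle per request).

```python
import math


def estimate_update_seconds(
    entries_wanted: int = 250,
    hits_per_page: int = 10,
    throttle_seconds: float = 10.0,
) -> tuple:
    """Return (page_requests, total_seconds) for one cache update."""
    # Each page yields at most hits_per_page useful entries, so round up.
    page_requests = math.ceil(entries_wanted / hits_per_page)
    return page_requests, page_requests * throttle_seconds


pages, seconds = estimate_update_seconds()
print(pages, seconds)
```

With the defaults this gives 25 page requests and 250 seconds, a little over 4 minutes out of every 15-minute refresh interval. The same formula with 3 to 4 pages gives 30 to 40 seconds.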
Search keywords like
tag:cv-pls tag:delv-pls tag:flag-pls tag:reopen-pls tag:ro-pls tag:rov-pls tag:review-pls tag:rv-pls
can no longer be searched via the chat transcript search form. I don't know when this change happened. I can't find a reliable way to search the transcript for any close vote tags. If this does not get fixed, or no one has a workaround, then chat support will have to be dropped for the backlog.