New type of harvesting peers #1522
Comments
Are you saying that instead of one big application, we divide the app into independent microservices? If yes, that would be really good: the services would be much easier to maintain separately, with better separation of concerns and easier deployment.
Or we can just add an option to disable pushing.
That would work too, @vibhcool. But going for the above solution would provide more benefits, and even if we disable some features now, we would eventually have to move to it anyway.
@singhpratyush @vibhcool @AnshulMalik Yes, we had discussed something like that some time ago, but did not have the resources to follow up. The pro I see with simply disabling features for peers in order to achieve the desired outcome is that we maintain the same loklak server and only the configuration is different. Alternatively a solution with microservices sounds very good too. How would the setup be in that case? What components would we need? Would we split the loklak server? What path do you guys prefer?
We can break loklak up into a bunch of small, replaceable services: mainly the harvester, the collector, and the search indexer.
The harvester collects tweets and pushes them to the collector. The collector is in charge of keeping the tweets and may also use other methods of gathering tweets (something like P2P would be nice).
And the search indexer can handle all of the Elasticsearch work.
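To make the split above concrete, here is a minimal sketch of how the three pieces could fit together. It is only an illustration: the class names `Harvester` and `Collector`, the `push` method, and the idea of passing the indexer as an optional callable are assumptions for this example, not existing loklak code.

```python
from typing import Callable, Dict, List, Optional

Tweet = Dict[str, str]


class Collector:
    """Keeps tweets; forwards them to an indexer only if one is configured."""

    def __init__(self, indexer: Optional[Callable[[List[Tweet]], None]] = None):
        self.store: List[Tweet] = []
        self.indexer = indexer  # hypothetical hook, e.g. an Elasticsearch writer

    def push(self, tweets: List[Tweet]) -> int:
        self.store.extend(tweets)
        if self.indexer is not None:
            self.indexer(tweets)
        return len(tweets)


class Harvester:
    """Scrapes tweets for a query and pushes them to a collector."""

    def __init__(self, scrape: Callable[[str], List[Tweet]], collector: Collector):
        self.scrape = scrape  # hypothetical scraper function
        self.collector = collector

    def harvest(self, query: str) -> int:
        return self.collector.push(self.scrape(query))
```

With this shape, a lightweight peer would simply create a `Collector()` without an indexer, while a full server would pass one in.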
My point here is to have a type of harvesting peer that doesn't use an ES index and hence takes up fewer resources. The motivation behind microservices is to reduce the complexity of the project. But if the intended microservices are standalone, i.e. the harvester/server/collector can run without the indexer, then such a setup would be nice. Otherwise, I was just referring to something like a
We can take the harvester (scraper) out of loklak, so that we can have any number of lightweight scrapers, plus an additional entity as mentioned by @yukiisbored. For Elasticsearch, I think we already have an option in the config to use another cluster. The loklak server's job is then reduced to serving API requests.
Problem
The loklak server requires a lot of resources to run properly. There are two main reasons for this -
Due to this, we commit a lot of resources to peers that are only meant to collect more data.
Proposed Solution
The idea is to have a loklak wok-like project which can be deployed to the cloud with very low resource utilization.
It would also provide a basic search feature, without any ES index (direct scraping), at a similar endpoint with similar parameters (wherever applicable) -
/api/search.json
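As a rough sketch of what such an index-free endpoint could look like, the WSGI app below answers `/api/search.json` by calling a scraper on every request instead of querying ES. Everything here is an assumption made for illustration: `scrape` is a placeholder that returns canned data, and the `q` and `count` parameters and the `statuses` response key are assumed to mirror the existing search API rather than taken from this proposal.

```python
import json
from urllib.parse import parse_qs


def scrape(query, count):
    """Hypothetical stand-in for the direct scraper; returns placeholder data."""
    return [{"text": f"placeholder result {i} for {query}"} for i in range(count)]


def app(environ, start_response):
    """WSGI app that serves search results by scraping on demand (no ES index)."""
    headers = [("Content-Type", "application/json")]
    if environ.get("PATH_INFO") != "/api/search.json":
        start_response("404 Not Found", headers)
        return [b'{"error": "not found"}']

    params = parse_qs(environ.get("QUERY_STRING", ""))
    query = params.get("q", [""])[0]
    try:
        count = int(params.get("count", ["10"])[0])
    except ValueError:
        count = 10  # fall back to a default on a malformed count

    body = json.dumps({"statuses": scrape(query, count)}).encode("utf-8")
    start_response("200 OK", headers)
    return [body]
```

The app can be served with any WSGI server, for example `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library.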
Advantages
Such a peer will have the following advantages over other wok peers -
Deploying such peers in place of loklak server peers would be more beneficial if one just wishes to collect data.