One of the most common things we do on the internet is search for something, and everyone has their own "strategy" for it. In my case, it's:
- Search on Google :))
- Check the first page of results for an answer; if there isn't one, move to the second page.
- Check the second page of results.
- If I still haven't found a useful answer on the second page of Google's results, move to Bing and repeat the same steps.
- If I still haven't found it, move to another search engine such as Yandex, Ask, DuckDuckGo, ...
Of course, everyone (perhaps depending on region) has their own favorite search engine. It's your favorite because it usually works well and suits your tasks. But in some cases it may not: because it's a search engine for everybody, what it returns is sometimes not the result you want but the most common one.
For example, I tried to search for yasuhiro matsumoto (@mattn, one of my idols on GitHub), but the top result was yukihiro matsumoto (Ruby's creator).
When I tried Bing, the results were worse.
The results from Ask seemed better, but still did not put what I wanted in first place.
So I added a simple filter: pull the results from all the search engines and merge them by a rule I define (for example, keep all cross-matched URLs; a URL that appears in the results of two or more search engines is treated as trusted).
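To make the cross-matching rule concrete, here is a minimal sketch in Go. It is not Boogeyman's actual code: the `crossMatch` function, the engine names, and the example URLs are all made up for illustration, and it assumes each engine's results have already been reduced to a plain list of URLs.

```go
package main

import (
	"fmt"
	"sort"
)

// crossMatch returns the URLs that appear in the results of at least
// minEngines different search engines. The input maps a (hypothetical)
// engine name to the URLs it returned.
func crossMatch(results map[string][]string, minEngines int) []string {
	counts := make(map[string]int)
	for _, urls := range results {
		seen := make(map[string]bool) // count each URL only once per engine
		for _, u := range urls {
			if !seen[u] {
				seen[u] = true
				counts[u]++
			}
		}
	}

	var matched []string
	for u, n := range counts {
		if n >= minEngines {
			matched = append(matched, u)
		}
	}
	sort.Strings(matched) // map iteration order is random, so sort for stable output
	return matched
}

func main() {
	// Made-up results for illustration only.
	results := map[string][]string{
		"google": {"https://github.com/mattn", "https://example.com/a"},
		"bing":   {"https://example.com/b", "https://github.com/mattn"},
		"ask":    {"https://example.com/a", "https://example.com/c"},
	}
	fmt.Println(crossMatch(results, 2))
}
```

Sorting alphabetically throws away each engine's own ranking; a real ranker would probably want to keep some notion of position, which this sketch leaves out.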
I created a simple program that implements the idea above. It's called Boogeyman, a console application (I'm a fan of CLI tools). The source code is on GitHub and a binary executable is available here.
All strategies are defined in domain/ranker.go. For now, there are 3 strategies:
- Top: Returns the top result from each search engine.
- Cross Matching: Returns results that match across multiple search engines (appearing in 2 or more of them).
- All (with limit 20): Returns all URLs from the search engines.
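As a rough illustration of the two simpler strategies, Top and All, here is a short Go sketch. It is not taken from domain/ranker.go: the `engineResult` type and the `topResults` and `allResults` functions are hypothetical, and it assumes the limit of 20 is a cap on the total number of URLs returned rather than a per-engine cap.

```go
package main

import "fmt"

// engineResult is a hypothetical stand-in for one search engine's output.
type engineResult struct {
	Engine string
	URLs   []string // best match first
}

// topResults returns the first URL from every engine that returned anything.
func topResults(results []engineResult) []string {
	var out []string
	for _, r := range results {
		if len(r.URLs) > 0 {
			out = append(out, r.URLs[0])
		}
	}
	return out
}

// allResults concatenates every engine's URLs in order and stops once
// limit entries have been collected (assumed to be a total cap).
func allResults(results []engineResult, limit int) []string {
	var out []string
	for _, r := range results {
		for _, u := range r.URLs {
			if len(out) >= limit {
				return out
			}
			out = append(out, u)
		}
	}
	return out
}

func main() {
	// Made-up results for illustration only.
	results := []engineResult{
		{Engine: "google", URLs: []string{"https://example.com/g1", "https://example.com/g2"}},
		{Engine: "bing", URLs: []string{"https://example.com/b1"}},
	}
	fmt.Println(topResults(results))
	fmt.Println(allResults(results, 20))
}
```

This version does not remove duplicates, so a URL returned by two engines shows up twice in the All output.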
Feel free to implement your own search strategy, or just add a new collector to gather results from your favorite search engine.
The new collector must implement the Collector interface:
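// file: adapter/persistent/service/collector.go
type Collector interface {
	// GetSearchEngineType reports which search engine this collector talks to.
	GetSearchEngineType() domain.SearchEngineType
	// Query searches for the keyword and returns that engine's results.
	Query(keyword domain.Keyword) (*domain.SearchEngine, error)
}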
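To show what implementing this interface might look like, here is a self-contained sketch. Everything in it besides the two method names is an assumption: the `SearchEngineType`, `Keyword`, and `SearchEngine` types are simplified stand-ins for the real ones in the domain package (whose fields may differ), and `staticCollector` is a hypothetical collector that serves canned results instead of calling a real search engine.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the domain package types; the real
// definitions in the Boogeyman repository may look different.
type SearchEngineType string
type Keyword string
type SearchEngine struct {
	Type SearchEngineType
	URLs []string
}

// Collector mirrors the interface from the article, minus the package prefix.
type Collector interface {
	GetSearchEngineType() SearchEngineType
	Query(keyword Keyword) (*SearchEngine, error)
}

// staticCollector is a hypothetical collector backed by canned results.
type staticCollector struct {
	engine SearchEngineType
	canned map[Keyword][]string
}

func (c staticCollector) GetSearchEngineType() SearchEngineType {
	return c.engine
}

func (c staticCollector) Query(keyword Keyword) (*SearchEngine, error) {
	if keyword == "" {
		return nil, errors.New("empty keyword")
	}
	return &SearchEngine{Type: c.engine, URLs: c.canned[keyword]}, nil
}

// Compile-time check that staticCollector satisfies Collector.
var _ Collector = staticCollector{}

func main() {
	var c Collector = staticCollector{
		engine: "fake",
		canned: map[Keyword][]string{
			"yasuhiro matsumoto": {"https://github.com/mattn"},
		},
	}
	result, err := c.Query("yasuhiro matsumoto")
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	fmt.Println(c.GetSearchEngineType(), result.URLs)
}
```

A real collector would replace the canned map with an HTTP request to the search engine and some parsing of the response; a stub like this one is mostly useful for testing ranking strategies without network access.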
Below is the result of searching for the keyword yasuhiro matsumoto on Boogeyman with the cross-matching strategy.
Have fun with Boogeyman :))
Top comments (3)
I would have queried "yasuhiro matsumoto github" in the first place. My thinking would be that there are many of them and I need to give google a little nudge.
Yep, I don't blame Google Search, since it's still my favorite search engine. This is just a way to add an extra filter to get what I want faster. Searching with effective keywords is another one.