4

JavaScript 上的 fuzzy search library

 1 year ago
source link: https://blog.gslin.org/archives/2022/10/02/10894/javascript-%e4%b8%8a%e7%9a%84-fuzzy-search-library/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

JavaScript 上的 fuzzy search library

Hacker News Daily 上看到 Show HN (作者自己或是主要的 contributor 上來發表的作品) 給了一個號稱速度很快,吃資源很少的 fuzzy search library:「Show HN: uFuzzy.js – A tiny, efficient fuzzy search that doesn't suck (github.com/leeoniya)」。

這種已經發展許久,但突然有一天有人說他的東西超好超棒棒的,除非是有新的基礎演算法突破,不然馬上就會想到很經典的「Three circles model」,中間的那些區塊就懶的畫上去了:

mc1VLjh.png

依照他的「測試」,可以看到他宣稱完全領先的狀態:

fh30DEU.png

但回過頭來看評論:

Thank you for this!

I am also quite frustrated with the current state of full text search in the javascript world. All libs I've tried miss the most basic examples and their community seems to ignore it. Will give yours a try but it already looks much better from the comparison page.

Edit: Nope, your lib doesn't seem to handle substitution well (THE most common type of typo), so yep, we are back to square one ...

From fuzzy search I expected that entering "super meet boy" or "super maet boy" will return "Super Meat Boy" but unfortunately currently it doesn't work this way and it's quite disappointing.

https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uF...

看起來這個 library 沒有辦法解決 fuzzy search 最常見的 case (小 typo),依照範例描述的更像是 substring 搜尋加上一些額外的的功能,反而比較像是 auto completion library,或是講的比較廣一點,可以算是 auto suggestion library。

不過我覺得真正的重點 (對我來說的重點) 是下面的比較表格,因為列出了目前市場上的方案,這份清單之後可以拿來參考...

Related

用 IPFS 放 Status Page

在 Hacker News Daily 上看到的專案,拿 IPFS 來放 Status Page 真的超適合的,靜態為主,然後又可以避開 SPoF:「Decentralized StatusPage」。程式碼放在 GitHub 的 paulogr/dstatuspage,以 JavaScript 撰寫。 在 Hacker News 上的討論也蠻有趣的:「Show HN: A Decentralized StatusPage on IPFS | Hacker News」。 IPFS 提供的特性與 Status Page 需要的特性其實還蠻契合的,也許不是 killer application,但應該會是個 IPFS 的出發點... 而且最近 Firefox 在他們的 blog 上發表了 Firefox 59 會將一些分散式協定加進去 (不是實做,而是提供接口):「Extensions in Firefox…

February 2, 2018

In "Browser"

IndexTank 的設計

IndexTank 是在 xdite 的個人板上看到的網站,號稱真正 Scalable 的 Search Engine,看了他的架構設計後看起來應該是真的 Scalable。 由於是屬於 SaaS 服務,對於 startup 不想自己做 search engine 的可以直接套上去。而對於技術人員真正有價值的是他的 API 設計文件中的 function definition syntax:「function definition syntax」,雖然故意寫個 coming soon,但實際上可以在 client library 裡看到範例:「Python client」。 先內建一些基本的數學函數,像是四則運算、power、log,並且內建一些很常用到的變數。接下來定義出來的函數可以再重複使用,不斷累積上去,最後在 query 的時候就可以 ORDER BY 某個 score... IndexTank 告訴你「利用 API 讓前端程式設定 function 以降低 denormalization 的複雜度」時要怎麼設計 API,當你自己建立 search engine 時,新增的 function…

October 15, 2010

In "Cloud"

搜尋影片的串流平台

在 Hacker News Daily 上看到「Show HN: API to query catalogs of 20 streaming services across 60 countries (movieofthenight.com)」這個,但這個服務反而不是重點,有許多人發現裡面錯誤率頗高,而且也沒有台灣的資料,反倒是裡面有人提到 JustWatch 這個服務看起來比較好用... 像是「Friends」(這邊用的是中國的翻譯片名) 可以看到在台灣是在 Netflix 上,美國的話則是在 HBO Max (串流) 與 Apple TV (購買) 上可以看到。不過查 MythBusters 在兩個平台上都沒看到資料... 但整體上來說 JustWatch 搜出來的品質還是好不少...

January 13, 2022

In "API"

a611ee8db44c8d03a20edf0bf5a71d80?s=49&d=identicon&r=gAuthor Gea-Suan LinPosted on October 2, 2022Categories Computer, Library, Murmuring, Programming, Search Engine, SoftwareTags fuzzy, javascript, library, search, string

Leave a Reply

Your email address will not be published. Required fields are marked *

Comment *

Name *

Email *

Website

Notify me of follow-up comments by email.

Notify me of new posts by email.

To respond on your own website, enter the URL of your response which should contain a link to this post's permalink URL. Your response will then appear (possibly after moderation) on this page. Want to update or remove your response? Update or delete your post and re-enter your post's URL again. (Learn More)

Post navigation


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK