Is there a good indexing / search engine for Node.js?

Javascript, node.js, Lucene, Indexing, Search Engine

Javascript Problem Overview


I'm looking for a good open source (with LGPL or a permissive license) indexing engine for a node.js application, something like Lucene. I'm looking for in-process indexing and search and am not interested in indexing servers like Sphinx or Solr.

I am not afraid to create bindings for a C/C++ library either, so I'm open to those kinds of suggestions as well.

So far I've found:

  • node-clucene which doesn't seem to be actively maintained anymore (and has several open issues)
  • I could create my own binding for CLucene, but it seems to be only sparsely maintained and its current version is well behind Java Lucene
  • Apache Lucy which seems to be designed for the purpose of creating bindings for dynamic languages, but so far they don't have node bindings (nor a C API) and I haven't found any docs about creating bindings. I also didn't find any benchmarks about its performance.
  • node-search which seems to be abandoned
  • jsii which seems to be still a prototype and is also abandoned
  • fullproof which is only intended to run in a web browser
  • lunr.js which seems to only allow serializing the whole index, so it isn't scalable

I could "roll my own", but I'd prefer to use an already existing solution.

EDIT: Why I'm not interested in a standalone index server: I use a fast in-process key-value store database, so it'd be quite a waste having to go out of process for querying.
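To make the "roll my own" option concrete, here is a minimal sketch of an in-process inverted index that keeps one key per term, so adding a document only touches that document's own terms instead of re-serializing the whole index. Everything here is an assumption for illustration: `TinyIndex`, `tokenize` and the `term:` key prefix are hypothetical names, and a plain `Map` stands in for the key-value store (a real store would probably have an asynchronous get/put API). There is no stemming, ranking or deletion.

```javascript
// A plain Map stands in for an in-process key-value store (assumption).

function tokenize(text) {
  // Lowercase and split on anything that is not an ASCII letter or digit.
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

class TinyIndex {
  constructor(store = new Map()) {
    this.store = store;
  }

  // Add a document: only the posting lists of its own terms are updated.
  add(id, text) {
    for (const term of new Set(tokenize(text))) {
      const key = 'term:' + term;
      const postings = this.store.get(key) || [];
      if (!postings.includes(id)) postings.push(id);
      this.store.set(key, postings);
    }
  }

  // AND query: ids of documents that contain every query term.
  search(query) {
    const terms = [...new Set(tokenize(query))];
    if (terms.length === 0) return [];
    const lists = terms.map((t) => this.store.get('term:' + t) || []);
    // Start from the shortest posting list to keep the intersection cheap.
    lists.sort((a, b) => a.length - b.length);
    const [first, ...rest] = lists;
    return rest.reduce(
      (acc, list) => acc.filter((id) => list.includes(id)),
      [...first]
    );
  }
}

const index = new TinyIndex();
index.add('doc1', 'Lucene is a search engine library');
index.add('doc2', 'LevelDB is an in-process key-value store');
index.add('doc3', 'An in-process search index for Node.js');

console.log(index.search('in-process search'));
```

With these three sample documents, only `doc3` contains all of "in", "process" and "search", so that is the single hit. Existing engines add the hard parts on top of this (tokenization rules, scoring, compaction), which is why a ready-made solution is still preferable.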

Javascript Solutions


Solution 1 - Javascript

Just an update to my earlier answer - since there was so much discussion I didn't want this update to get lost.

You can download it here:

Solution 2 - Javascript

Yes, check out the newly released Norch.

Norch is based on the search-index module for node.js, which is in turn built on LevelDB, Google's embedded key-value store.

EDIT: Use the search-index module for fast "in-process" search capability.

Solution 3 - Javascript

Can you explain why you're not interested in using an external index? For full text search I always fall back on PostgreSQL's full text indexing capabilities - it's very fast, indexing doesn't require a full index update (like Solr does), and results are returned faster than with Lucene-based solutions (such as Elasticsearch).

But if you really want to do it in-process, you probably want to look at Lunr: http://lunrjs.com/ - it does work in Node, not just in the browser.

Edit: Here's where I got my stats on Postgres being faster than Lucene: http://fr.slideshare.net/billkarwin/full-text-search-in-postgresql - see Slide 49.

Edit: Not sure what kind of speed you're looking at for in/out of process, but our PostgreSQL database can do 100k queries per second without breaking a sweat, and it's not even on SSDs. Perhaps you're over-thinking your performance needs - after all, once you need to go to multiple nodes (or use cluster to take advantage of all CPUs) you will have to give up in-process search anyway.

Solution 4 - Javascript

Full Text Search Light is a node module written in pure JS for doing full text searches. You can find the current git repository here: https://github.com/frankred/node-full-text-search-light

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Venemo | View Question on Stackoverflow
Solution 1 - Javascript | Fergie | View Answer on Stackoverflow
Solution 2 - Javascript | Fergie | View Answer on Stackoverflow
Solution 3 - Javascript | Matt Sergeant | View Answer on Stackoverflow
Solution 4 - Javascript | Frank Roth | View Answer on Stackoverflow