[HN Gopher] USearch: Smaller and faster single-file vector searc...
       ___________________________________________________________________
        
       USearch: Smaller and faster single-file vector search engine
        
       Author : 0xedb
       Score  : 107 points
       Date   : 2023-07-31 14:25 UTC (8 hours ago)
        
 (HTM) web link (unum-cloud.github.io)
 (TXT) w3m dump (unum-cloud.github.io)
        
       | twelfthnight wrote:
        | Are folks typically using HNSW for vector search these days? I
        | thought maybe ScaNN [1] has proven to be better, especially
        | since it's available in FAISS [2].
       | 
       | [1] https://ai.googleblog.com/2020/07/announcing-scann-
       | efficient... [2]
       | https://github.com/facebookresearch/faiss/wiki/Fast-accumula...
        
         | ashvardanian wrote:
          | Depends... I have a beef with all methods based on "trained
          | quantization". It introduces too much noise into your
          | distribution, suffers from drift, and makes the method mostly
          | inapplicable to other forms of "Similarity Search" that don't
          | strictly fall into the "Vector Search" category.
         | 
          | Many disagree. Pick whatever floats your boat; there is a FOSS
          | library for almost everything these days :)
        
           | twelfthnight wrote:
           | Ah, those are interesting considerations. I don't have a
           | horse in that race, I just had to implement a similarity
           | search algorithm a few years ago and it was surprisingly
           | difficult to find a consensus on what ANN algo to use!
        
         | smeeth wrote:
         | Yeah, SPANN has better f1+queries per second on some
         | benchmarks, but that's a little like comparing sorting
         | algorithms, they're both fast and good.
         | 
          | The database software behind the ANN algo is probably a little
          | more important in practice than the ANN algo itself, unless
          | you're operating at such scale and speed that it's an actual
          | issue (e.g. you're Google).
         | 
         | Differences between algorithms are a little more interesting
         | when they let you do something totally different, like, for
         | example, minimize the speed hit from doing searches on disk
         | (SPTAG, DiskANN).
        
       | KRAKRISMOTT wrote:
       | What's performance like without BLAS acceleration?
        
         | ashvardanian wrote:
         | We don't use BLAS. Why? BLAS helps with matrix-matrix
         | multiplications, if you feel lazy and don't want to write the
         | matrix tiling code manually.
         | 
          | It brings essentially nothing of value to vector-vector
          | operations, as compilers can properly auto-vectorize simple dot
          | products... Moreover, BLAS libraries generally only target
          | single and double precision, while we often prefer half or
          | quarter precision. All in all, a meaningless dependency.
         | 
          | What do we use? I wrote a tiny package called SimSIMD. Its
          | idea is to utilize less common SIMD instructions, especially in
          | mixed-type computations, that are hard for compilers to
          | optimize. It was also a fun exercise to evaluate the
          | performance of new SVE instructions on recent Arm CPUs, like
          | the Graviton 3. You can find the code, the benchmarks, and the
          | results in the repo: https://github.com/ashvardanian/simsimd
         | 
         | Still, even without SimSIMD, USearch seems to be one of the
         | faster implementations of vector search. You can find the
         | benchmarks in the first table here: https://github.com/unum-
         | cloud/usearch#memory-efficiency-down...
        
           | KRAKRISMOTT wrote:
            | The docs recommend compiling for the target machine. Does
            | the pip package compile on install?
        
             | ashvardanian wrote:
              | If you install from the default PyPI repository, it comes
              | precompiled, but it can still be ad-hoc accelerated with
              | JIT-ed metrics. Either way, it should have decent
              | performance. Still, if you wanna push the limits and work
              | with multi-terabyte indexes on one node, recompiling
              | locally should help.
        
       | freediver wrote:
       | I am interested in testing this in production, instead of
       | faiss/mrpt.
       | 
       | > metric='cos', # Choose 'l2sq', 'haversine' or other metric,
       | default = 'ip'
       | 
       | As a note, it is actually 'l2_sq' for the Python example.
       | 
       | > index.add(labels=np.arange(len(vectors)), vectors=vectors)
       | 
        | Adding to the index appears to be very slow. Also, labels are
        | listed as an optional param, but the Python SDK has them as
        | required.
       | 
       | Do you have setup of params for 'brute force' approach (100%
       | accuracy)?
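
  [A 100%-accuracy reference is easy to build outside the library. The
  following is a minimal brute-force baseline in plain NumPy, not
  USearch's API; the function name and shapes are illustrative.]

```python
import numpy as np

def exact_cosine_search(vectors, query, k=10):
    """Exact (100%-recall) cosine search by scanning every vector."""
    # Normalize rows so cosine similarity reduces to a dot product.
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    sims = v @ q                     # cosine similarity to every vector
    top = np.argsort(-sims)[:k]      # indices of the k best matches
    return top, 1.0 - sims[top]      # report distances = 1 - similarity

vectors = np.random.rand(1000, 256).astype(np.float32)
ids, dists = exact_cosine_search(vectors, vectors[0], k=5)
# vectors[0] should be its own nearest neighbour
```

  Comparing an ANN index's top-k against such a scan is also how recall
  is usually measured.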
        
       | moab wrote:
       | Do you have plans to support metadata filtering?
        
         | momothereal wrote:
          | I was going to ask the same. That is a really important
          | feature for replacing traditional indexes, and it is usually
          | poorly implemented in vector search libraries.
         | 
         | For example, filtering by arbitrary time range.
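
  [In the absence of built-in filtering, a common workaround is
  post-filtering: over-fetch candidates, then drop hits outside the
  range. A sketch with hypothetical names — `search_fn` stands in for
  whatever search call the index exposes; nothing here is USearch's
  actual API.]

```python
import numpy as np

def filtered_search(search_fn, timestamps, query, k, t_min, t_max, overfetch=4):
    """search_fn(query, k) -> (keys, distances), sorted by distance.
    Over-fetch candidates, then keep only hits whose timestamp falls
    inside [t_min, t_max]."""
    keys, dists = search_fn(query, k * overfetch)
    mask = (timestamps[keys] >= t_min) & (timestamps[keys] <= t_max)
    return keys[mask][:k], dists[mask][:k]
```

  The drawback is well known: if the filter is very selective, a fixed
  over-fetch factor may still return fewer than k results.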
        
       | eitan-turok wrote:
       | This looks like a great package. Many vector-search engines do
       | not allow you to implement your own custom distance metrics. But
       | Unum does. Love it!
        
         | ashvardanian wrote:
         | Oh, thank you! The library author here :)
         | 
         | We've just hosted one of our first community/contributor calls
         | a few hours ago, discussing the plans for the upcoming 1.0
         | release, and integration with UCall, UStore, and UForm - our
         | other FOSS libraries. Please don't hesitate to reach out for
         | any questions or feature requests - now is the best time :)
        
           | _0ffh wrote:
           | I know it's a triviality, but you've got a typo in "Hardware-
           | agmostic" you may want to fix.
        
             | ashvardanian wrote:
              | Thanks for the suggestion! Details count; nothing is a
              | triviality!
             | 
             | If someone here has free time and C++ experience I am open
             | to recommendations on the codebase and style as well:
             | https://github.com/unum-cloud/usearch/blob/main-
             | dev/include/...
        
               | pilooch wrote:
                | Look at policy-based templating, my go-to for generic
                | AI/search algorithms with many options. May prove
                | useful... Or not :)
        
           | gregw134 wrote:
            | I have a general vector retrieval question, if you have time
           | to humor me. Suppose I have 10 features per document, each
           | with an embedding. Is it possible to retrieve the document
           | with the highest average embedding score across its features?
           | The only approach I can think of is retrieving the top 1k
           | results across each feature to generate a candidate set, then
           | recomputing full scores for each document.
        
             | ashvardanian wrote:
             | e z!
             | 
              | The simplest way with USearch: concatenate the 10
              | embeddings and define a custom metric with Numba that
              | takes the average of 10 dot-products. Done :)
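
  [The math of that metric, sketched in plain NumPy rather than as a
  Numba-compiled callback; dimensions and names are illustrative.]

```python
import numpy as np

D, F = 16, 10   # per-feature embedding dimension, number of features

def avg_dot(a, b, ndim=D, nfeat=F):
    """Average of the per-feature dot products of two concatenated
    embeddings of shape (nfeat * ndim,)."""
    s = np.sum(a.reshape(nfeat, ndim) * b.reshape(nfeat, ndim), axis=1)
    return float(np.mean(s))

query = np.random.rand(F * D).astype(np.float32)
doc = np.random.rand(F * D).astype(np.float32)
score = avg_dot(query, doc)
```

  Note that the unweighted average equals the full inner product divided
  by F, so the ranking matches a plain inner-product metric over the
  concatenated vectors; a custom metric only changes the ranking once
  per-feature weights differ.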
        
               | gregw134 wrote:
               | Cool. What if I want a weighted average of embeddings?
               | Going further, is it possible to adjust the weights at
               | search time?
        
               | ashvardanian wrote:
               | Yes, and yes. The last one may be a bit trickier through
               | Python bindings today, but I can easily include that in
               | the next release... shouldn't take more than 50 LOC.
        
               | gregw134 wrote:
               | (replying here because hn limits thread length)
               | 
               | > Yes, and yes. The last one may be a bit trickier
               | through Python bindings today, but I can easily include
               | that in the next release... shouldn't take more than 50
               | LOC.
               | 
               | Appreciate it. That'd be game-changing for me. The
               | ultimate thing I'd like to do is actually use a function
                | of the form score = a * f1(embedding1) + b *
                | f2(embedding2) + ...
               | 
               | That way you could make adjustments like ignoring
               | feature1 unless its score passes a threshold. I'll try
               | looking at Numba to see if that's possible.
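
  [That kind of gated, weighted combination is straightforward to
  express as a scalar function over concatenated embeddings; here is a
  toy NumPy sketch (names and dimensions are illustrative, and a real
  USearch metric would be a Numba-compiled callback instead).]

```python
import numpy as np

D = 4  # per-feature embedding dimension (toy size)

def composite_score(q, d, weights, thresholds, ndim=D):
    """Weighted sum of per-feature dot products, where a feature is
    ignored unless its own score clears its threshold."""
    s = np.sum(q.reshape(-1, ndim) * d.reshape(-1, ndim), axis=1)
    return float(np.sum(weights * s * (s >= thresholds)))

q = np.ones(8, dtype=np.float32)      # two features of dim 4
d = np.ones(8, dtype=np.float32)      # each per-feature dot = 4.0
weights = np.array([0.75, 0.25])
thresholds = np.array([5.0, 0.0])     # feature 0 is gated out (4.0 < 5.0)
score = composite_score(q, d, weights, thresholds)  # -> 1.0 (0.25 * 4.0)
```

  One caveat with threshold gates: the metric becomes discontinuous,
  which can hurt graph-based indexes like HNSW that assume the distance
  behaves reasonably smoothly.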
        
       ___________________________________________________________________
       (page generated 2023-07-31 23:00 UTC)