Using the Qdrant Vector database for Semantic Search in Laravel

When you want search results that are relevant to the meaning of a query or question, rather than the exact words or terms used, you need semantic search.

Semantic search requires the "meaning" to be somehow numerically represented in the data that you are searching. This is accomplished using embeddings, which are vector representations of tokens (words, sub-word pieces, characters, etc.).
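To build a little intuition for how that works, here's a minimal sketch comparing vectors by cosine similarity — the same distance metric we'll configure in Qdrant later. The three-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```php
// Cosine similarity: ~1.0 means "pointing the same way" (similar meaning),
// ~0.0 means unrelated.
function cosineSimilarity(array $a, array $b): float
{
    $dot = $normA = $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy vectors, invented for illustration
$dog   = [0.9, 0.1, 0.0];
$puppy = [0.8, 0.2, 0.0];
$car   = [0.0, 0.1, 0.9];

echo cosineSimilarity($dog, $puppy); // ~0.99 — semantically close
echo cosineSimilarity($dog, $car);   // ~0.01 — unrelated
```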


The embedding data needs to be stored somewhere before you can search it. There are quite a few options out there by now, but generally what you're looking for is a vector database: it stores, indexes, and enables fast similarity lookups over embeddings.

What if you didn't want to use a hosted, managed provider to store your embeddings, and instead wanted to keep them in your own database? Well, you're in luck!

Here, I'm going to show you how to use Qdrant in Laravel for semantic search. Qdrant is an open-source vector database that is easy to use and can be self-hosted as well as managed by Qdrant Cloud.

First, let's install a Composer package that makes it easy to use Qdrant in a Laravel app.

composer require hkulekci/qdrant

Next, let's run Qdrant locally using Docker.

I'm going to assume that you have Docker installed on your system. If not, I highly recommend OrbStack rather than Docker's own app: it's faster, lighter, and easier to use.

docker pull qdrant/qdrant
docker run -p 6333:6333 qdrant/qdrant
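Before wiring anything up in Laravel, you can sanity-check that Qdrant is listening on port 6333 via its REST API:

```shell
# The root endpoint returns a small JSON payload with the Qdrant version
curl http://localhost:6333

# Lists existing collections (empty at this point)
curl http://localhost:6333/collections
```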

To demonstrate embeddings + semantic search in a more real-world-like scenario, we'll use OpenAI's text-embedding-ada-002 model, which produces 1,536-dimensional vectors. Add the OpenAI Laravel package if you haven't already.

composer require openai-php/laravel
        
php artisan vendor:publish --provider="OpenAI\Laravel\ServiceProvider"

And add your OpenAI API key to your .env file.

# .env
        
OPENAI_API_KEY="sk-..."

# https://platform.openai.com/account/org-settings
OPENAI_ORGANIZATION="org-..."

Now that the setup is out of the way, let's create a couple of simple console commands to set up our vector collection and populate it with embeddings.

// In routes/console.php

use Illuminate\Support\Facades\Artisan;
use OpenAI\Laravel\Facades\OpenAI;
use Qdrant\Http\GuzzleClient;
use Qdrant\Models\PointsStruct;
use Qdrant\Models\PointStruct;
use Qdrant\Models\Request\CreateCollection;
use Qdrant\Models\Request\VectorParams;
use Qdrant\Models\VectorStruct;
use Qdrant\Qdrant;

Artisan::command('setup', function() {
    $config = new \Qdrant\Config("http://localhost", 6333);
    $config->setApiKey('your-api-key-but-anything-on-localhost');

    $client = new Qdrant(new GuzzleClient($config));

    $createCollection = new CreateCollection();
    $createCollection->addVector(new VectorParams(1536, VectorParams::DISTANCE_COSINE), 'saying');
    $response = $client->collections('sayings')->create($createCollection);
});

Artisan::command('insert', function() {

    $config = new \Qdrant\Config("http://localhost", 6333);
    $config->setApiKey('your-api-key-but-anything-on-localhost');

    $client = new Qdrant(new GuzzleClient($config));

    $sayings = [
        [
            // Qdrant point IDs must be unsigned integers or UUID strings,
            // e.g. \Ramsey\Uuid\Uuid::uuid4()->toString()
            'id' => 1,
            'value' => 'Canines say woof',
        ],
        [
            'id' => 2,
            'value' => 'Felines say meow',
        ],
        [
            'id' => 3,
            'value' => 'Birds say tweet',
        ],
        [
            'id' => 4,
            'value' => 'Humans say hello',
        ],
    ];

    $result = OpenAI::embeddings()->create([
        'model' => 'text-embedding-ada-002',
        'input' => collect($sayings)->pluck('value')->toArray()
    ]);

    $points = new PointsStruct();
    foreach ($sayings as $key => $saying) {
        $points->addPoint(
            new PointStruct(
                $saying['id'],
                new VectorStruct($result->embeddings[$key]->embedding, 'saying'),
                [
                    // Store the text in the payload so search results can display it
                    'value' => $saying['value'],
                    'meta' => 'anything'
                ]
            )
        );
    }
    $client->collections('sayings')->points()->upsert($points, ['wait' => 'true']);
});

Now let's run both:

php artisan setup
php artisan insert

In the terminal window where you ran the Docker command, you should see some output like this:

2023-10-13T09:03:49.680343Z  INFO actix_web::middleware::logger: 192.168.215.1 "PUT /collections/sayings HTTP/1.1" 200 48 "-" "GuzzleHttp/7" 0.078868
2023-10-13T09:10:08.231593Z  INFO actix_web::middleware::logger: 192.168.215.1 "PUT /collections/sayings/points?wait=true HTTP/1.1" 200 83 "-" "GuzzleHttp/7" 0.010576
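You can also confirm the points landed by asking Qdrant's REST API for the collection info, which includes a points count:

```shell
# The JSON response includes a points_count field,
# which should now read 4
curl http://localhost:6333/collections/sayings
```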

And now let's run a query! Back in your console routes file:

// In routes/console.php

use Qdrant\Models\Filter\Condition\MatchString;
use Qdrant\Models\Filter\Filter;
use Qdrant\Models\Request\SearchRequest;

Artisan::command('search', function() {

    $config = new \Qdrant\Config("http://localhost", 6333);
    $config->setApiKey('your-api-key-but-anything-on-localhost');

    $client = new Qdrant(new GuzzleClient($config));

    $result = OpenAI::embeddings()->create([
        'model' => 'text-embedding-ada-002',
        'input' => 'What do dogs say?',
    ]);

    $searchRequest = (new SearchRequest(new VectorStruct($result->embeddings[0]->embedding, 'saying')))
        // ->setFilter(
        //     (new Filter())->addMust(
        //         new MatchString('meta', 'something')
        //     )
        // )
        ->setLimit(2)
        ->setParams([
            'hnsw_ef' => 128,
            'exact' => false,
        ])
        ->setWithVector(false)
        ->setWithPayload(true); // we need the payload to display the matched text

    $response = $client->collections('sayings')
        ->points()
        ->search($searchRequest);

    $this->table(
        ['saying'],
        collect($response['result'])->map(function ($item) {
            return [$item['payload']['value']];
        })->toArray()
    );
    
});

php artisan search

+------------------+
| saying           |
+------------------+
| Canines say woof |
| Felines say meow |
+------------------+

And that's it! You now have embeddings-based semantic search working in a Qdrant vector store with Laravel.

🤩

Want to learn how to do more interesting, real-life things with embeddings and semantic search? Check out my AI with Laravel video course.

I show you chunking, batching, namespaces, deletions, OpenAI embeddings, and how to combine all of that into cool AI web app experiences, such as chatting with documents, searching the web, and building a sales/support bot widget for your own website!

Join 216 other learners. 🚀
Course →