When you want to find search results that are relevant to the meaning of the query/question, rather than the exact words/terms used, you need to use semantic search.
Semantic search requires the "meaning" to be numerically represented in the data you are searching. This is accomplished using embeddings: vector representations of text (a word, a sentence, or a whole document) in which texts with similar meanings end up close together in vector space.
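To make that concrete, here's a minimal sketch (in Python, using made-up three-dimensional vectors; real embedding models produce hundreds or thousands of dimensions) of how "closeness in meaning" is typically measured with cosine similarity:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    # Values near 1.0 mean "pointing the same way", i.e. similar meaning.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings
dog = [0.9, 0.1, 0.2]
puppy = [0.85, 0.15, 0.25]  # close in meaning to "dog"
banana = [0.1, 0.9, 0.4]    # unrelated

print(cosine_similarity(dog, puppy) > cosine_similarity(dog, banana))  # True
```

This is also why the Qdrant collection we create below is configured with cosine distance.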
The embeddings data needs to be stored somewhere to perform semantic search. There are quite a few options out there by now, but generally what you're looking for is a vector database. It allows storage, indexing, and fast lookups of embeddings.
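The core lookup a vector database performs is nearest-neighbor search. A naive brute-force version (in Python, with hypothetical toy data) looks like the sketch below; the difference is that a real vector database like Qdrant uses approximate indexes such as HNSW so the lookup stays fast at millions of vectors:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# A toy "vector store": id -> (embedding, payload)
store = {
    1: ([0.9, 0.1], {"value": "Canines say woof"}),
    2: ([0.2, 0.8], {"value": "Felines say meow"}),
}

def search(query_vector, limit=1):
    # Brute force: score every stored vector against the query,
    # then return the payloads of the best matches.
    scored = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vector, item[1][0]),
        reverse=True,
    )
    return [payload["value"] for _, (vec, payload) in scored[:limit]]

print(search([0.8, 0.2]))  # ['Canines say woof']
```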
What if you didn't want to use a hosted, managed provider to store your embeddings, but instead keep them in your own database? Well, you're in luck!
Here, I'm going to show you how to use Qdrant in Laravel for semantic search. Qdrant is an open-source vector database that is easy to use and can be self-hosted as well as managed by Qdrant Cloud.
First, let's install a Composer package that makes it easy to use Qdrant in a Laravel app.
composer require hkulekci/qdrant
Next, let's run Qdrant locally using Docker.
I'm going to assume that you have Docker installed on your system. If not, I highly recommend OrbStack rather than Docker's own app. Faster, lighter, and easier to use.
docker pull qdrant/qdrant
docker run -p 6333:6333 qdrant/qdrant
To demonstrate embeddings + semantic search in a more real-world-like scenario, we'll use OpenAI's text-embedding-ada-002 model, which produces 1,536-dimensional vectors. Add the OpenAI Laravel package if you haven't already.
composer require openai-php/laravel
php artisan vendor:publish --provider="OpenAI\Laravel\ServiceProvider"
And add your OpenAI API key to your .env file.
# .env
OPENAI_API_KEY="sk-..."
# https://platform.openai.com/account/org-settings
OPENAI_ORGANIZATION="org-..."
Now that the setup is out of the way, let's create a couple of simple console commands to set up our vector collection and populate it with embeddings.
// In routes/console.php
use Illuminate\Support\Facades\Artisan;
use OpenAI\Laravel\Facades\OpenAI;
use Qdrant\Http\GuzzleClient;
use Qdrant\Models\Request\CreateCollection;
use Qdrant\Models\Request\VectorParams;
use Qdrant\Qdrant;
use Qdrant\Models\PointsStruct;
use Qdrant\Models\PointStruct;
use Qdrant\Models\VectorStruct;
Artisan::command('setup', function () {
    $config = new \Qdrant\Config('http://localhost', 6333);
    $config->setApiKey('your-api-key-but-anything-on-localhost');
    $client = new Qdrant(new GuzzleClient($config));

    $createCollection = new CreateCollection();
    // 1536 dimensions to match text-embedding-ada-002, cosine distance
    $createCollection->addVector(new VectorParams(1536, VectorParams::DISTANCE_COSINE), 'saying');
    $response = $client->collections('sayings')->create($createCollection);
});
Artisan::command('insert', function () {
    $config = new \Qdrant\Config('http://localhost', 6333);
    $config->setApiKey('your-api-key-but-anything-on-localhost');
    $client = new Qdrant(new GuzzleClient($config));

    $sayings = [
        [
            // Qdrant point IDs must be unsigned integers or UUIDs,
            // e.g. \Ramsey\Uuid\Uuid::uuid4()->toString()
            'id' => 1,
            'value' => 'Canines say woof',
        ],
        [
            'id' => 2,
            'value' => 'Felines say meow',
        ],
        [
            'id' => 3,
            'value' => 'Birds say tweet',
        ],
        [
            'id' => 4,
            'value' => 'Humans say hello',
        ],
    ];

    $result = OpenAI::embeddings()->create([
        'model' => 'text-embedding-ada-002',
        'input' => collect($sayings)->pluck('value')->toArray(),
    ]);

    $points = new PointsStruct();

    foreach ($sayings as $key => $saying) {
        $points->addPoint(
            new PointStruct(
                $saying['id'],
                new VectorStruct($result->embeddings[$key]->embedding, 'saying'),
                [
                    // Store the original text so search results can display it
                    'value' => $saying['value'],
                    'meta' => 'anything',
                ]
            )
        );
    }

    $client->collections('sayings')->points()->upsert($points, ['wait' => 'true']);
});
Now let's run both:
php artisan setup
php artisan insert
In the terminal window where you ran the Docker command, you should see some output like this:
2023-10-13T09:03:49.680343Z INFO actix_web::middleware::logger: 192.168.215.1 "PUT /collections/sayings HTTP/1.1" 200 48 "-" "GuzzleHttp/7" 0.078868
2023-10-13T09:10:08.231593Z INFO actix_web::middleware::logger: 192.168.215.1 "PUT /collections/sayings/points?wait=true HTTP/1.1" 200 83 "-" "GuzzleHttp/7" 0.010576
And now let's run a query! Back in your console routes file:
// In routes/console.php
use Qdrant\Models\Filter\Condition\MatchString;
use Qdrant\Models\Filter\Filter;
use Qdrant\Models\Request\SearchRequest;
Artisan::command('search', function () {
    $config = new \Qdrant\Config('http://localhost', 6333);
    $config->setApiKey('your-api-key-but-anything-on-localhost');
    $client = new Qdrant(new GuzzleClient($config));

    $result = OpenAI::embeddings()->create([
        'model' => 'text-embedding-ada-002',
        'input' => 'What do dogs say?',
    ]);

    $searchRequest = (new SearchRequest(new VectorStruct($result->embeddings[0]->embedding, 'saying')))
        // ->setFilter(
        //     (new Filter())->addMust(
        //         new MatchString('meta', 'anything')
        //     )
        // )
        ->setLimit(2)
        ->setParams([
            'hnsw_ef' => 128,
            'exact' => false,
        ])
        ->setWithVector(false)
        // We need the payload back to display the matched text
        ->setWithPayload(true);

    $response = $client->collections('sayings')
        ->points()
        ->search($searchRequest);

    $this->table(
        ['saying'],
        collect($response['result'])->map(function ($item) {
            return [$item['payload']['value']];
        })->toArray()
    );
});
php artisan search
+------------------+
| saying |
+------------------+
| Canines say woof |
| Felines say meow |
+------------------+
And that's it! You now have embeddings-based semantic search working in a Qdrant vector store with Laravel.