What ANN algorithm should I memorize for the exam?

HNSW is the default on most major managed vector stores; Vertex AI Vector Search, which uses ScaNN, is the notable exception. IVF / IVF-PQ is the second-most common. Know the trade-offs: HNSW gives the best recall and latency at the cost of memory; IVF-PQ trades a small recall hit for large compression. Exams ask which to choose for which workload.

AI / ML · April 25, 2026 · 12 min read

Vector Databases for AI Certifications 2026: pgvector, OpenSearch, Pinecone

Q: Are vector databases on cloud AI certification exams?

Yes. AWS MLA-C01 covers the OpenSearch vector engine, Aurora pgvector, and Bedrock Knowledge Bases. Azure AI-102 covers Azure AI Search vector indexes and Cosmos DB for MongoDB vCore. GCP PMLE covers Vertex AI Vector Search, AlloyDB, and Cloud SQL pgvector. OCI GenAI Pro covers Oracle Database 23ai vector search.

Embeddings and vector stores quietly took over the 2026 cloud AI exam blueprints. Here is what AWS, Azure, GCP and OCI now test — and the fastest way to drill it.

1. Why Vector DBs Are on the Exam Now
2. The Concepts Every Exam Tests
3. AWS Vector Stack
4. Azure Vector Stack
5. GCP Vector Stack
6. OCI & Third-Party
7. Study Plan
8. Frequently Asked Questions

Why Vector DBs Are on the Exam Now

Two years ago, vector search was a niche topic. In 2026 it sits behind every production RAG, agent memory, and semantic-search workload. The exam writers caught up: AWS MLA-C01, Azure AI-102, GCP PMLE, and OCI Generative AI Professional all now have scenario questions on embeddings, retrieval, hybrid search, and reranking. AIF-C01 and AI-900 cover the concepts at a lighter level.

HNSW: default ANN index across providers
100%: share of 2026 cloud RAG references that use vector search

The Concepts Every Exam Tests

Embeddings (Foundational)

Dense vector representations of text, images, or audio. Cosine similarity vs dot product vs Euclidean distance. Dimension trade-offs (384 vs 1536 vs 3072): larger vectors cost more to store and search. Batch embedding for cost optimization.
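The three similarity measures are easy to confuse under exam pressure, so here is a minimal sketch in plain Python. The function names and the two-dimensional toy vectors are illustrative assumptions, not part of any provider SDK.

```python
import math

def dot(a, b):
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores vector length."""
    return dot(a, b) / (norm(a) * norm(b))

def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [1.0, 0.0]
longer = [3.0, 0.0]      # same direction as the query, three times longer
orthogonal = [0.0, 1.0]  # unrelated direction

print(cosine_similarity(query, longer))      # 1.0
print(dot(query, longer))                    # 3.0
print(euclidean_distance(query, longer))     # 2.0
print(cosine_similarity(query, orthogonal))  # 0.0
```

For unit-length vectors the three measures produce the same ranking, which is why many embedding pipelines normalize vectors first and then use whichever metric the index handles fastest.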

ANN indexes (Hot topic)

HNSW (graph-based, default), IVF / IVF-PQ (cluster-based, memory-efficient), DiskANN (disk-resident, large indexes). Trade-off questions: latency vs recall vs memory.
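The memory side of that trade-off can be estimated with back-of-the-envelope arithmetic. The sketch below is a rough approximation under stated assumptions: float32 vectors, about 2 × M four-byte neighbor ids per vector for the HNSW base layer, and 8-bit product-quantization codes. Upper graph layers, IVF centroids, and PQ codebooks are ignored, and real engines will differ.

```python
def flat_bytes(n, dim, bytes_per_float=4):
    """Raw float32 vectors with no index structure."""
    return n * dim * bytes_per_float

def hnsw_bytes(n, dim, m=16):
    """Raw vectors plus roughly 2*m four-byte neighbor ids per vector (base layer only)."""
    return flat_bytes(n, dim) + n * 2 * m * 4

def ivfpq_bytes(n, pq_segments, bits_per_code=8):
    """Only the compressed PQ codes; the original vectors are not kept."""
    return n * pq_segments * bits_per_code // 8

n, dim = 10_000_000, 1536
print(f"flat:   {flat_bytes(n, dim) / 1e9:.2f} GB")   # flat:   61.44 GB
print(f"HNSW:   {hnsw_bytes(n, dim) / 1e9:.2f} GB")   # HNSW:   62.72 GB
print(f"IVF-PQ: {ivfpq_bytes(n, 96) / 1e9:.2f} GB")   # IVF-PQ: 0.96 GB
```

Under these assumptions, 96 one-byte codes replace 1536 four-byte floats, a 64x reduction. That compression is the reason IVF-PQ answers tend to pair with "memory-constrained" or "cost-sensitive" wording in scenario questions.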

Hybrid search (Frequent)

BM25 keyword search combined with dense vector search, merged by score fusion: reciprocal rank fusion (RRF) or a weighted sum. Most exams now have at least one hybrid-search scenario.
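Reciprocal rank fusion is simple enough to sketch in a few lines. The version below assumes each retriever returns an ordered list of document ids and uses k = 60, a commonly cited constant. The document ids are made up for illustration, and managed services may expose different defaults.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids: score(d) = sum of 1 / (k + rank) over all lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

bm25_hits = ["doc_a", "doc_b", "doc_c"]    # keyword ranking
vector_hits = ["doc_c", "doc_a", "doc_d"]  # dense-vector ranking

print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
# ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

RRF uses only rank positions, so it avoids having to normalize BM25 scores against cosine scores. A weighted sum needs that normalization step first.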

Reranking (Differentiator)

Cross-encoder rerankers (Cohere Rerank, Vertex Reranker, Bedrock Rerank). When to add one and how it changes latency/cost.
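The two-stage pattern can be sketched without calling any real reranker. In the example below, `token_overlap` is a toy stand-in for a cross-encoder score, and the query and candidate strings are invented. A production system would call a hosted rerank model at that point instead.

```python
def token_overlap(query, text):
    """Toy stand-in for a cross-encoder: Jaccard overlap of lowercase tokens."""
    q, t = set(query.lower().split()), set(text.lower().split())
    union = q | t
    return len(q & t) / len(union) if union else 0.0

def rerank(query, candidates, score_fn, top_n=3):
    """Score every first-stage candidate against the query and keep the best top_n."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]

first_stage = ["billing faq", "api rate limits", "how to rotate api keys safely"]
print(rerank("rotate api keys", first_stage, token_overlap, top_n=2))
# ['how to rotate api keys safely', 'api rate limits']
```

The scoring function runs once per candidate, so added latency and cost grow with the number of candidates passed to the reranker. That is the trade-off the exam scenarios probe.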

Chunking strategy (RAG-critical)

Fixed-size, semantic, recursive, parent-child / contextual retrieval. Exams test which chunking strategy fits which document type.
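Fixed-size chunking with overlap is the baseline the other strategies are compared against. Here is a minimal sketch that works on a list of tokens. The function name, the whitespace tokenization, and the tiny sizes are assumptions chosen to keep the output readable.

```python
def chunk_fixed(tokens, size, overlap):
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reached the end of the document
    return chunks

words = "a b c d e f g h i j".split()
for chunk in chunk_fixed(words, size=4, overlap=1):
    print(chunk)
# ['a', 'b', 'c', 'd']
# ['d', 'e', 'f', 'g']
# ['g', 'h', 'i', 'j']
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the price of embedding some tokens twice.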

Metadata filtering & security (Required)

Pre-filter vs post-filter. Tenant isolation in shared indexes. PII redaction in embedded text.
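The difference between pre-filtering and post-filtering shows up clearly on a toy corpus. The sketch below uses exact (brute-force) search over four made-up documents with a hypothetical `tenant` field. Real engines apply the same idea inside an ANN index.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k(query, docs, k):
    """Exact nearest neighbors by dot product."""
    return sorted(docs, key=lambda d: dot(query, d["vec"]), reverse=True)[:k]

def post_filter(query, docs, k, tenant):
    """Search everything, then drop results that belong to other tenants."""
    return [d for d in top_k(query, docs, k) if d["tenant"] == tenant]

def pre_filter(query, docs, k, tenant):
    """Restrict the candidate set to the tenant first, then search."""
    return top_k(query, [d for d in docs if d["tenant"] == tenant], k)

docs = [
    {"id": 1, "tenant": "acme", "vec": [0.9, 0.1]},
    {"id": 2, "tenant": "acme", "vec": [0.8, 0.2]},
    {"id": 3, "tenant": "globex", "vec": [0.7, 0.3]},
    {"id": 4, "tenant": "globex", "vec": [0.1, 0.9]},
]
query = [1.0, 0.0]

print([d["id"] for d in post_filter(query, docs, 2, "globex")])  # []
print([d["id"] for d in pre_filter(query, docs, 2, "globex")])   # [3, 4]
```

Post-filtering returned nothing because both global top-2 hits belonged to another tenant. Pre-filtering guarantees up to k in-tenant results and never scores another tenant's documents, which is why it is usually the safer answer in tenant-isolation scenarios.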

Memorize HNSW. If a question asks "which ANN algorithm" and lists HNSW, IVF, and brute-force, HNSW is the answer 80% of the time on 2026 exams.

AWS Vector Stack

Amazon OpenSearch Service (Most tested)

OpenSearch k-NN plugin with Faiss, Lucene, and NMSLIB engines. HNSW or IVF. Hybrid search via OpenSearch search pipelines. OpenSearch Serverless is the quick-create default vector store for Bedrock Knowledge Bases.
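As a rough illustration of what an HNSW index looks like on OpenSearch, the sketch below builds the JSON bodies for an index definition and a k-NN query as plain Python dictionaries. The field name `embedding`, the dimension, and the parameter values are assumptions for the example. Check the accepted engines and parameters against the OpenSearch version you deploy.

```python
import json

def knn_index_body(dimension, engine="faiss", space_type="l2", m=16, ef_construction=128):
    """Index settings and mapping for one HNSW-backed knn_vector field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {  # hypothetical field name
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": engine,
                        "space_type": space_type,
                        "parameters": {"m": m, "ef_construction": ef_construction},
                    },
                }
            }
        },
    }

def knn_query_body(vector, k=5):
    """Approximate k-NN query against the same field."""
    return {"size": k, "query": {"knn": {"embedding": {"vector": vector, "k": k}}}}

print(json.dumps(knn_index_body(384), indent=2))
print(json.dumps(knn_query_body([0.1] * 384, k=3))[:80], "...")
```

Higher `m` and `ef_construction` values generally buy recall at the cost of memory and build time, the same latency, recall, and memory triangle the ANN questions test.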

Aurora pgvector (SQL crowd)

PostgreSQL extension. HNSW index. Best when relational data and vectors live together. Aurora Limitless for sharding.
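To make the pgvector workflow concrete, here is a small sketch that generates the SQL for a table and either an HNSW or an IVFFlat index. The table and column names are hypothetical, and the `%s` placeholder style assumes a psycopg-like driver. The index parameters follow the values used in the pgvector documentation; tune them for real data.

```python
def pgvector_ddl(table, dim, index="hnsw"):
    """Return SQL statements for a pgvector table plus one ANN index (illustration only)."""
    statements = [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE {table} (id bigserial PRIMARY KEY, body text, embedding vector({dim}));",
    ]
    if index == "hnsw":
        statements.append(
            f"CREATE INDEX ON {table} USING hnsw (embedding vector_cosine_ops) "
            "WITH (m = 16, ef_construction = 64);"
        )
    elif index == "ivfflat":
        statements.append(
            f"CREATE INDEX ON {table} USING ivfflat (embedding vector_cosine_ops) "
            "WITH (lists = 100);"
        )
    else:
        raise ValueError(f"unsupported index type: {index}")
    return statements

# <=> is pgvector's cosine-distance operator; smaller means more similar.
TOP_K_QUERY = "SELECT id, body FROM docs ORDER BY embedding <=> %s::vector LIMIT 5;"

for statement in pgvector_ddl("docs", 384, index="hnsw"):
    print(statement)
print(TOP_K_QUERY)
```

Because the vectors sit in an ordinary table, the same query can join against relational columns or add a WHERE clause, which is the "relational data and vectors live together" case described above.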

Amazon Kendra (Managed RAG)

Higher-level managed retrieval service. Less flexibility, more out-of-the-box. Appears in MLA-C01 scenarios when "fully managed" is in the requirements.

Bedrock Knowledge Bases (Reference architecture)

Wraps OpenSearch Serverless, Aurora PostgreSQL, MongoDB Atlas, or Pinecone as the vector store. Exams test when to use Knowledge Bases vs roll-your-own retrieval.

Azure Vector Stack

Azure AI Search (Most tested)

HNSW vector index, hybrid search with semantic ranking, integrated vectorization (built-in chunking + embedding pipeline). Default retriever in Azure AI Foundry agents.

Azure Database for PostgreSQL pgvector (SQL crowd)

HNSW + IVFFlat indexes. DiskANN preview for large datasets.

Cosmos DB for MongoDB vCore (NoSQL)

HNSW + IVF. Multi-tenant SaaS pattern is a frequent AI-102 scenario.

GCP Vector Stack

Vertex AI Vector Search (Most tested)

Formerly Matching Engine. Tree-AH / ScaNN under the hood. Highest scale of any managed vector service. PMLE scenarios reward knowing when ScaNN beats HNSW (very large indexes).

AlloyDB AI / Cloud SQL pgvector (SQL crowd)

HNSW + IVFFlat. AlloyDB AI adds the google_ml_integration extension for in-database embeddings.

BigQuery Vector Search (Analytics-heavy)

VECTOR_SEARCH SQL function. Great for data already living in BigQuery. Tested in PMLE data-platform scenarios.

OCI & Third-Party

Oracle 23ai vector search (OCI cert)

VECTOR data type, HNSW and IVF indexes, native SQL syntax. OCI Generative AI Professional now has 4-6 questions on it.

Pinecone (Multi-cloud)

Mentioned by name in the Bedrock Knowledge Bases options and in Azure AI Foundry connectors. The serverless vs pod-based architecture trade-off also appears.

Weaviate, Qdrant, Milvus (Open-source)

Mentioned in passing on the open-source models post. Less likely to be a correct answer on cloud-vendor exams but useful for portfolio projects.

Study Plan

Week 1: Build a tiny RAG pipeline with pgvector locally. Embed 200 docs, run a similarity query, then switch between HNSW and IVFFlat indexes and compare the results.
Week 2: Migrate the same data to your primary cloud's managed service (OpenSearch, Azure AI Search, or Vertex Vector Search).
Week 3: Add hybrid search and a reranker. Measure recall and latency.
Week 4: Drill vector / retrieval scenarios with ExamCertAI. Pattern recognition on the trade-off questions is what wins exam time.
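For the measurement step in Week 3, recall is usually reported as recall@k against exact (brute-force) results. The helper below is a minimal sketch. The id lists are invented, and `timed` is a hypothetical convenience wrapper, not part of any vector database client.

```python
import time

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors that the ANN result recovered."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(approx_ids[:k])) / len(truth)

def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0

exact = [1, 2, 3, 4, 5]   # ids from an exact scan (ground truth)
approx = [1, 2, 4, 7, 9]  # ids returned by an ANN index

print(recall_at_k(approx, exact, k=5))  # 0.6

_, elapsed_ms = timed(sorted, range(100_000))
print(f"{elapsed_ms:.2f} ms")
```

Running the same query set against HNSW and IVFFlat and comparing both numbers gives a hands-on version of the latency vs recall trade-off the exams describe.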

Common trap: "Use brute-force similarity for X million vectors" is almost always wrong. Brute-force is correct only for very small (sub-10K) corpora.

Frequently Asked Questions

Are vector databases on cloud AI certification exams?

Yes. AWS MLA-C01 covers OpenSearch, Aurora pgvector, and Bedrock Knowledge Bases. Azure AI-102 covers Azure AI Search and Cosmos DB. GCP PMLE covers Vertex AI Vector Search, AlloyDB, and BigQuery vector search. OCI GenAI Pro covers Oracle Database 23ai vector search.

Which vector database should I learn first?

For breadth, start with pgvector — it appears on every cloud. Then learn the managed retrieval service on your primary cloud.

What ANN algorithm should I memorize?

HNSW is the default on most major managed services. Know the IVF and IVF-PQ trade-offs and when DiskANN or ScaNN replaces HNSW.

How do I drill vector exam questions?

Drill scenario questions with ExamCertAI. Free, browser-based, scenario-heavy.

ExamCert Team

Cloud AI professionals publishing exam prep that keeps up with production RAG and retrieval practice.
