AI-Native Portfolio

Overview

A production-ready API for semantic search. Upload documents, automatically chunk and embed them, then search with natural language queries.

Architecture

Client → FastAPI → S3 Vectors
                 ↓
              Bedrock (Cohere Embed)

Ingest: Documents chunked with overlap, embedded via Bedrock
Index: Vectors stored in S3 Vectors with metadata
Query: Query embedded, k-NN search, results ranked
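The ingest step above can be sketched as a sliding window over the text. This is a minimal illustration, not the project's actual chunk_document: the function name chunk_text, the character-based windowing, and the default sizes are all assumptions (a real pipeline might split on tokens or sentences instead).

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters.

    Each window shares `overlap` characters with the previous one.
    Hypothetical helper: sizes are in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of embedding some text twice.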

Key Features

Automatic chunking — Configurable chunk size and overlap
Metadata filtering — Filter by document type, date, tags
Hybrid search — Combine semantic and keyword matching
Batch processing — Async document ingestion
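The page does not say how the semantic and keyword results are combined for hybrid search. One common option is reciprocal rank fusion (RRF), sketched here as an assumption rather than a description of this API. The function name and the constant k = 60 are illustrative choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Hypothetical helper, not the project's method.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


semantic_hits = ["a", "b", "c"]  # e.g. from vector search
keyword_hits = ["b", "d", "a"]   # e.g. from a keyword index
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

RRF uses only rank positions, so it avoids having to normalize similarity scores and keyword scores onto a common scale.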

API Design

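from typing import Optional

from fastapi import Depends, FastAPI, UploadFile

# DocumentMetadata, SearchFilters, the response models, and the helpers
# (chunk_document, embed_batch, index_vectors, embed, vector_search)
# are assumed to be defined elsewhere in the project.
app = FastAPI()

@app.post("/documents")
async def ingest_document(
    file: UploadFile,
    metadata: DocumentMetadata = Depends()
) -> IngestResponse:
    # Chunk the upload, embed every chunk, then index the vectors.
    chunks = chunk_document(file)
    embeddings = await embed_batch(chunks)
    await index_vectors(embeddings, metadata)
    return IngestResponse(chunks=len(chunks))

@app.post("/search")
async def search(
    query: str,
    filters: Optional[SearchFilters] = None,  # None means no metadata filtering
    limit: int = 10
) -> SearchResponse:
    # Embed the query, then run a filtered k-NN search.
    embedding = await embed(query)
    results = await vector_search(embedding, filters, limit)
    return SearchResponse(results=results)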
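To make the search path concrete, here is a small in-memory sketch of filtered k-NN ranking by cosine similarity. It is only an illustration of the idea: in the service, S3 Vectors performs the vector search, and the names knn_search and cosine_similarity, the dict-based index layout, and the exact-match filter semantics are all assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_search(query, index, filters=None, limit=10):
    """Brute-force k-NN over an in-memory list of items.

    Each item is a dict with "id", "vector", and "metadata" keys
    (a hypothetical layout). An item passes the filter only if every
    filter key matches its metadata exactly.
    """
    filters = filters or {}
    candidates = [
        item for item in index
        if all(item["metadata"].get(key) == value for key, value in filters.items())
    ]
    scored = [(item["id"], cosine_similarity(query, item["vector"])) for item in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
    return scored[:limit]


index = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"type": "pdf"}},
    {"id": "b", "vector": [0.0, 1.0], "metadata": {"type": "pdf"}},
    {"id": "c", "vector": [1.0, 0.1], "metadata": {"type": "html"}},
]
top_pdf = knn_search([1.0, 0.0], index, filters={"type": "pdf"}, limit=1)
```

Filtering before scoring means the limit applies to matching documents only, so a restrictive filter cannot leave the result list short while matching items still exist.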

Performance

p50 search latency: 45ms
p99 search latency: 120ms
Throughput: 500 queries/second (single instance)

The service scales horizontally: S3 Vectors handles the vector search, and the API itself is stateless.
