I wanted search over my own notes that understood meaning, not just keywords, and I did not want my notes leaving the machine to get it. It turns out you no longer have to choose. A small sentence-embedding model runs perfectly happily on a laptop CPU, and "small" here means a few hundred megabytes at most, not the kind of thing that needs a GPU and a second mortgage.
The shape is unglamorous and that is the appeal. Embed each note into a vector once. Embed the query the same way at search time. Compare with cosine similarity and return the nearest. No API key, no per-request cost, no quietly shipping my half-finished thoughts to someone else's servers to be logged.
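from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer("all-MiniLM-L6-v2")  # small model, fine on a CPU
# docs is a list of note texts, query is the search string
doc_vecs = model.encode(docs, convert_to_tensor=True)  # embed the notes once
query_vec = model.encode(query, convert_to_tensor=True)  # embed the query the same way
hits = util.semantic_search(query_vec, doc_vecs, top_k=5)[0]  # top 5 by cosine similarity, for our single query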
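"Embed each note once" implies keeping the vectors around between runs. Here is a minimal sketch of one way to do that, assuming the notes live in a dict of name to text and the vectors are cached in a single `.npz` file. The function `embed_once`, the `cache_path` argument, and the hash-keyed cache are hypothetical illustrations of mine, not part of sentence-transformers. `encode` can be any function that turns a list of strings into a 2-D array, such as `model.encode` from the snippet above.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, List

import numpy as np


def embed_once(
    notes: Dict[str, str],
    encode: Callable[[List[str]], np.ndarray],
    cache_path: Path,
) -> Dict[str, np.ndarray]:
    """Return one vector per note, re-embedding only new or changed text.

    Assumes cache_path ends in ".npz" (NumPy appends it otherwise).
    """
    # Key the cache by a hash of the text, so an edited note is re-embedded.
    keys = {
        name: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for name, text in notes.items()
    }

    # Load whatever was embedded on a previous run.
    cached: Dict[str, np.ndarray] = {}
    if cache_path.exists():
        with np.load(cache_path) as data:
            cached = {k: data[k] for k in data.files}

    # Embed only the notes whose text has no cached vector yet.
    missing = [name for name in notes if keys[name] not in cached]
    if missing:
        new_vecs = np.asarray(encode([notes[name] for name in missing]))
        for name, vec in zip(missing, new_vecs):
            cached[keys[name]] = vec
        # Drop vectors for text that no longer exists, then save.
        live = {keys[name]: cached[keys[name]] for name in notes}
        np.savez(cache_path, **live)

    return {name: cached[keys[name]] for name in notes}
```

With something like this in place, a second run over unchanged notes never calls the model for the notes at all, and only the query gets embedded at search time.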
For a few thousand notes you do not even need a vector database. The whole set of vectors fits in memory and a brute-force comparison returns in milliseconds. I bolted a vector store on later, but only because I enjoy the tidiness, not because the maths demanded it.
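To make "brute force" concrete, here is a rough sketch of the comparison in plain NumPy, with no vector store involved. The function name `top_k_cosine` and the toy vectors are made up for illustration, and the sketch assumes no vector is all zeros. Real embeddings from all-MiniLM-L6-v2 have 384 dimensions, so five thousand notes stored as 32-bit floats come to under 8 MB.

```python
import numpy as np


def top_k_cosine(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 5):
    """Return (index, score) pairs for the k rows of doc_vecs closest to query_vec."""
    # Normalise to unit length so a dot product equals cosine similarity.
    # Assumes no zero vectors, which would cause a division by zero.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                    # one similarity score per note
    order = np.argsort(-scores)[:k]   # indices of the highest scores first
    return [(int(i), float(scores[i])) for i in order]


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real embeddings.
    docs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
    query = np.array([1.0, 0.1, 0.0])
    print(top_k_cosine(query, docs, k=2))  # notes 0 and 2 rank highest
```

The whole search is one matrix-vector product and a sort, which is why a few thousand notes do not need an index.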
The results are not magic. It will not answer questions. But it finds the note I half-remember from a phrase that shares no words with what I actually wrote, and it does it offline, on my hardware, with my data staying mine. For personal search that is exactly the right amount of clever.