The universal semantic layer between enterprise data and AI. · Try it →
// 01 — the pitch

Semantic search for Polars.

Stop writing regex to scrub patient data.

Omna gives Polars semantic search and PII masking — in one line of Python.

Local-first. Rust-powered. Zero data egress.

Search your DataFrames by meaning, not strings. Mask sensitive columns before they ever reach a model.

Local-first · Rust kernel · 0 network calls · HIPAA · ready
Try it now ↓
// 2027 mission

We're building the universal semantic layer between enterprise data and AI. We started with Polars because that's where the fastest-growing data engineering community is. Our roadmap takes us to every format and every model.

~12ms · p50 semantic search · 1M rows · M2 MacBook
0 · network calls · ever, by design
100% · local execution · no vendor BAA needed
patients.py
# Before — regex hell
import re, polars as pl
df = pl.read_parquet("patients.parquet")
 
PAIN = re.compile(r"chest|cardiac|angina|heart\s+pain", re.I)
hits = []
for row in df.iter_rows(named=True):
    # notes may be null, so fall back to an empty string
    if PAIN.search(row["notes"] or ""):
        hits.append(row)
# ...still missing 'shortness of breath',
# 'tightness', 'pressure'... etc. forever.
~30 lines · brittle · missing synonyms
// why this matters

The bridge between smart models and invisible data.

// the principle
"They cannot see your data until you show it to them in exactly the right way."
// the analogy
"Omna is to AI models what a good search engine is to websites."
// the wedge
"Google didn't kill websites — it made them more findable, and therefore more valuable. Omna doesn't replace AI. It makes AI usable on data that was previously unreachable — and therefore drives more AI usage."
// the bridge
"AI is smart but blind. Your data is rich but invisible to AI. Omna is the bridge."

Performance-critical similarity engine written in Rust · local-first by design.

// 02 — try it now

Don't watch a demo. Break it.

No signup. No data leaves your machine. Real Polars syntax, real output, runs entirely in your browser.

playground.py
import polars as pl
import omna  # registers .omna namespace

df = pl.read_parquet("healthcare.parquet")

(df.omna
   .search("heart pain", top_k=3)
   .collect())
ready · polars 1.18 · omna 0.3.1
ⓘ browser demo · local · no telemetry
// compatible with
Polars
Apache Arrow
Parquet
Rust
PyO3
DuckDB
Hugging Face
ONNX
LanceDB
Delta Lake
Pandas (compat)
JupyterLab
// 03 — capabilities

Search first. Mask second. Then filter or ask questions. Everything else is on the way.

Every tile below is live — the same Rust kernel that runs in your notebook is rendering this page.

// .search()

Search by meaning, not strings.

query: chest pain after exertion
top scores: 0.94 · 0.87 · 0.82
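How a score-ranked search like the one above can work in principle: each note is represented by an embedding vector, and rows are ranked by similarity to the query vector. The sketch below uses cosine similarity and tiny hand-written 3-dimensional vectors. Both are assumptions made for illustration (the page does not say which metric or model Omna uses), and `cosine` and `top_k` are hypothetical helpers, not part of the Omna API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=3):
    """Return the k (score, text) pairs most similar to query_vec."""
    scored = [(cosine(query_vec, vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; a real model emits hundreds of dimensions.
notes = [
    ("tightness in chest when climbing stairs", [0.9, 0.1, 0.0]),
    ("shortness of breath after jogging", [0.7, 0.3, 0.1]),
    ("sprained ankle, mild swelling", [0.0, 0.2, 0.9]),
    ("seasonal allergies, itchy eyes", [0.1, 0.9, 0.2]),
]

# Stand-in for the embedding of "chest pain after exertion".
query = [1.0, 0.1, 0.0]

results = top_k(query, notes, k=2)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

Neither of the two top rows contains the words "chest pain", which is the point: the ranking depends on vector proximity, not on string matches.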
// latency

p50 on 1M rows

live · last 24 runs · 12ms
// pii masking

df.omna.mask_pii() — one line, audit-logged.

name | email | note
Sarah Chen | sarah.chen@stanford.edu | Patient 0421 · MRN 88231
James Patel | j.patel@mayo.org | Patient 0422 · MRN 88401
Lin Okafor | lokafor@hopkins.edu | Patient 0423 · MRN 88502
mode: raw · PERSON · EMAIL_ADDRESS
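A rough illustration of what a masked version of the rows above could look like, with each detected entity replaced by its type tag. This is not Omna's implementation: it is a stdlib-only sketch in which the email pattern, the `mask_row` helper, and the choice to mask the whole `name` column as PERSON are all assumptions. A real detector would need more than a regex to find person names in free text.

```python
import re

# Simplified email pattern, good enough for this demo only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_row(row, person_columns=("name",)):
    """Replace PERSON columns wholesale and email addresses wherever they occur."""
    masked = {}
    for column, value in row.items():
        if column in person_columns:
            masked[column] = "<PERSON>"
        else:
            masked[column] = EMAIL.sub("<EMAIL_ADDRESS>", value)
    return masked


rows = [
    {"name": "Sarah Chen", "email": "sarah.chen@stanford.edu",
     "note": "Patient 0421 · MRN 88231"},
    {"name": "James Patel", "email": "j.patel@mayo.org",
     "note": "Patient 0422 · MRN 88401"},
    {"name": "Lin Okafor", "email": "lokafor@hopkins.edu",
     "note": "Patient 0423 · MRN 88502"},
]

masked_rows = [mask_row(row) for row in rows]
for row in masked_rows:
    print(row)
```

The `note` column passes through unchanged here because only PERSON and EMAIL_ADDRESS are handled; identifiers such as MRNs would need their own detector.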
// .understand()

Schemas that read themselves.

patient_id | ID | medium
email | EMAIL | high
diagnosis | SEARCHABLE | none
dob | DATE | high
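One simple way to arrive at labels like those above is to test sample values from each column against a few patterns. The sketch below is a hypothetical heuristic, not how `.understand()` is documented to work: the patterns, the `classify` helper, and the `diagnosis` and `dob` sample values are invented for illustration, while the label and sensitivity pairs mirror the tile.

```python
import re

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
NUMERIC_ID = re.compile(r"^\d+$")


def classify(values):
    """Return (kind, sensitivity) when every sample value matches one pattern."""
    if not values:
        return ("SEARCHABLE", "none")
    if all(EMAIL.match(v) for v in values):
        return ("EMAIL", "high")
    if all(ISO_DATE.match(v) for v in values):
        return ("DATE", "high")
    if all(NUMERIC_ID.match(v) for v in values):
        return ("ID", "medium")
    # Anything else is treated as free text that is worth embedding.
    return ("SEARCHABLE", "none")


# Sample values per column; diagnosis and dob values are made up.
samples = {
    "patient_id": ["0421", "0422", "0423"],
    "email": ["sarah.chen@stanford.edu", "j.patel@mayo.org"],
    "diagnosis": ["unstable angina", "asthma"],
    "dob": ["1984-03-12", "1990-11-02"],
}

schema = {name: classify(values) for name, values in samples.items()}
for name, (kind, sensitivity) in schema.items():
    print(name, kind, sensitivity)
```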
// .filter() · live

Filter by intent, not LIKE.

 df.omna.filter("kids under 12 with chest pain")
try it in the playground →
// .ask() · live

Ask in English. Get rows.

 df.omna.ask("flag anyone at risk of CHF")
try it in the playground →
// local-first

Zero network calls. Ever.

egress packets: 0
vendor API calls: 0
data leaves machine: never
// why omna is fast

Built on Polars — so embeddings never leave Arrow memory.

Omna is built natively on Polars — which means it inherits Arrow's columnar memory format, SIMD vectorization, and zero-copy data structures.

Your embeddings never leave the Arrow memory layout. No serialization. No copying. No overhead.

When you call df.omna.search(), the Rust similarity kernel operates directly on the same memory Polars is already using.
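The difference between operating on shared memory and operating on a copy can be seen with nothing but the Python standard library. The sketch below is an analogy, not Omna or Polars code: `array` and `memoryview` stand in for an Arrow buffer and a kernel that reads it in place, and the values are arbitrary.

```python
from array import array

# A flat float32 buffer standing in for a column of embeddings.
embeddings = array("f", [0.1, 0.2, 0.3, 0.4, 0.5, 0.6])

# A memoryview exposes the same bytes without copying them,
# and slicing a memoryview is also zero-copy.
view = memoryview(embeddings)
second_row_view = view[3:6]

# Slicing the array itself allocates a new array (a copy).
second_row_copy = embeddings[3:6]

# Mutate the original buffer after both were created.
embeddings[3] = 9.0

view_sees_update = second_row_view[0] == 9.0   # shares memory with the source
copy_sees_update = second_row_copy[0] == 9.0   # holds a stale private copy

print("view sees update:", view_sees_update)
print("copy sees update:", copy_sees_update)
```

The view reflects the write because it never duplicated the data; the copy paid for an allocation and now disagrees with the source. Avoiding that duplication on every call is what the zero-copy claim refers to.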

// architecture note
"Polars uses Apache Arrow's columnar memory format with SIMD vectorization — the same memory our Rust similarity kernel operates on directly. On Pandas we'd need to copy data into NumPy arrays first. On Polars it's zero-copy end to end."

Ritchie Vink's original post on the Polars architecture explains it better than we can. Read it →

memory: Apache Arrow
vectorization: SIMD
kernel: Rust · zero-copy
copies: 0
// 04 — what teams are saying

Quiet adoption. Loud audits.

Anonymized while teams are still in private beta. Names dropped on request.

"Replaced 400 lines of regex with one method call. The audit team stopped opening tickets."
Senior data engineer
healthtech · series-B

"The first tool that didn't ask me to ship our patient records to a vendor for an embedding."
ML platform lead
regional hospital network

"`.understand()` caught a PHI column three audits had missed. We bought the cloud tier the same week."
Data governance
fintech · regulated

Used by teams in healthtech, fintech, and government data programs.
Using Omna in production? Email hello@omna.dev →
// roadmap

A platform, not a script.

We're building a complete semantic layer for regulated data on Polars. Here's the path.

Q1 2026 · Shipped

Semantic Search v0.3

Local-first vector search across Polars columns. ~12ms on 1M rows.

Q1 2026 · Shipped

PII Masking Suite

Built-in detectors for SSN, PHI, payment data, plus custom regex graphs.

Q2 2026 · Coming Soon

Semantic Joins

Fuzzy joins on meaning, not strings. df_a.omna.join(df_b, on='intent').

Q3 2026 · Planned

Embedded Re-ranker

Optional cross-encoder pass entirely in Rust. No Python in the hot loop.

Q4 2026 · Planned

Federated Search

Query across machines without moving the data. SOC2-aligned audit log.

2027 · Planned

Omna Cloud (opt-in)

Hosted control plane for teams. Local kernel stays the source of truth.

Available now · // pip install omna

Open-source core. Free, forever.

  • Local-first Rust kernel
  • Search · Mask · Understand
  • MIT-licensed · runs offline
View on GitHub ↗
Coming soon · // for teams

Omna Cloud. Pricing TBD.

  • Federated search across machines
  • SOC2-aligned audit log
  • Hosted control plane (your data stays local)
// 06 — why this exists

Built by an engineer who lived the pain.

Four years writing brittle regex pipelines to scrub patient records before they could touch a model. Every new column type meant a new edge case, a new on-call page, a new audit finding.

Omna is the tool that should have existed: a Polars-native API where .search() and .mask_pii() are first-class operators, the kernel is Rust, and your data never leaves the machine. If you've felt this pain, we'd like to hear from you.

hello@omna.dev →
// join the founding team

We're hiring Rust & Python engineers.

Help build the kernel that handles a billion sensitive rows without breaking a sweat. Founding team equity, async-first, strong opinions weakly held.

Rust · Polars · Arrow · Python · Vector DBs · PHI / HIPAA
// feedback loop

Tell us what's broken.

Bug reports, feature requests, or "this would be 10× more useful if…" — we read every one.

Stored in Postgres · Never shared