Enterprise full-text search, native to Teradata
Add BM25-ranked full-text search to any Teradata database with no middleware, no external search engine, and no changes to your application architecture.
-- Step 1: create a full-text index on your existing table
CALL fts_metadata.CREATE_FTS_INDEX_C(
'products_idx', -- unique index name
'mydb', -- schema
'products', -- table
'product_id', -- primary key
'name,description' -- columns to index
result_msg
);
-- Step 2: search with BM25 relevance scoring
SELECT p.product_id, p.name, p.description, scores.score
FROM (
SELECT ti.document_id,
SUM(fts_metadata.BM25_SCORE(
ti.term_frequency, ds.doc_length,
idx.avg_doc_length, ts.document_frequency,
idx.document_count
)) AS score
FROM TABLE(fts_metadata.TOKENIZE_TEXT_C('wireless noise cancelling')) AS qt(token, pos)
INNER JOIN fts_metadata.fts_term_index ti ON ti.term = qt.token AND ti.index_name = 'products_idx'
INNER JOIN fts_metadata.fts_document_stats ds ON ds.index_name = ti.index_name
AND ds.document_id = ti.document_id
INNER JOIN fts_metadata.fts_term_stats ts ON ts.index_name = ti.index_name AND ts.term = ti.term
CROSS JOIN fts_metadata.fts_indexes idx WHERE idx.index_name = 'products_idx'
GROUP BY ti.document_id
) scores
INNER JOIN mydb.products p ON CAST(p.product_id AS VARCHAR(512)) = scores.document_id
ORDER BY scores.score DESC SAMPLE 10;
Production-grade search inside Teradata
Built entirely from C UDFs and SQL stored procedures — no external systems, no data movement, no new infrastructure to manage.
BM25 Relevance Scoring
The same probabilistic ranking algorithm used by Elasticsearch and Apache Lucene. Term frequency saturation, document-length normalisation and inverse document frequency — all computed natively in Teradata.
Inverted Index Engine
A full inverted index stored directly in Teradata metadata tables — term frequencies, document statistics and global IDF weights — enabling fast, ranked lookups over millions of documents.
Incremental Updates
Index a single newly-inserted document, update a modified row, or delete a removed document — without rebuilding the entire index. Designed for live production workloads.
Pure SQL Interface
Search and manage indexes entirely through standard Teradata SQL — stored procedure calls for index management, direct SQL joins for querying. No new tools or APIs to learn.
Multi-Column Indexing
Index multiple text columns simultaneously — name, description, tags, notes — with a single index creation call. Queries automatically search across all indexed columns.
Resumable & Robust
Index creation is resumable after interruption. UTF-8 validation catches malformed data gracefully. Comprehensive error recovery keeps your indexes consistent under failure.
Two indexers, one right fit
Teradata Full-Text Search ships with two indexer variants. Pick the one that matches your data characteristics.
TOKENIZE_TEXT_C
For documents of any size, including LOB columns. Handles invalid UTF-8 gracefully by skipping malformed sequences and continuing.
TOKENIZE_TEXT_V
Accesses VARCHAR buffers directly via UCS-2, avoiding LOB overhead. The faster option for data that fits within 32 KB per document.
Up and running in minutes
Deploy to any Teradata 17 or 20 instance, create your first index, and start returning ranked search results — all through standard SQL.
-- Create a CLOB-based index (recommended — handles any document size)
CALL fts_metadata.CREATE_FTS_INDEX_C(
'products_idx', -- globally unique index name
'mydb', -- source schema
'products', -- source table
'product_id', -- primary key column
'name,description', -- comma-separated columns to index
result_msg -- OUT: status message
);
SELECT :result_msg; -- e.g. "Index products_idx created (1,247 documents)"
-- Or the VARCHAR variant — 3x faster when documents are < 32 KB
CALL fts_metadata.CREATE_FTS_INDEX_V(
'products_idx_fast', 'mydb', 'products', 'product_id', 'sku,name', result_msg
);
-- View all indexes and their statistics
SELECT index_name, source_schema, source_table,
document_count, total_terms, avg_doc_length
FROM fts_metadata.v_fts_indexes;
-- Rebuild after a bulk data change
CALL fts_metadata.REBUILD_FTS_INDEX_C('products_idx', result_msg);
-- Drop an index when no longer needed
CALL fts_metadata.DROP_FTS_INDEX('products_idx', result_msg);
-- Full BM25 ranked search — works on Teradata 17 and 20
-- Returns the top 10 products most relevant to 'wireless noise cancelling'
SELECT
p.product_id,
p.sku,
p.name,
p.description,
scores.score
FROM (
SELECT
ti.document_id,
SUM(
fts_metadata.BM25_SCORE(
ti.term_frequency, -- how often the term appears in this document
ds.doc_length, -- length of this document
idx.avg_doc_length, -- average length across all documents
ts.document_frequency, -- how many docs contain this term
idx.document_count -- total number of indexed documents
)
) AS score
FROM TABLE(fts_metadata.TOKENIZE_TEXT_C('wireless noise cancelling')) AS qt(token, pos)
INNER JOIN fts_metadata.fts_term_index ti
ON ti.term = qt.token AND ti.index_name = 'products_idx'
INNER JOIN fts_metadata.fts_document_stats ds
ON ds.index_name = ti.index_name AND ds.document_id = ti.document_id
INNER JOIN fts_metadata.fts_term_stats ts
ON ts.index_name = ti.index_name AND ts.term = ti.term
CROSS JOIN fts_metadata.fts_indexes idx
WHERE idx.index_name = 'products_idx'
GROUP BY ti.document_id
) scores
INNER JOIN mydb.products p
ON CAST(p.product_id AS VARCHAR(512)) = scores.document_id
ORDER BY scores.score DESC
SAMPLE 10; -- efficient pagination — no full sort required
-- Teradata 20 only: SEARCH_FTS convenience procedure
-- One call replaces the full BM25 query above
CALL fts_metadata.SEARCH_FTS(
'products_idx', -- index to search
'wireless noise cancelling', -- search query
'mydb.products', -- fully-qualified source table
'product_id', -- primary key column
'product_id, sku, name', -- columns to return
10 -- maximum results
);
-- Results are written to the global temporary table fts_search_results
SELECT * FROM fts_search_results ORDER BY score DESC;
-- Unranked search (fastest — no BM25 calculation, just matching docs)
SELECT DISTINCT ti.document_id
FROM TABLE(fts_metadata.TOKENIZE_TEXT_C('wireless')) AS qt(token, pos)
INNER JOIN fts_metadata.fts_term_index ti
ON ti.term = qt.token AND ti.index_name = 'products_idx';
-- Debug tokenisation — see how your query terms are split
SELECT token, pos
FROM TABLE(fts_metadata.TOKENIZE_TEXT_C('noise-cancelling headphones (BT)')) AS t(token, pos)
ORDER BY pos;
-- A new product was inserted — index just that row (no full rebuild)
CALL fts_metadata.INDEX_SINGLE_DOCUMENT_C(
'products_idx',
CAST(1042 AS VARCHAR(512)), -- the new product_id
result_msg
);
-- An existing product was updated — re-index it
CALL fts_metadata.UPDATE_SINGLE_DOCUMENT_C(
'products_idx',
CAST(1042 AS VARCHAR(512)),
result_msg
);
-- A product was deleted — remove it from the index
CALL fts_metadata.DELETE_SINGLE_DOCUMENT(
'products_idx',
CAST(1042 AS VARCHAR(512)),
result_msg
);
-- After a large batch of changes, do a full rebuild instead
-- (automatically replaces index contents and recalculates statistics)
CALL fts_metadata.REBUILD_FTS_INDEX_C('products_idx', result_msg);
SELECT :result_msg; -- "Index products_idx rebuilt (12,850 documents)"
# 1. Unpack the distribution for your Teradata version
unzip teradata-fts-td20.zip # or teradata-fts-td17.zip
# 2. Edit credentials in the BTEQ script
nano bteq/install.bteq # set your hostname, username, password
# 3. Run the deployment (creates fts_metadata schema + all objects)
./deploy.sh # Linux / macOS
deploy.bat # Windows
# 4. Verify installation
bteq << EOF
.LOGON myhost/myuser,mypassword
SELECT * FROM fts_metadata.v_fts_indexes;
.LOGOFF
EOF
# What gets installed into fts_metadata:
# 3 UDFs — TOKENIZE_TEXT_C, TOKENIZE_TEXT_V, BM25_SCORE
# 9 procedures — CREATE, DROP, REBUILD, INDEX/UPDATE/DELETE_SINGLE, SEARCH_FTS
# 4 tables — fts_indexes, fts_term_index, fts_document_stats, fts_term_stats
# 3 views — v_fts_indexes, v_fts_db_indexes, v_fts_index_columns
# 2 GTTs — Working space for CLOB and VARCHAR indexing operations
Requires CREATE DATABASE privileges on the target Teradata instance and approximately 1 GB of permanent space for the fts_metadata schema. Full prerequisites are detailed in the documentation.
Everything lives inside Teradata
No sidecar services, no Elasticsearch cluster, no ETL pipelines. The entire system — indexer, scorer, and metadata store — is deployed once into a single Teradata schema.
Teradata 17 and 20 support
All core indexing and search capabilities are available on both major Teradata versions. Teradata 20 gains the additional SEARCH_FTS convenience procedure.
| Feature | Teradata 17 | Teradata 20 |
|---|---|---|
| TOKENIZE_TEXT_C / _V UDFs | ✓ | ✓ |
| BM25_SCORE UDF | ✓ | ✓ |
| CREATE / DROP / REBUILD index | ✓ | ✓ |
| Incremental index updates | ✓ | ✓ |
| Direct SQL ranked search | ✓ | ✓ |
| Multi-column indexing | ✓ | ✓ |
| Resumable index creation | ✓ | ✓ |
| SEARCH_FTS convenience procedure | — | ✓ |
Full-text search through one unified API
The TeradataFts extension plugs directly into Velocity Framework, translating a single manager.FullTextRetrieval() call into native BM25-ranked SQL — the exact same method that drives full-text search on every other supported datasource.
No vendor knowledge required. Create a FullTextIndex, call FullTextRetrieval() — Velocity takes care of everything else. You never write a database-specific query.
Switch datasource, change nothing else. Move from SQL Server to Teradata by changing a single connection object. Every model, index definition, and search call in your application stays exactly as written.
BM25 on Teradata, transparently. The TeradataFts extension delivers the same high-quality relevance ranking available on other platforms — surfaced through the same Velocity API, with no extra work on your part.
using YndigoBlue.Velocity.Engine;
using YndigoBlue.Velocity.Model;
// Define a full-text index — the same object regardless of which database
// you are targeting. Velocity handles all vendor-specific DDL for you.
var idx = new FullTextIndex("product_search_idx", Product.Table)
{
IndexColumn = "product_id",
Columns = ["name", "description"]
};
using var manager = new Manager(connection);
// Deploy — no knowledge of the underlying database required
manager.CreateFullTextIndex(idx, overwrite: false);
// Rebuild after a bulk data change
manager.UpdateFullTextIndex(idx);
using YndigoBlue.Velocity.Engine;
using YndigoBlue.Velocity.Model;
using var manager = new Manager(connection);
// One call. Any datasource. No vendor-specific knowledge required.
IList<Product> results = manager
.FullTextRetrieval(
Product.Table,
[Product.Table["name"],
Product.Table["description"]],
"wireless noise cancelling")
.BindResults<Product>(BindingType.Attribute);
// Switching from SQL Server to Teradata?
// Change 'connection' — nothing else in your application changes.
TeradataFts is included alongside Velocity Framework in all YndigoBlue subscriptions. Learn about Velocity Framework →
Add search to your Teradata data today
Teradata Full-Text Search is included with all YndigoBlue subscriptions — no separate licence required. Deploy in minutes, search in seconds.