YndigoBlue · Teradata Extension

Enterprise full-text search, native to Teradata

Add BM25-ranked full-text search to any Teradata database with no middleware, no external search engine, and no changes to your application architecture.

View Documentation View Pricing

Included with all YndigoBlue subscriptions

product_search.sql

-- Step 1: create a full-text index on your existing table
CALL fts_metadata.CREATE_FTS_INDEX_C(
    'products_idx',    -- unique index name
    'mydb',            -- schema
    'products',        -- table
    'product_id',      -- primary key
    'name,description' -- columns to index
    result_msg
);

-- Step 2: search with BM25 relevance scoring
SELECT p.product_id, p.name, p.description, scores.score
FROM (
    SELECT ti.document_id,
           SUM(fts_metadata.BM25_SCORE(
               ti.term_frequency, ds.doc_length,
               idx.avg_doc_length, ts.document_frequency,
               idx.document_count
           )) AS score
    FROM TABLE(fts_metadata.TOKENIZE_TEXT_C('wireless noise cancelling')) AS qt(token, pos)
    INNER JOIN fts_metadata.fts_term_index ti ON ti.term = qt.token AND ti.index_name = 'products_idx'
    INNER JOIN fts_metadata.fts_document_stats ds ON ds.index_name = ti.index_name 
        AND ds.document_id = ti.document_id
    INNER JOIN fts_metadata.fts_term_stats ts ON ts.index_name = ti.index_name AND ts.term = ti.term
    CROSS JOIN fts_metadata.fts_indexes idx WHERE idx.index_name = 'products_idx'
    GROUP BY ti.document_id
) scores
INNER JOIN mydb.products p ON CAST(p.product_id AS VARCHAR(512)) = scores.document_id
ORDER BY scores.score DESC SAMPLE 10;

BM25

Relevance Ranking

TD 17+20

Teradata Versions

Pure SQL

No Middleware

Included

With Subscription

Capabilities

Production-grade search inside Teradata

Built entirely from C UDFs and SQL stored procedures — no external systems, no data movement, no new infrastructure to manage.

BM25 Relevance Scoring

The same probabilistic ranking algorithm used by Elasticsearch and Apache Lucene. Term frequency saturation, document-length normalisation and inverse document frequency — all computed natively in Teradata.

Inverted Index Engine

A full inverted index stored directly in Teradata metadata tables — term frequencies, document statistics and global IDF weights — enabling fast, ranked lookups over millions of documents.

Incremental Updates

Index a single newly-inserted document, update a modified row, or delete a removed document — without rebuilding the entire index. Designed for live production workloads.

Pure SQL Interface

Search and manage indexes entirely through standard Teradata SQL — stored procedure calls for index management, direct SQL joins for querying. No new tools or APIs to learn.

Multi-Column Indexing

Index multiple text columns simultaneously — name, description, tags, notes — with a single index creation call. Queries automatically search across all indexed columns.

Resumable & Robust

Index creation is resumable after interruption. UTF-8 validation catches malformed data gracefully. Comprehensive error recovery keeps your indexes consistent under failure.

Choose Your Indexer

Two indexers, one right fit

Teradata Full-Text Search ships with two indexer variants. Pick the one that matches your data characteristics.

_C · CLOB Recommended default

TOKENIZE_TEXT_C

For documents of any size, including LOB columns. Handles invalid UTF-8 gracefully by skipping malformed sequences and continuing.

Document sizeUnlimited (CLOB)

UTF-8 handlingValidates & recovers

Best forArticles, notes, CLOBs

Teradata 17✓

Teradata 20✓

_V · VARCHAR For shorter documents

TOKENIZE_TEXT_V

Accesses VARCHAR buffers directly via UCS-2, avoiding LOB overhead. The faster option for data that fits within 32 KB per document.

Document sizeUp to 32,000 chars

UTF-8 handlingClean data only

Best forProduct names, titles

Teradata 17✓

Teradata 20✓

Code

Up and running in minutes

Deploy to any Teradata 17 or 20 instance, create your first index, and start returning ranked search results — all through standard SQL.

-- Create a CLOB-based index (recommended — handles any document size)
CALL fts_metadata.CREATE_FTS_INDEX_C(
    'products_idx',          -- globally unique index name
    'mydb',                  -- source schema
    'products',              -- source table
    'product_id',            -- primary key column
    'name,description',      -- comma-separated columns to index
    result_msg               -- OUT: status message
);
SELECT :result_msg;          -- e.g. "Index products_idx created (1,247 documents)"

-- Or the VARCHAR variant — 3x faster when documents are < 32 KB
CALL fts_metadata.CREATE_FTS_INDEX_V(
    'products_idx_fast', 'mydb', 'products', 'product_id', 'sku,name', result_msg
);

-- View all indexes and their statistics
SELECT index_name, source_schema, source_table,
       document_count, total_terms, avg_doc_length
FROM fts_metadata.v_fts_indexes;

-- Rebuild after a bulk data change
CALL fts_metadata.REBUILD_FTS_INDEX_C('products_idx', result_msg);

-- Drop an index when no longer needed
CALL fts_metadata.DROP_FTS_INDEX('products_idx', result_msg);

-- Full BM25 ranked search — works on Teradata 17 and 20
-- Returns the top 10 products most relevant to 'wireless noise cancelling'
SELECT
    p.product_id,
    p.sku,
    p.name,
    p.description,
    scores.score
FROM (
    SELECT
        ti.document_id,
        SUM(
            fts_metadata.BM25_SCORE(
                ti.term_frequency,     -- how often the term appears in this document
                ds.doc_length,         -- length of this document
                idx.avg_doc_length,    -- average length across all documents
                ts.document_frequency, -- how many docs contain this term
                idx.document_count     -- total number of indexed documents
            )
        ) AS score
    FROM TABLE(fts_metadata.TOKENIZE_TEXT_C('wireless noise cancelling')) AS qt(token, pos)
    INNER JOIN fts_metadata.fts_term_index ti
        ON ti.term = qt.token AND ti.index_name = 'products_idx'
    INNER JOIN fts_metadata.fts_document_stats ds
        ON ds.index_name = ti.index_name AND ds.document_id = ti.document_id
    INNER JOIN fts_metadata.fts_term_stats ts
        ON ts.index_name = ti.index_name AND ts.term = ti.term
    CROSS JOIN fts_metadata.fts_indexes idx
        WHERE idx.index_name = 'products_idx'
    GROUP BY ti.document_id
) scores
INNER JOIN mydb.products p
    ON CAST(p.product_id AS VARCHAR(512)) = scores.document_id
ORDER BY scores.score DESC
SAMPLE 10;  -- efficient pagination — no full sort required

-- Teradata 20 only: SEARCH_FTS convenience procedure
-- One call replaces the full BM25 query above
CALL fts_metadata.SEARCH_FTS(
    'products_idx',                    -- index to search
    'wireless noise cancelling',       -- search query
    'mydb.products',                   -- fully-qualified source table
    'product_id',                      -- primary key column
    'product_id, sku, name',           -- columns to return
    10                                 -- maximum results
);

-- Results are written to the global temporary table fts_search_results
SELECT * FROM fts_search_results ORDER BY score DESC;

-- Unranked search (fastest — no BM25 calculation, just matching docs)
SELECT DISTINCT ti.document_id
FROM TABLE(fts_metadata.TOKENIZE_TEXT_C('wireless')) AS qt(token, pos)
INNER JOIN fts_metadata.fts_term_index ti
    ON ti.term = qt.token AND ti.index_name = 'products_idx';

-- Debug tokenisation — see how your query terms are split
SELECT token, pos
FROM TABLE(fts_metadata.TOKENIZE_TEXT_C('noise-cancelling headphones (BT)')) AS t(token, pos)
ORDER BY pos;

-- A new product was inserted — index just that row (no full rebuild)
CALL fts_metadata.INDEX_SINGLE_DOCUMENT_C(
    'products_idx',
    CAST(1042 AS VARCHAR(512)),  -- the new product_id
    result_msg
);

-- An existing product was updated — re-index it
CALL fts_metadata.UPDATE_SINGLE_DOCUMENT_C(
    'products_idx',
    CAST(1042 AS VARCHAR(512)),
    result_msg
);

-- A product was deleted — remove it from the index
CALL fts_metadata.DELETE_SINGLE_DOCUMENT(
    'products_idx',
    CAST(1042 AS VARCHAR(512)),
    result_msg
);

-- After a large batch of changes, do a full rebuild instead
-- (automatically replaces index contents and recalculates statistics)
CALL fts_metadata.REBUILD_FTS_INDEX_C('products_idx', result_msg);
SELECT :result_msg;  -- "Index products_idx rebuilt (12,850 documents)"

# 1. Unpack the distribution for your Teradata version
unzip teradata-fts-td20.zip   # or teradata-fts-td17.zip

# 2. Edit credentials in the BTEQ script
nano bteq/install.bteq        # set your hostname, username, password

# 3. Run the deployment (creates fts_metadata schema + all objects)
./deploy.sh                   # Linux / macOS
deploy.bat                    # Windows

# 4. Verify installation
bteq << EOF
.LOGON myhost/myuser,mypassword
SELECT * FROM fts_metadata.v_fts_indexes;
.LOGOFF
EOF

# What gets installed into fts_metadata:
#   3 UDFs        — TOKENIZE_TEXT_C, TOKENIZE_TEXT_V, BM25_SCORE
#   9 procedures  — CREATE, DROP, REBUILD, INDEX/UPDATE/DELETE_SINGLE, SEARCH_FTS
#   4 tables      — fts_indexes, fts_term_index, fts_document_stats, fts_term_stats
#   3 views       — v_fts_indexes, v_fts_db_indexes, v_fts_index_columns
#   2 GTTs        — Working space for CLOB and VARCHAR indexing operations

Requires CREATE DATABASE privileges on the target Teradata instance and approximately 1 GB of permanent space for the fts_metadata schema. Full prerequisites are detailed in the documentation.

Architecture

Everything lives inside Teradata

No sidecar services, no Elasticsearch cluster, no ETL pipelines. The entire system — indexer, scorer, and metadata store — is deployed once into a single Teradata schema.

Compatibility

Teradata 17 and 20 support

All core indexing and search capabilities are available on both major Teradata versions. Teradata 20 gains the additional SEARCH_FTS convenience procedure.

Feature	Teradata 17	Teradata 20
TOKENIZE_TEXT_C / _V UDFs	✓	✓
BM25_SCORE UDF	✓	✓
CREATE / DROP / REBUILD index	✓	✓
Incremental index updates	✓	✓
Direct SQL ranked search	✓	✓
Multi-column indexing	✓	✓
Resumable index creation	✓	✓
SEARCH_FTS convenience procedure	—	✓

Velocity Framework

Full-text search through one unified API

The TeradataFts extension plugs directly into Velocity Framework, translating a single manager.FullTextRetrieval() call into native BM25-ranked SQL — the exact same method that drives full-text search on every other supported datasource.

No vendor knowledge required. Create a FullTextIndex, call FullTextRetrieval() — Velocity takes care of everything else. You never write a database-specific query.

Switch datasource, change nothing else. Move from SQL Server to Teradata by changing a single connection object. Every model, index definition, and search call in your application stays exactly as written.

BM25 on Teradata, transparently. The TeradataFts extension delivers the same high-quality relevance ranking available on other platforms — surfaced through the same Velocity API, with no extra work on your part.

using YndigoBlue.Velocity.Engine;
using YndigoBlue.Velocity.Model;

// Define a full-text index — the same object regardless of which database
// you are targeting. Velocity handles all vendor-specific DDL for you.
var idx = new FullTextIndex("product_search_idx", Product.Table)
{
    IndexColumn = "product_id",
    Columns     = ["name", "description"]
};

using var manager = new Manager(connection);

// Deploy — no knowledge of the underlying database required
manager.CreateFullTextIndex(idx, overwrite: false);

// Rebuild after a bulk data change
manager.UpdateFullTextIndex(idx);

using YndigoBlue.Velocity.Engine;
using YndigoBlue.Velocity.Model;

using var manager = new Manager(connection);

// One call. Any datasource. No vendor-specific knowledge required.
IList<Product> results = manager
    .FullTextRetrieval(
        Product.Table,
        [Product.Table["name"],
         Product.Table["description"]],
        "wireless noise cancelling")
    .BindResults<Product>(BindingType.Attribute);

// Switching from SQL Server to Teradata?
// Change 'connection' — nothing else in your application changes.

TeradataFts is included alongside Velocity Framework in all YndigoBlue subscriptions. Learn about Velocity Framework →

Add search to your Teradata data today

Teradata Full-Text Search is included with all YndigoBlue subscriptions — no separate licence required. Deploy in minutes, search in seconds.

View Documentation View Pricing