searchAnalytics buildAutoComplete

High Volume Enterprise Knowledge Base Indexing

Test Scenario & Use Case

Business Context

A large engineering firm maintains a knowledge base of thousands of technical documents. It needs to stress-test autocomplete generation to confirm that it handles a large volume of near-identical technical terms (e.g., 'Hydraulic Pump Specification v1', 'Hydraulic Pump Specification v2') without performance degradation.
About the Set: searchAnalytics

Data indexing and search functionalities.

Data Preparation

Generate a synthetic dataset of 10,000 document titles with repetitive technical headers to simulate volume, then build a term index from it.

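/* Assumes an active CAS session and that mycas is a CAS engine libref. */
DATA mycas.tech_docs;
   LENGTH doc_title $100;
   /* Generate 10,000 near-identical titles that differ by version and component. */
   DO i=1 to 10000;
      doc_title = catx(' ', 'Technical Specification Document', 'Version', put(i, 5.), 'for Component', put(mod(i, 10), 2.));
      OUTPUT;
   END;
RUN;

/* Build the term index that the auto-complete step reads. */
PROC CAS;
   search.buildTermIndex / table={name='tech_docs'} docId='doc_title' casOut={name='tech_terms', replace=true};
RUN;
QUIT;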
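Before building the auto-complete index, it can help to confirm that the synthetic table looks as intended. The following is a minimal sketch, assuming the 'tech_docs' table was loaded into the active caslib by the DATA step; the result variable name r is illustrative only.

```sas
PROC CAS;
   /* Count the rows in the generated table; the DO loop writes 10,000. */
   simple.numRows result=r / table={name='tech_docs'};
   print r;

   /* Display the first few generated titles for a visual check. */
   table.fetch / table={name='tech_docs'} fetchVars={'doc_title'} to=5;
RUN;
QUIT;
```

If the row count is not 10,000, the term index and the auto-complete table built from it will not reflect the intended volume.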

Implementation Steps

1
Build the auto-complete index from the 'tech_terms' term index.
PROC CAS;
   searchAnalytics.buildAutoComplete / index={name='tech_terms'} casOut={name='kb_autocomplete', replace=true};
RUN;
QUIT;
2
Validate the output table with tableInfo to check its row count and size.
PROC CAS;
   table.tableInfo / table={name='kb_autocomplete'};
RUN;
QUIT;
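The tableInfo output reports size but not content. As a hedged complement, a small fetch can show what the generated rows look like. This sketch assumes 'kb_autocomplete' is in the active caslib. It does not name any columns, because the output schema is not described here.

```sas
PROC CAS;
   /* Show the first 10 rows of the auto-complete table, all columns. */
   table.fetch / table={name='kb_autocomplete'} to=10;
RUN;
QUIT;
```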

Expected Result


The system creates the 'kb_autocomplete' table efficiently even with a higher cardinality of terms. The tableInfo action confirms the table exists and has a row count consistent with the unique terms generated from the 10,000 documents.
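To have a baseline for that comparison, the number of distinct source titles can be counted directly. The sketch below assumes the 'tech_docs' table is still loaded. Because each title embeds the loop counter as its version, 10,000 distinct values are expected. The row count of 'kb_autocomplete' depends on how terms are extracted, so it need not match this number.

```sas
PROC CAS;
   /* Count distinct values of doc_title in the source table. */
   simple.distinct / table={name='tech_docs'} inputs={'doc_title'};
RUN;
QUIT;
```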