Profiling Engine SDK

Index

This is the manual for the Lextek Profiling Engine. We have separated the manual into discussions of the API and then the query language. The API consists of those calls you use in your own program to integrate the toolkit into your project. The query language is the internal language that the Profiling Engine uses to analyze the text. You can think of the query language as an interpreted language that specializes in doing document analysis.

About the SDK

Our goal with the Lextek Profiling Engine was to develop a version of our Onix indexing engine that was optimized to meet the needs of the categorization and routing markets. This meant creating numerous new query technologies and rethinking how we analyze texts. For instance, most traditional text analysis has used simple scanners or indexers. Simple scanners lack the flexibility and power that effective analysis requires. Indexers are usually far slower than the Lextek Profiler because they are optimized for large, static, disk-based indexes. In designing the Lextek Profiler we've optimized it for profiling, rather than attempting to make it a jack of all trades.

Most organizations using routers, profilers or categorizers create "concepts" or categories and compare documents against them. In a sense you have numerous "ideas" about what a document should be like. You then see which documents fit each idea. This is the opposite of what an indexer usually does, where you have a single idea and try to find as many documents as possible that match it. Further, most "ideas" in categorizers or profilers are extremely complex and are made up of other "ideas" or "concepts." To support this we've designed the Lextek Profiler to allow simple and flexible code reuse. We've also designed it to allow your queries to mirror the way you actually think about constructing them. You can also pre-compute your queries and then compare documents against them, rather than having to execute the same queries over and over again.
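
The idea of building concepts out of other concepts, preparing them once, and then comparing each incoming document against all of them can be sketched in a few lines. The sketch below is plain Python and is only an illustration of the pattern: it does not use the Lextek API or the Profiling Engine query language, and every name in it (term, any_of, all_of, profile, CONCEPTS) is hypothetical.

```python
import re
from typing import Callable, Dict, FrozenSet, List

# In this sketch a prepared concept is a predicate over a document's word set.
# (Hypothetical representation; the real engine's query language is far richer.)
Matcher = Callable[[FrozenSet[str]], bool]


def term(word: str) -> Matcher:
    """Concept that matches when a single word occurs in the document."""
    w = word.lower()
    return lambda words: w in words


def any_of(*parts: Matcher) -> Matcher:
    """Concept that matches when at least one sub-concept matches."""
    return lambda words: any(p(words) for p in parts)


def all_of(*parts: Matcher) -> Matcher:
    """Concept that matches only when every sub-concept matches."""
    return lambda words: all(p(words) for p in parts)


def tokenize(text: str) -> FrozenSet[str]:
    """Lower-case the text and reduce it to its set of alphanumeric words."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def profile(text: str, concepts: Dict[str, Matcher]) -> List[str]:
    """Return the names of all concepts that the document fits."""
    words = tokenize(text)  # analyze the document once
    return [name for name, matches in concepts.items() if matches(words)]


# Concepts are built once, up front, and reused for every document.
finance = any_of(term("loan"), term("mortgage"), term("interest"))
legal = any_of(term("contract"), term("lawsuit"), term("liability"))
loan_dispute = all_of(finance, legal)  # a concept composed of other concepts

CONCEPTS = {"finance": finance, "legal": legal, "loan_dispute": loan_dispute}
```

Calling profile("The mortgage contract was disputed.", CONCEPTS) checks the one document against every prepared concept and returns the names of those it fits. Note the direction of the comparison: many stored concepts are applied to one document, rather than one query being run against many stored documents as an indexer would do.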

These optimizations give you a great deal of power and flexibility. We are confident that the Lextek Profiling Engine will speed your development and help you achieve results that you couldn't easily accomplish in the past.

To learn how to integrate the SDK into your project, we suggest starting with the Tutorial. It briefly covers using the API and then walks through using the query language to analyze documents. After you've looked at the Tutorial, we suggest reading through the API Function Reference. This reference lists each function in the API and describes what it does, how to use it, and what arguments it takes. Finally, we suggest reading through the Query Language Manual. This includes an overview of the query language itself and a reference for all the commands in the language.