Extractor - Automatic AI Keyword Keyphrase Extraction - DBI Technologies Inc.

A New World of Contextually Relevant Information

Making Content Findable™

Extractor is an agnostic* text analytics technology that automatically, without biased human intervention, parses content from any subject domain (text, news, unstructured information, documents, email, web pages) into relevant and contextually accurate key phrase highlights. In perfect context, accurately and with absolute relevance.

Uniquely positioned for web services, Extractor and its xAIgent web service can be deployed immediately to consume documents of any length and subject matter, distilling information, news and textual content into precise, contextual, personally meaningful keyword and key phrase summaries. Extractor's unique patented technology delivers precise content summaries from any subject domain automatically, without training or human intervention.

Contextual

A unique feature of the patented Extractor technology is its ability to summarize content by showing how keywords and key phrases are used in the context of a document, allowing for accurate definition of terminology and of how the subject is used. The resulting summaries provide unparalleled levels of subject relevance. In particular, they allow exact analytical comparison of information and news items with other, contextually different items. Analyzing collections of documents with contextual relevance is now possible.

Relevant Information

By design, Extractor is an objective provider of content summaries, in contrast to traditional, subjective, human-influenced summary approaches and to approximate systems using Bayesian and heuristic approaches. Statistically proven, Extractor is 85% to 93% accurate across all subject domains. The ability to quickly discern relevant and meaningful news and information, in personal context, is the cornerstone of the Extractor technology.

Definition of Key Phrase Extraction

Many journals ask their authors to provide a list of key words for their articles. We call these key phrases, rather than key words, because they are often phrases of two or more words, rather than single words. We define a key phrase list as a short list of phrases (typically five to fifteen phrases) that capture the main topics discussed in a given document. We define automatic key phrase extraction as the automatic selection of important, topical phrases from within the body of a document. Automatic key phrase extraction is a special case of the more general task of automatic key phrase generation, in which the generated phrases do not necessarily appear in the body of the given document.
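
To make the definition concrete, here is a minimal sketch of automatic key phrase extraction in Python. It is a naive frequency-based stand-in and is not Extractor's patented algorithm: the stop-word list, the scoring rule (frequency multiplied by phrase length) and the function names candidate_phrases and extract_key_phrases are all assumptions made for illustration. What it does share with the definition above is that every phrase it returns is selected from within the body of the document.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "and", "are", "as", "at", "by", "for", "from",
              "in", "is", "it", "of", "on", "or", "that", "the", "to", "with"}


def candidate_phrases(text, max_words=3):
    """Yield every run of 1..max_words consecutive non-stop-words in the text."""
    # Split on punctuation first so that no phrase spans a sentence boundary.
    for fragment in re.split(r"[^\w\s'-]+", text.lower()):
        words = re.findall(r"[a-z][a-z'-]*", fragment)
        run = []
        for word in words + [None]:  # the trailing None flushes the last run
            if word is None or word in STOP_WORDS:
                for n in range(1, max_words + 1):
                    for i in range(len(run) - n + 1):
                        yield " ".join(run[i:i + n])
                run = []
            else:
                run.append(word)


def extract_key_phrases(text, limit=5):
    """Return up to `limit` phrases, ranked by frequency times word count."""
    counts = Counter(candidate_phrases(text))
    ranked = sorted(counts, key=lambda p: (-counts[p] * len(p.split()), p))
    return ranked[:limit]
```

Weighting by word count is one simple way to prefer phrases of two or more words over single words; a production system would use a far more careful model.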

Key Phrases for Metadata

Many researchers believe that metadata is essential to address the problems of document management. Metadata is meta-information about a document or set of documents. There are several standards for document metadata, including the Dublin Core Metadata Element Set (championed by the US Online Computer Library Center), the MARC (Machine-Readable Cataloging) format (maintained by the US Library of Congress), the GILS (Government Information Locator Service) standard (from the US Office of Social and Economic Data Analysis), and the CSDGM (Content Standard for Digital Geospatial Metadata) standard (from the US Federal Geographic Data Committee). All of these standards include a field for key phrases (although they have different names for this field).
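
As an illustration of key phrases used as metadata, the sketch below writes a list of key phrases into the Dublin Core "subject" element, which is that standard's name for the key phrase field. The wrapping "record" element and the function name dublin_core_record are hypothetical, chosen only for this example; the namespace URI is the standard Dublin Core elements namespace.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(title, key_phrases):
    """Serialize a title and its key phrases as Dublin Core elements.

    The outer <record> wrapper is a hypothetical container for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    # One dc:subject element per key phrase.
    for phrase in key_phrases:
        ET.SubElement(record, f"{{{DC_NS}}}subject").text = phrase
    return ET.tostring(record, encoding="unicode")
```

The other standards listed above would need their own field names and record layouts.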

Key Phrases for Highlighting

When we skim a document, we scan for key phrases, to quickly determine the topic of the document. Highlighting is the practice of emphasizing key phrases and key passages (e.g., sentences or paragraphs) by underlining the key text, using a special font, or marking the key text with a special colour. The purpose of highlighting is to facilitate skimming. Automatic key phrase extraction can be used for highlighting and also to enable text-to-speech software to provide audio skimming capability.
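
A minimal sketch of highlighting, assuming the key phrases have already been extracted by some other means: each occurrence of a phrase is wrapped in a pair of markers. The function name highlight and the default "**" markers are assumptions for illustration; a real application might emit an underline, a font change or a colour instead.

```python
import re


def highlight(text, key_phrases, open_mark="**", close_mark="**"):
    """Wrap each whole-word occurrence of a key phrase in marker strings."""
    phrases = [p for p in key_phrases if p]
    if not phrases:
        return text
    # Try longer phrases first so "key phrase" wins over a shorter "key".
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Keep the matched text as written, so the original capitalization survives.
    return pattern.sub(lambda m: f"{open_mark}{m.group(1)}{close_mark}", text)
```

For example, highlight("Key phrase extraction helps skimming.", ["key phrase", "skimming"]) returns "**Key phrase** extraction helps **skimming**."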

Key Phrases for Indexing

An alphabetical list of key phrases, taken from a collection of documents or from parts of a single long document (chapters in a book), can serve as an index.
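
The idea can be sketched in a few lines, assuming each document (or chapter) already has a list of key phrases. The function name build_index and the input layout (a dictionary from document name to phrase list) are assumptions for this example.

```python
def build_index(doc_phrases):
    """Build an alphabetical index from {document name: [key phrases]}.

    Returns a list of (phrase, sorted document names) pairs.
    """
    index = {}
    for doc, phrases in doc_phrases.items():
        for phrase in phrases:
            # Fold case so "Metadata" and "metadata" share one index entry.
            index.setdefault(phrase.lower(), set()).add(doc)
    return [(phrase, sorted(index[phrase])) for phrase in sorted(index)]
```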

Interactive Query Refinement

Using a search engine is often an iterative process. The user enters a query, examines the resulting hit list, modifies the query, then tries again. Most search engines do not have any special features that support the iterative aspect of searching. One approach to interactive query refinement is to take the user's query, fetch the first round of documents, extract key phrases from them, and then display the first round of documents to the user, along with suggested refinements to the first query, based on combinations of the first query with the extracted key phrases.
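
The last step of this approach, combining the first query with the extracted key phrases, might look like the sketch below. It assumes the key phrases have already been extracted from the first round of documents; the function name suggest_refinements and the quoted-phrase query syntax are assumptions and do not describe any particular search engine.

```python
def suggest_refinements(query, extracted_phrases, limit=5):
    """Combine the user's query with key phrases to propose narrower queries."""
    query_terms = set(query.lower().split())
    suggestions = []
    for phrase in extracted_phrases:
        # Skip phrases that add no new term to the original query.
        if all(word in query_terms for word in phrase.lower().split()):
            continue
        candidate = f'{query} "{phrase}"'
        if candidate not in suggestions:
            suggestions.append(candidate)
    return suggestions[:limit]
```

The suggestions would be displayed alongside the first round of documents, so the user can pick a refinement instead of retyping the query.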

Key Phrases for Web Log Analysis

Web site managers often want to know what visitors to their site are seeking. Most web servers have log files that record information about visitors, including the Internet address of the client machine, the file that was requested by the client, and the date and time of the request. There are several commercial products that analyze these logs for web site managers. Typically these tools will give a summary of general traffic patterns and produce an ordered list of the most popular files on the web site. A web log analysis program can use key phrases to provide a deeper view of traffic. Instead of producing an ordered list of the most popular files on the web site, a log analysis tool can produce a list of the most popular key phrases on the site. This can give web site managers insight into which topics on their web site are most popular.
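
A minimal sketch of such a tool follows. It assumes log lines in the Common Log Format and a pre-computed mapping from each file on the site to its key phrases; the function name popular_key_phrases and the page_phrases mapping are hypothetical, and the key phrases themselves would come from a separate extraction step.

```python
import re
from collections import Counter

# Pulls the requested path out of a Common Log Format request field,
# e.g. "GET /a.html HTTP/1.1" (the log format is an assumption here).
REQUEST_PATH = re.compile(r'"(?:GET|POST|HEAD) (\S+)')


def popular_key_phrases(log_lines, page_phrases, limit=10):
    """Count requests per key phrase instead of per file.

    page_phrases maps a requested path to the key phrases of that page.
    """
    totals = Counter()
    for line in log_lines:
        match = REQUEST_PATH.search(line)
        if match:
            # Each request counts once for every key phrase of the page.
            for phrase in page_phrases.get(match.group(1), []):
                totals[phrase] += 1
    return totals.most_common(limit)
```

Because one page usually carries several key phrases, the totals describe interest in topics rather than in individual files.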

Workforce Optimization

Relevant information is a critical tool for the success of any business today, and providing relevant information in the right context is what gives an organization its ultimate competitive advantage. Rather than working through traditional, time-consuming, iterative search processes, engage the text mining power of Extractor to give information workers relevant and meaningful results that relate directly to their needs and to those of today's dynamic workforce.

* Agnostic - independent of development platform, language and operating system

Features

Platform

          operating system
                    Windows
                    Linux
                    Mac OS

            development
                    C / C#
                    Java
                    Perl
                    Python
                    Visual Basic

Extractor is Great for...

          workforce optimization
          web log tagging
          refined search
          knowledge management (KM)
          information retrieval (IR)
          semantic web development
          indexing
          categorization
          cataloguing
          inference engines
          document management
          Portal Services

Examples:

          Research
          Internet Communications
          Homeland Security
          Contextual Web Search
          Document Management
          Indexing
          Knowledge Management
          Intellectual Property Filter
          Intelligent Search
          Text Summarization
          Wireless - Content Summary

the world of relevant information
in the palm of your hand

info at extractor.com | +1 204-985-5774