20 Essential Latent Semantic
Indexing Definitions

Written by John Martin
Tuesday, 25 September 2007

The following definitions are the most common ones found in articles written about Latent Semantic Indexing. If your goal is to rank higher in today's search engine environment, it is important to understand the new concepts being used. This is a "work in progress" as these techniques are continually evolving.

Latent Semantic Indexing: the latest attempt by Google and other search engines to rank websites using a more natural, or human, approach. Instead of rating sites using keyword density and links, search engines now give more weight to sites that are built around a central theme.

Latent: that which is present but cannot be seen.

Semantic: relating to the different meanings of words.

Analysis: an investigation to determine essential features and relationships.

LSI: the abbreviation commonly used for Latent Semantic Indexing.

Importance: LSI is here to stay, so achieving higher search engine rankings requires new optimization techniques that conform to this technology.

Keywords and Themes

Keyword Stuffing: the practice of repeating one or more keywords over and over to trick the search engines into rewarding the website with a higher ranking.

For example: The man walked his dog to the dog park where he saw many other dogs. The overuse of the word "dog" is not natural and is only intended to help the site rank higher for the keyword "dog".

Keyword Density: the number of times a particular keyword is used, expressed as a percentage of the total number of words.

For example: in the sentence above, the word "dog" appears three times (counting the plural "dogs") out of a total of fifteen words. The keyword density for the sentence is 3/15, or 20%.
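
The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name keyword_density and the rule of counting a plural by appending "s" are assumptions made for this example, not part of any search engine's method.

```python
import re

def keyword_density(text, keyword, count_plurals=True):
    """Return the share of words in `text` that match `keyword` (0.0 to 1.0)."""
    # Split the text into lowercase words, ignoring punctuation.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    targets = {keyword.lower()}
    if count_plurals:
        # Naive plural rule (an assumption): "dog" also matches "dogs".
        targets.add(keyword.lower() + "s")
    hits = sum(1 for word in words if word in targets)
    return hits / len(words)

sentence = "The man walked his dog to the dog park where he saw many other dogs."
print(f"{keyword_density(sentence, 'dog'):.0%}")  # prints 20%
```

If the plural is not counted (count_plurals=False), the same sentence gives 2/15, or about 13%, so the way matches are counted changes the result.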

Themed website: a website built around a central theme using many inter-related keywords.

For example: a themed website about barbecues might be expected to also contain such words as "gas", "patio", "starter fluid", "charcoal", and "ribs".

Theme bleeding: the weakening of a website's central theme by content that is not related to it.

For example: a website about travel may contain articles about different countries which, in effect, distract from the website's core theme.

Importance: old optimization techniques of keyword stuffing and striving for a specific keyword density are no longer useful. Today a new approach is needed--websites must be constructed around themes instead of around individual keywords.

Word Terminology

Synonym: a word which has the same, or almost exactly the same, meaning as another word.

For example: car and automobile.

Polysemy: the capacity of a word or phrase to have two or more separate meanings.

For example: "bank" could be a financial institution or a river bank, depending on the context it is used in.

Lexical Database: a database in which words are grouped into sets, each of which relates to a distinct concept. WordNet is a large, free lexical database of English.

Importance: search engines are becoming smarter with the new LSI technology. At one time it was difficult for a search engine to distinguish between words with the same spelling but different meanings--a case of polysemy--but those days are over. Today websites must be built using a variety of keywords--synonyms, plurals, different tenses of verbs--anything that helps develop a central theme. Lexical databases and other tools can help collect these necessary terms and phrases.

Links

Inbound Link: a hyperlink on a separate website which points to your site.

Outbound Link: a hyperlink on your website which points to a different website.

Reciprocal Linking: the mutual exchange of links between two websites. In the case of reciprocal linking your site would have both an inbound link from, and an outbound link to, a particular website.
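
As a rough illustration, reciprocal links can be found by checking whether each link also exists in the opposite direction. The function name reciprocal_pairs and the example site names are hypothetical, chosen only for this sketch.

```python
def reciprocal_pairs(links):
    """Find pairs of sites that link to each other.

    `links` is an iterable of (source_site, target_site) tuples.
    """
    link_set = set(links)
    pairs = set()
    for source, target in link_set:
        # A link is reciprocal when the reverse link is also present.
        if source != target and (target, source) in link_set:
            pairs.add(tuple(sorted((source, target))))
    return sorted(pairs)

# Hypothetical link data: (site holding the link, site it points to).
links = [
    ("mysite.example", "partner.example"),   # outbound link from your site
    ("partner.example", "mysite.example"),   # inbound link back to your site
    ("blog.example", "mysite.example"),      # inbound only, not reciprocal
]
print(reciprocal_pairs(links))  # [('mysite.example', 'partner.example')]
```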

Anchor Text: the clickable text in a hyperlink. Clicking the anchor text takes you to the page the link points to.

For example: instead of a hyperlink shown as the address http://www.LatentSemanticIndexing.com, the anchor text could simply be: LSI
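
To make the distinction between a link's address and its anchor text concrete, here is a minimal sketch that pulls both out of a piece of HTML using Python's standard html.parser module. The class name AnchorTextParser and the sample HTML snippet are assumptions made for this example.

```python
from html.parser import HTMLParser

class AnchorTextParser(HTMLParser):
    """Collect (address, anchor text) pairs from <a href="..."> links."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, anchor text) pairs
        self._href = None   # address of the link currently being read
        self._text = []     # pieces of anchor text seen so far

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an address.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

# Sample HTML written for this illustration.
html = '<p>Read more about <a href="http://www.LatentSemanticIndexing.com">LSI</a>.</p>'
parser = AnchorTextParser()
parser.feed(html)
print(parser.links)  # [('http://www.LatentSemanticIndexing.com', 'LSI')]
```

The address is what the browser follows; the anchor text "LSI" is what the visitor sees and clicks.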

Importance: the days of reciprocal linking are over. It is now imperative to develop inbound links to a number of your site's internal pages, using a variety of anchor text.

Search Methodology

Algorithm: a fixed list of distinct instructions to follow; a formula. The search engines use algorithms to determine which web pages they will return for a search made on a particular keyword or keyword phrase.

Boolean: a search method which combines keywords using three logical operators: "and", "or", and "not".

For example: there could be a search for "dog", "dog and cat", "dog or cat", or "dog not cat". This type of search returns websites based on keywords, not themes.
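
A Boolean keyword search can be sketched with Python sets, where "and", "or", and "not" correspond to set intersection, union, and difference. The three pages and the helper build_index below are made up for this illustration and do not describe how any real search engine stores its data.

```python
def build_index(pages):
    """Map each word to the set of page names that contain it."""
    index = {}
    for name, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index

# Hypothetical pages, invented for this example.
pages = {
    "page1": "dog training tips",
    "page2": "dog cat care",
    "page3": "cat toys",
}
index = build_index(pages)

dog = index.get("dog", set())
cat = index.get("cat", set())

print(sorted(dog & cat))  # "dog and cat" -> ['page2']
print(sorted(dog | cat))  # "dog or cat"  -> ['page1', 'page2', 'page3']
print(sorted(dog - cat))  # "dog not cat" -> ['page1']
```

Each query matches on the literal words only: a page about puppies or kittens would not be returned, which is the sense in which this kind of search is keyword-based rather than theme-based.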

Taxonomy: the science or technique of classification and categorization.

Dynamic Taxonomy: the process for searching and retrieving information from large, diverse databases.

Importance: these terms are mainly for reference and may be useful for understanding articles about search engine methodology.

Source: http://www.latentsemanticindexing.com