Latent Semantic Indexing

Latent semantic indexing (LSI) is an indexing and retrieval method that uses a mathematical technique called singular value decomposition (SVD) to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text. LSI is based on the principle that words that are used in the same contexts tend to have similar meanings. A key feature of LSI is its ability to extract the conceptual content of a body of text by establishing associations between those terms that occur in similar contexts.
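As a minimal sketch of this idea, the NumPy example below builds a small term-document count matrix, takes its SVD, and compares terms in the reduced space. The four-document corpus, the choice of k = 2, and the helper names (`cosine`, `raw_similarity`, `latent_similarity`) are assumptions made for illustration and do not come from any particular LSI implementation. Real systems usually apply term weighting before the decomposition, which is omitted here.

```python
import numpy as np

# Toy corpus (hypothetical): "car" and "automobile" never share a document,
# but both occur alongside "engine" and "wheel".
docs = [
    "car engine wheel",
    "automobile engine wheel",
    "banana fruit",
    "fruit apple",
]

vocab = sorted({word for doc in docs for word in doc.split()})
row = {term: i for i, term in enumerate(vocab)}

# Term-document matrix of raw counts: one row per term, one column per document.
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        A[row[word], j] += 1

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # number of latent dimensions kept (an arbitrary choice for this toy data)
term_vectors = U[:, :k] * s[:k]  # each row places one term in the k-dimensional space


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def raw_similarity(t1, t2):
    """Similarity of two terms based on their raw document counts."""
    return cosine(A[row[t1]], A[row[t2]])


def latent_similarity(t1, t2):
    """Similarity of two terms in the truncated SVD space."""
    return cosine(term_vectors[row[t1]], term_vectors[row[t2]])


print(raw_similarity("car", "automobile"))     # raw counts: the terms share no document
print(latent_similarity("car", "automobile"))  # latent space: the terms nearly coincide
print(latent_similarity("car", "banana"))      # terms from unrelated contexts stay apart
```

In this toy data, "car" and "automobile" look unrelated in the raw counts, yet they land on almost the same point in the reduced space because they share the context words "engine" and "wheel".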

LSI is also an application of correspondence analysis, a multivariate statistical technique developed by Jean-Paul Benzécri in the early 1970s, to a contingency table built from word counts in documents.

The method is called latent semantic indexing because it correlates semantically related terms that are latent in a collection of text. It was first applied to text at Bell Laboratories in the late 1980s. Also called latent semantic analysis (LSA), it uncovers the latent semantic structure in how words are used across a body of text and uses that structure to extract the meaning of the text in response to user queries, commonly referred to as concept searches. A concept search against a set of documents that have undergone LSI returns results that are conceptually similar to the search criteria, even if those results share no specific words with the criteria.
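A concept search of this kind can be sketched by projecting the query into the same reduced space as the documents and ranking documents by cosine similarity. The example below is an illustration under stated assumptions, not a reference implementation: the corpus, k = 2, and the helpers `count_vector` and `concept_search` are hypothetical, and the projection q_hat = S_k^-1 U_k^T q is one common folding-in convention among several.

```python
import numpy as np

# Toy corpus (hypothetical): document 0 never mentions "physician".
docs = [
    "doctor hospital patient",
    "physician hospital patient",
    "guitar music",
    "music piano",
]

vocab = sorted({word for doc in docs for word in doc.split()})
row = {term: i for i, term in enumerate(vocab)}


def count_vector(text):
    """Bag-of-words counts over the fixed vocabulary; unknown words are ignored."""
    v = np.zeros(len(vocab))
    for word in text.split():
        if word in row:
            v[row[word]] += 1
    return v


# Term-document matrix: one column per document.
A = np.column_stack([count_vector(doc) for doc in docs])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # latent dimensions kept (an arbitrary choice for this toy data)
U_k, s_k = U[:, :k], s[:k]
doc_vectors = Vt[:k].T  # each row places one document in the k-dimensional space


def concept_search(query):
    """Return the cosine similarity of the folded-in query to every document."""
    q_hat = (U_k.T @ count_vector(query)) / s_k  # project the query into the latent space
    q_norm = np.linalg.norm(q_hat)
    if q_norm == 0.0:
        # The query contains no known terms, so nothing can be matched.
        return np.zeros(len(docs))
    doc_norms = np.linalg.norm(doc_vectors, axis=1)
    return doc_vectors @ q_hat / (doc_norms * q_norm)


scores = concept_search("physician")
for doc, score in sorted(zip(docs, scores), key=lambda pair: -pair[1]):
    print(f"{score:6.3f}  {doc}")
```

Here the query "physician" scores the document "doctor hospital patient" about as highly as the one that actually contains the word, while the two music documents score near zero.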

