GEM, An information resource for the 21st century


 
Data and Text Mining
   
MOLE is the Text Mining technology developed by CINECA. It is available as an additional service via EINS, and also as a customised service in which different document collections can be included. Data mining and text mining are not limited to patent or medical information.

Its main function is to group documents into clusters on the basis of their content, without requiring predefined categories. Membership within a cluster is determined via linguistic and mathematical measures of "content similarity".
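As an illustration only, the idea of grouping documents by content similarity without predefined categories can be sketched in a few lines of Python. The page does not describe the actual linguistic and mathematical measures used by MOLE, so everything here is an assumption: word-count vectors, cosine similarity, a simple "leader" clustering pass, the 0.3 threshold, and the sample titles are all hypothetical stand-ins.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real system would use proper linguistic processing.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}

def term_vector(text):
    """Count lowercase words, ignoring a few stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def cluster(docs, threshold=0.3):
    """Assign each document to exactly one cluster.

    A document joins the most similar existing cluster (compared with that
    cluster's first document) if the similarity reaches the threshold;
    otherwise it starts a new cluster. No categories are defined in advance.
    """
    clusters = []  # list of (seed_vector, [document indices])
    for i, text in enumerate(docs):
        vec = term_vector(text)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c[0])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append((vec, [i]))
        else:
            best[1].append(i)
    return [members for _, members in clusters]

# Made-up sample titles, not taken from EPMOLE or MDMOLE.
docs = [
    "Method for coating steel pipes with polymer",
    "Polymer coating for steel pipes and tubes",
    "Insulin therapy in diabetic patients",
    "Diabetic patients and insulin dosage",
]
groups = cluster(docs)
print(groups)
```

In this toy example the two coating titles fall into one cluster and the two insulin titles into another, purely because of the words they share.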

The benefits of clustering are twofold:
  1. to provide a quick overview of a large number of documents and their interrelationships, and
  2. to serve as a navigational aid for document retrieval.
Using MOLE, broad searches can be narrowed down step by step by means of the keywords that emerge from the Text Mining process (these are automatically extracted from the texts and are not manually assigned). This eliminates the need for prior knowledge of the keywords required to focus the search, and also favors the discovery of new topics not already accounted for in the "controlled vocabulary", that is, the manually assigned keywords.
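The step-by-step narrowing described above can be pictured with a small Python sketch. It is not MOLE's interface or extraction method, which this page does not specify: the functions `suggest_keywords` and `narrow`, the ranking by document frequency, and the sample titles are hypothetical and chosen only to show the principle.

```python
import re
from collections import Counter

# Hypothetical stopword list used only for this illustration.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}

def words(text):
    """Set of lowercase words in a text, ignoring a few stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def suggest_keywords(docs, top=5):
    """Suggest keywords extracted from the current result set.

    Terms are ranked by how many documents contain them (ties broken
    alphabetically). Terms found in every document are skipped because
    selecting them would not narrow the search.
    """
    counts = Counter()
    for text in docs:
        counts.update(words(text))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, n in ranked if n < len(docs)][:top]

def narrow(docs, keyword):
    """Keep only the documents that contain the chosen keyword."""
    return [text for text in docs if keyword in words(text)]

# Made-up result set for a broad search on "polymer".
results = [
    "Polymer coating for steel pipes",
    "Polymer coating for glass fibres",
    "Polymer membrane for water filtration",
    "Polymer membrane for gas separation",
]
suggestions = suggest_keywords(results, top=2)  # extracted, not manually assigned
narrowed = narrow(results, "membrane")          # one narrowing step
next_suggestions = suggest_keywords(narrowed)   # keywords for the next step
print(suggestions, narrowed, next_suggestions)
```

The searcher starts from a broad query, picks one of the suggested keywords, and receives a new set of suggestions drawn from the smaller result set, without having to know any of these terms beforehand.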

Documents are assigned uniquely to one defined cluster; clusters of similar documents are clearly displayed along with the extracted keywords that characterize them.

EPMOLE contains about 1,800,000 patent abstracts from 1990. The file will be updated monthly.

MDMOLE contains about 6,000,000 abstracts from Medline. The file will be updated monthly.

This is a special service of Cineca. Searching is currently available only for a fixed fee. Please contact your national centre for further information. If you are interested in the in-house version of MOLE, please contact Roberta Turra at Cineca directly.

   