|
MOLE is the technology developed by CINECA for Text Mining. It is available
as an additional service via EINS, and also as a customised service in which
different document collections can be included. Data mining or text mining
is not limited to patent or medical information.
MOLE's main function is to group documents into clusters on the basis of their
content, without requiring predefined categories. Membership within a
cluster is determined via linguistic and mathematical measures of "content
similarity".
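As an illustration only, the idea of grouping by "content similarity" can be
sketched as follows. The measures actually used by MOLE are not described
here, so everything in this sketch is an assumption: the bag-of-words
vectors, the cosine measure, the greedy single-pass grouping, the 0.3
threshold and the function names are all hypothetical stand-ins.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Turn a text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster(texts, threshold=0.3):
    """Greedy single-pass clustering (an assumed, simplified method).

    Each document is compared with the first document of every existing
    cluster and joins the most similar one if the similarity reaches the
    threshold; otherwise it starts a new cluster. No categories are
    predefined, and every document ends up in exactly one cluster.
    """
    vectors = [term_vector(t) for t in texts]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(vec, vectors[members[0]])
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


docs = [
    "laser welding of steel plates",
    "welding steel plates with a laser beam",
    "insulin therapy for diabetes patients",
    "diabetes patients and insulin dosage",
]
groups = cluster(docs)  # lists of indices into docs, one list per cluster
```

With these four sample texts the two welding abstracts and the two diabetes
abstracts end up in separate groups, although no category was defined in
advance.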
The benefits of clustering are twofold:
- to provide a quick overview of a large number of documents and their
interrelationships, and
- to serve as a navigational aid for document retrieval.
Using MOLE, broad searches can be narrowed down step by step by means of
the keywords that emerge from the Text Mining process (these are
automatically extracted from the texts and are not manually assigned). This
eliminates the need for prior knowledge of the keywords necessary for
focusing the search and also favors the discovery of new topics not already
accounted for in the "controlled vocabulary" or the manually assigned keywords.
Each document is assigned to exactly one cluster; clusters of similar
documents are clearly displayed along with the extracted keywords that
characterize them.
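The step-by-step narrowing by extracted keywords can be pictured with a small
sketch. It does not reproduce MOLE's own extraction method, which is not
described here: ranking terms by the number of documents they occur in, the
tiny stop-word list and the names cluster_keywords and narrow are all
assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not MOLE's list).
STOPWORDS = {"a", "and", "for", "of", "the", "with"}


def tokens(text):
    """Lower-case word tokens of a text, without stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS]


def cluster_keywords(texts, top_n=3):
    """Return the terms that occur in the most documents of a cluster."""
    doc_freq = Counter()
    for text in texts:
        doc_freq.update(set(tokens(text)))  # count each term once per text
    # Highest document frequency first; ties broken alphabetically.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


def narrow(texts, keyword):
    """Keep only the documents that contain the chosen keyword."""
    return [t for t in texts if keyword in tokens(t)]


hits = [
    "laser welding of steel plates",
    "welding steel plates with a laser beam",
    "laser cutting of steel sheets",
]
keywords = cluster_keywords(hits)   # keywords proposed for this result set
narrowed = narrow(hits, "welding")  # next step: focus on one keyword
```

The searcher never has to know the word "welding" beforehand; it is offered
because it occurs in the retrieved texts, and choosing it reduces the three
hits to two.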
EPMOLE contains about 1,800,000 patent abstracts from 1990. The file will be
updated monthly.
MDMOLE contains about 6,000,000 abstracts from Medline. The file will be updated
monthly.
This is a special service of CINECA. Searching is presently possible only on
the basis of a fixed fee. Please contact your national centre for further
information. If you are interested in the in-house version of MOLE, please
contact Roberta Turra at CINECA directly.
|