The research in CIR is concerned with
the study of retrieval methods and algorithms as well as with their
applications.
Development of the associative interaction information retrieval method. The
associative algorithm is based on the general generic network equation and takes
advantage of the varying nature of the links between documents. It was shown
experimentally that its retrieval effectiveness exceeds that of classical
methods. Applications using the associative method have been developed; they can
be accessed from the CIR web site.
Development of the HAT technology. HAT is a complex methodology for
evaluating the retrieval effectiveness of retrieval applications both in
vitro and in vivo. The HAT technology was applied to the 'measurement' of the
most popular Web search engines and of the NeuRadIR medical retrieval
application.
Creation of a unified framework for the basic retrieval algorithms. A unified
formal framework for the Boolean, vector space, probabilistic, associative, and
PageRank retrieval methods was given. As a result, their application in practice
has become more effective and controllable, and their teaching has become more
methodical.
Limits of effectiveness enhancement. It was shown, using the concept of an
effectiveness surface, that precision, recall and fallout cannot be enhanced
simultaneously to any desired extent, but only within certain limits that do
not depend on the specific retrieval algorithms used.
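
To make the three measures concrete, here is a minimal Python sketch using their standard set-based definitions. The function names and the toy data are illustrative assumptions, not part of the CIR work, and the sketch does not reproduce the effectiveness surface itself. It only shows one well-known reason why the measures cannot move independently: for a fixed share of relevant documents in the collection (the generality), precision is fully determined by recall and fallout.

```python
def effectiveness(retrieved, relevant, collection_size):
    """Return (precision, recall, fallout) for one query.

    retrieved, relevant: iterables of document ids (hypothetical toy input).
    collection_size: total number of documents in the collection.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)               # relevant and retrieved
    non_relevant = collection_size - len(relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    fallout = (len(retrieved) - hits) / non_relevant if non_relevant else 0.0
    return precision, recall, fallout


def precision_from(recall, fallout, generality):
    """Precision implied by recall, fallout and generality (|relevant| / N)."""
    true_pos = recall * generality                 # share of collection: relevant hits
    false_pos = fallout * (1.0 - generality)       # share of collection: wrong hits
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


if __name__ == "__main__":
    p, r, f = effectiveness(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6],
                            collection_size=10)
    print(p, r, f)
    # The same precision follows from recall and fallout alone.
    print(precision_from(r, f, generality=3 / 10))
```

Because the three values are tied together in this way, raising one of them while holding the collection fixed necessarily constrains the others.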
Entropy-based TDV. A method based on entropy was developed for the
computation of term discrimination values (TDV). It was shown experimentally
that this method outperformed the earlier space density method.
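
The text above does not give the formula, so the following Python sketch is only one plausible reading of an entropy-based discrimination value, not the method as published. It assumes that each document's cosine similarity to the collection centroid is normalised into a probability distribution, that the Shannon entropy of that distribution characterises the collection, and that the discrimination value of a term is the change in entropy when the term is removed. The function names, the toy data, and the sign convention are all assumptions.

```python
import math


def entropy(weights):
    """Shannon entropy (in bits) of non-negative weights, normalised to sum to 1."""
    total = sum(weights)
    if total == 0:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def collection_entropy(docs):
    """Entropy of the documents' similarities to the centroid (assumed measure)."""
    n = len(docs)
    centroid = [sum(column) / n for column in zip(*docs)]
    return entropy([cosine(doc, centroid) for doc in docs])


def entropy_tdv(docs):
    """Assumed TDV of term k: entropy without term k minus entropy with it."""
    base = collection_entropy(docs)
    values = []
    for k in range(len(docs[0])):
        reduced = [doc[:k] + doc[k + 1:] for doc in docs]  # drop term k
        values.append(collection_entropy(reduced) - base)
    return values


if __name__ == "__main__":
    # Rows are documents, columns are term weights (hypothetical toy data).
    docs = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0], [1.0, 1.0, 0.0]]
    for k, value in enumerate(entropy_tdv(docs)):
        print(f"term {k}: {value:+.4f}")
```

The loop mirrors the structure of the space density method, which also compares a collection-level quantity with and without each term; only the quantity being compared differs.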
Hyperbolic information retrieval. A similarity-based retrieval method was
developed in the Cayley-Klein hyperbolic space. It was shown theoretically and
experimentally that this method was equivalent to the cosine-based vector space
model, yielding the same categoricity with lower time complexity.
Representativeness of acronyms. A method to estimate the representativeness
of acronyms
on the Web was developed based on reliability theory. It was shown that the
majority of the acronyms of Hungarian public institutions do not identify their
own institution when used as queries in search engines.