IRList Digest           Tuesday, 9 December 1986      Volume 2 : Issue 68

Today's Topics:
   Query - Information resource management
   Discussion - bib/refer software development
   Abstracts - Knowledge-based software components catalogue
             - Tech Reports on IR from Virginia Tech in 1986
   CSLI - Quantified and referring NPs, pronouns anaphora

News addresses are ARPANET: fox%vt@csnet-relay.arpa  BITNET: foxea@vtvax3.bitnet
   CSNET: fox@vt   UUCPNET: seismo!vtisr1!irlistrq

----------------------------------------------------------------------

Date:    Wed, 3 Dec 86 22:27 EST
From:    V6M@PSUVM.bitnet
Subject: IRM notes, hints, research, rumors   ANYTHING
     
Dear Professor,
     
I have a PRESSING need for help with the buzz words "INFORMATION RESOURCE
MANAGEMENT".   I'd appreciate any help that the group could give to me on this
topic, which was a hot term around 1978-81 but which seems to have died out.
SeeAlso.. Data Resource Management
          End User Computing
          Information Center (used incorrectly by DPMA types to be
                              used as a synonym for End User Computing)
RelatedTerm..Thesaurus
     
BTW...anybody using a thesaurus in building or querying commercial DBMS in
      a business, application environment????
     
HELP!!!!
     
Vince Marchionni 215 337 1400 ext 274

------------------------------
     
Subject: Re: Six character limit in indxbib [really bib/refer]
Date: 29 Nov 86 12:35:56 +1000 (Sat)
Message-Id: <10.533612156@mulga>
From: John Shepherd <seismo!munnari!mulga.OZ!jas>

>   In IRList Digest V2 #53, Mitchell Wyle <wyle@ethz.uucp> writes:
>
>   For starters, I shall use Unix's addbib, sortbib, roffbib, indxbib,
>   lookbib suite of programs.  The manuals say that one can change the
>   options of indxbib when it stems, stop lists etc.  I have found the list
>   of the 100 most common words (/usr/lib/refer/eign), but I can't
>   figure out how to change the stemming from 6 characters.

I perused the sources here and it looked like the 6 was hardwired into
the "mkey" program (which is the component of "indxbib" which scans the
bib files and finds the keys). It seems that "indxbib" passes any options
it gets straight on to "mkey", but I couldn't find any documentation on
"mkey". It does have a number of options (e.g. to set minimum number of
chars to consider in keys), but none to extend the key length.
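
As an illustration only (this is not the "mkey" source, and every name
below is hypothetical), a short Python sketch shows what a fixed
six-character key does to lookups: words that share their first six
characters collapse onto one key, so a search on one of them also returns
records that contain only the other.

```python
from collections import defaultdict

# 6 is the truncation length reported for mkey; everything else is illustrative.
KEY_LENGTH = 6


def make_key(word, key_length=KEY_LENGTH):
    """Lower-case a word and keep only its first key_length characters."""
    return word.lower()[:key_length]


def build_index(records, key_length=KEY_LENGTH):
    """Map each truncated key to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.split():
            index[make_key(word, key_length)].add(record_id)
    return index


def lookup(index, word, key_length=KEY_LENGTH):
    """Return the sorted record ids filed under the truncated form of word."""
    return sorted(index.get(make_key(word, key_length), set()))


# Made-up records, chosen so that two unrelated words share a six-character prefix.
records = {
    1: "information retrieval systems",
    2: "informal logic notes",
    3: "retrieving software components",
}

short_hits = lookup(build_index(records), "information")
long_hits = lookup(build_index(records, 12), "information", 12)
```

With six-character keys, "information" and "informal" both become "inform",
so record 2 comes back for a query on "information"; with a longer key
length only record 1 matches.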

>   In IRList Digest V2 #61, David Brown <library@prg.oxford.ac.uk> writes:
>
>   Any information on this would be very welcome here too - we are currently
>   using indxbib/lookbib etc. and the lower-level utilities they call as the
>   basis of our (small and rudimentary) online catalogue, but are having
>   problems with false drops caused by this 6 character limit.

We have done quite a bit of work with bibliographies here at the University
of Melbourne over the last few years. We initially started using the "refer"
system but after finding that it couldn't quite do what we wanted in some
places, we eventually switched over to Tim Budd's "bib" system, which we found
was more flexible and seemed easier to use than "refer". Note that the data
formats they use are (with a few minor exceptions) identical.

We also thought that "lookup" ("bib"'s version of "lookbib") would form a
useful basis for an on-line retrieval system, and Isaac Balbin (whom you
may know from his Logic Programming Bibliography) wrote a nice interactive
front-end called "seebib" (it also knows how to deal with "refer" databases).
It allows you to scan forwards and backwards through a list of answers to
a bibliographic query, and to save some or all of the matches in files.

Considering their simplicity, both these systems do a remarkable job as
data retrieval packages. However, they still have a number of disadvantages,
not the least of which is the necessity to rebuild the entire index each
time a reference is added. Has anyone had any experience with systems with
similar functionality to bib/refer (particularly their usefulness as troff
pre-processors) but which use more general database systems?
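
To make the question concrete, here is a rough Python sketch of the kind of
arrangement being asked about: refer-style entries stored in a general
database (SQLite, via Python's standard sqlite3 module) so that adding a
reference only inserts new rows and no full index rebuild is needed. The
table layout, function names, and the simple "%X value" parsing are
assumptions made for illustration; this is not how bib or refer work.

```python
import sqlite3


def parse_refer(entry):
    """Split one refer-style entry into (field letter, value) pairs."""
    fields = []
    for line in entry.strip().splitlines():
        if line.startswith("%") and len(line) > 3:
            fields.append((line[1], line[3:].strip()))
    return fields


def open_db():
    """Create an in-memory database with a field table and a keyword table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE field (ref_id INTEGER, tag TEXT, value TEXT)")
    db.execute("CREATE TABLE keyword (word TEXT, ref_id INTEGER)")
    db.execute("CREATE INDEX keyword_word ON keyword (word)")
    return db


def add_reference(db, ref_id, entry):
    """Insert one reference; only rows for this reference are written."""
    for tag, value in parse_refer(entry):
        db.execute("INSERT INTO field VALUES (?, ?, ?)", (ref_id, tag, value))
        for word in value.lower().split():
            db.execute(
                "INSERT INTO keyword VALUES (?, ?)",
                (word.strip(".,;:"), ref_id),
            )
    db.commit()


def search(db, word):
    """Return the ids of references containing the given whole word."""
    rows = db.execute(
        "SELECT DISTINCT ref_id FROM keyword WHERE word = ? ORDER BY ref_id",
        (word.lower(),),
    )
    return [row[0] for row in rows]


db = open_db()
add_reference(db, 1, "%A Ian Sommerville\n%A Murray Wood\n"
                     "%T A Knowledge-Based Software Components Catalogue\n"
                     "%R CS/ST/5/86")
# A later addition touches only its own rows; earlier entries stay searchable.
add_reference(db, 2, "%A Edward A. Fox\n"
                     "%T Expert Retrieval for Computer Message Systems\n"
                     "%R TR 86-13")
```

Whether such a back end can also serve as a troff pre-processor, as bib and
refer do, is exactly the open part of the question; the sketch covers only
storage and lookup.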

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
John Shepherd
Department of Computer Science,
University of Melbourne,                CSNET: jas%mulga.oz@australia
Parkville, 3052,                        ARPA: jas%mulga.oz@seismo.css.gov
AUSTRALIA                               UUCP: ...!munnari!mulga!jas

------------------------------
     
Date: Fri Nov 28 12:48 EST 1986
From: fox
To: seismo!ihnp4!hoqam!wbf
Subject: reusability

Bill:

  There is an interesting technical report about software reusability
and IR that you might want to obtain.
	A Knowledge-Based Software Components Catalogue
	Ian Sommerville and Murray Wood
	Research report CS/ST/5/86
	Software Technology Research Group
	Department of Computer Science
	University of Strathclyde
	Glasgow, Scotland
	Tel: 041-552 4400

Abstract:
   There is currently a growing interest in the reuse of previously
designed, coded, tested, and documented software, primarily for
reasons of economy and reliability.  One of the major problems in
moving to a paradigm based on reuse rather than re-invention is the
storage and retrieval of software components.
   In this paper we argue that conventional keyword based
techniques are insufficient as a method of describing software
components for storage and retrieval in a software components
catalogue.  Our approach to the problem of software component
description is based on an attempt to identify the basic concepts of
the software component domain and the relationships between those
concepts.  These concepts and their relationships can be represented
by what we have termed software function frames.  We describe a
prototype implementation of a cataloging system based on these ideas.

------------------------------
     
Date: Sat Nov 29 07:34 EST 1986
From: fox 
Subject: IR Related Tech Reports for 1986, Virginia Tech Dept of Comp. Sci.

Copies of the following may be ordered by sending a request to
	emtront%vt@csnet-relay.arpa
	emtront@vtcs1.bitnet
	Elizabeth Tront, Dept. of Computer Science, Virginia Tech
	Blacksburg VA 24061


A Comparison of Two Methods for Soft Boolean Operator
Interpretation in Information Retrieval
	Edward A. Fox, Sharat Sharan
	TR 86-1
ABSTRACT
	Information retrieval systems generally are given Boolean logic
queries by users or search intermediaries, in order that an efficient and
effective search for relevant documents can be automatically carried out. 
Previous work has shown that an extended interpretation of Boolean queries
can dramatically improve search effectiveness.  Experimental evidence is
given on the relative performance of the p-norm method and a
parameterized fuzzy-logic approach suggested by Paice. Regression analysis
supports expected results of parameter settings and gives further insight into
why the p-norm scheme is superior.
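
For readers unfamiliar with the p-norm method named above, the following
Python sketch gives the unweighted-query form as it is commonly stated in
the extended Boolean retrieval literature. It is not taken from TR 86-1,
the function names are illustrative, and query-term weights and the Paice
scheme are left out.

```python
def pnorm_or(weights, p):
    """p-norm similarity of a document to (t1 OR t2 OR ...).

    weights are the document's term weights in [0, 1]; p >= 1.
    """
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights, p):
    """p-norm similarity of a document to (t1 AND t2 AND ...)."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


# A document that matches one query term weakly and the other strongly.
doc_weights = [0.2, 0.8]

soft_or = pnorm_or(doc_weights, 1)       # p = 1: plain average of the weights
soft_and = pnorm_and(doc_weights, 1)     # p = 1: the same average
strict_or = pnorm_or(doc_weights, 100)   # large p: approaches max(weights)
strict_and = pnorm_and(doc_weights, 100) # large p: approaches min(weights)
```

At p = 1 the AND and OR operators are indistinguishable, and as p grows
they move toward the strict fuzzy-set minimum and maximum, which is the
sense in which p softens or hardens the Boolean interpretation.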


A Knowledge-Based System for Composite Document Analysis and Retrieval:
Design Issues in the CODER Project
	Edward A. Fox, Robert K. France
	TR 86-6
ABSTRACT
	The CODER (COmposite Document Expert/Extended/Effective
Retrieval) Project aims at applying a variety of methods developed in the
realm of artificial intelligence to improve the performance of information
retrieval systems. Logic programming, expert systems, blackboards, user
models, natural language processing, and knowledge representation will be
applied to handle a collection of more than three years of issues of the AIList
ARPANET Digest. This paper gives background, describes related work,
explains the design principles and architecture, and closes with future plans.


Architecture of an Object-Oriented Expert System for 
Composite Document Analysis, Representation, and Retrieval
	Edward A. Fox, Robert K. France
	TR 86-10
ABSTRACT
	The CODER project is a multi-year effort to investigate how best to
apply artificial intelligence methods to increase the effectiveness of
information retrieval systems. The use of individually tailored specialist
experts coupled with standardized blackboard modules for communication
and control, and external knowledge bases for maintenance of factual world
knowledge, allows for quick prototyping and flexibility under change.  The
system is structured as a set of communicating modules, designed under an
object-oriented paradigm, using TCP/IP, UNIX, Mu-Prolog, and C.


Expert Retrieval for Computer Message Systems
	Edward A. Fox
	TR 86-13
ABSTRACT
	This paper describes how information storage and retrieval and
artificial intelligence methods can be integrated with modern computers and
networks to provide access for broad classes of users to archives of electronic
mail, digests, and bulletin boards.  A status report is given on the COmposite
Document Expert/extended/effective Retrieval project, designed to employ
communities of experts, operating on multiple communicating computers, for
free text analysis, indexing, and retrieval.  Details of the document-type 
expert are included to illustrate the approach.


A Call for Integrating Advanced Information Retrieval Models with  
CD-ROM / Microcomputer Systems
	Edward A. Fox
	TR 86-14
ABSTRACT
	Recent advances in computer hardware and storage devices will allow
inexpensive personal systems to be used by individuals to rapidly access vast
collections of text.  Research into database management, artificial
intelligence, and information retrieval can be applied to develop advanced
retrieval systems.  Retrieval models based on browsing, extended Boolean,
vector, probabilistic, and artificial intelligence approaches have all been
advanced as more effective for searchers than conventional methods.  The
CODER project aims to integrate these techniques.  Ultimately it is hoped
that CD-ROM based information retrieval systems will be released with
many of the capabilities mentioned.


Building the CODER Lexicon: The Collins English Dictionary and 
Its Adverb Definitions
	Edward A. Fox, Robert C. Wohlwend, Phyllis R. Sheldon,
	Qi Fan Chen, and Robert K. France
	TR 86-23
ABSTRACT
	In order to support some of the processing desired in the CODER
(COmposite Document Expert/extended/effective Retrieval) project, and to
allow experimentation in information retrieval and natural language
processing, a lexicon was constructed from the machine readable Collins
Dictionary of the English Language.  Characteristics of the dictionary,
conversion from typesetter form to Prolog relations, and comparisons of the
result with an earlier effort for Webster's Seventh New Collegiate
Dictionary are discussed.  Finally, a grammar for adverb definitions is
presented, together with a description of defining formulae that usually
indicate the type of the adverb.


Development of the CODER System: A Test-bed for Artificial
Intelligence Methods in Information Retrieval
	Edward A. Fox
	TR 86-40
ABSTRACT
	The CODER (COmposite Document Expert/extended/effective Retrieval)
system is a test-bed for investigating the application of artificial
intelligence methods to increase the effectiveness of information
retrieval systems.  Particular attention is being given to analysis
and representation of heterogeneous documents, such as electronic mail
digests or messages, which vary widely in style, length, topic, and
structure.  Since handling passages of various types in these
collections is difficult even for experimental systems like SMART, it
is necessary to turn to other techniques being explored by information
retrieval and artificial intelligence researchers.  The CODER system
architecture involves communities of experts around active
blackboards, accessing knowledge bases that describe users, documents,
or lexical items of various types.  Most of the lexical knowledge base
construction work is now complete, and experts for search and
temporal reasoning can perform a variety of processing tasks.  User
information and queries are being gathered, and the first prototype is
nearly complete.  It appears that a number of artificial intelligence
techniques are needed to best handle such common, but complex,
document analysis and retrieval tasks.

------------------------------
     
Date: Sat, 15 Nov 86 01:28:42 est
From: EMMA@CSLI.STANFORD.EDU
Subject: CSLI Calendar, November 13, No. 7 [Extract - Ed]

                           NEXT WEEK'S SEMINAR
        Quantified and Referring Noun Phrases, Pronouns Anaphora
                     Mark Gawron and Stanley Peters
                        November 13 and 20, 1986

   A variety of interactions have been noted between scope ambiguities
   of quantified noun phrases, the possibility of interpreting pronouns
   as anaphoric, and the interpretation of elliptical verb phrases.
   Consider, for example, the following contrast, first noted in Ivan
   Sag's 1976 dissertation.
      (1) John read every book before Mary did.
      (2) John read every book before Mary read it.
   The second sentence is interpretable either to mean each book was read
   by John before Mary, or instead that every book was read by John before
   Mary read any.  The first sentence has only the former interpretation.
       The seminar will describe developments in situation theory
   pertinent to the semantics of various quantifier phrases in English,
   as well as of `referring' noun phrases including pronouns, and of
   anaphoric uses of pronouns and elliptical verb phrases.  We aim to
   show how the theory of situations and situation semantics sheds light
   on a variety of complex interactions such as those illustrated above.

   (This seminar is a continuation of the seminar held on November 13.)

------------------------------
     
END OF IRList Digest
********************