Date: Sun, 7 Sep 86 16:00:44 edt
From: vtisr1!irlistrq
To: fox
Subject: IRList Digest V2 #39
Status: R

IRList Digest          Sunday, 7 September 1986      Volume 2 : Issue 39

Today's Topics:
   Call for Papers - Call for contributions to ACM SIGIR Forum
   Abstracts - Appearing in latest issue of ACM SIGIR Forum, Part 2

News addresses are
   ARPANET: fox%vt@csnet-relay.arpa
   BITNET: foxea@vtvax3.bitnet
   CSNET: fox@vt
   UUCPNET: seismo!vtisr1!irlistrq

----------------------------------------------------------------------

Date: Sun, 7 Sep 86 11:48:53 edt
From: fox (Ed Fox)
Subject: call for papers for ACM SIGIR Forum, fall 1986

It is time to gather short articles, book reviews, abstracts, announcements, etc. for the next Forum. I will be putting out this issue, so send electronic versions (unless you say otherwise, they may appear in IRList too) or paper copies (in camera-ready form, single spaced). I look forward to receiving your materials in the next few weeks. Many thanks, Ed Fox (co-editor for Forum).

------------------------------

Date: Wed, 23 Jul 1986 13:06 CST
From: Vijay V. Raghavan
Subject: SIGIR FORUM Abstracts [Part 2 - Ed]

[Note: Members of ACM SIGIR should have received the spring/summer Forum, and can find these on pages 30-31. The rest will appear in machine-readable form in later issues of IRList. - Ed]

ABSTRACTS (Chosen by G. Salton or V. Raghavan from 1984 issues of journals in the retrieval area)

10. TESTING OF A NATURAL LANGUAGE RETRIEVAL SYSTEM FOR A FULL TEXT KNOWLEDGE BASE
    Lionel M. Bernstein and Robert E. Williamson
    Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, MD 20209

"A Navigator of Natural Language Organized Data" (ANNOD) is a retrieval system which combines probabilistic, linguistic, and empirical means to rank individual paragraphs of full text by their similarity to natural language queries posed by users.
ANNOD includes common word deletion, word root isolation, query expansion by a thesaurus, and application of a complex empirical matching (ranking) algorithm. The Hepatitis Knowledge Base, the text of a prototype information system, was the file used for testing ANNOD. Responses to a series of users' unrestricted natural language queries were evaluated by three testers. Information needed to answer 85 to 95% of the queries was located and displayed in the first few selected paragraphs. The system was successful in locating information in both the classified (listed in the Table of Contents) and unclassified portions of the text. Development of this retrieval system resulted from the complementarity of and interaction between computer science and medical domain expert knowledge. Extension of these techniques to larger knowledge bases is needed to clarify their proper role. (JASIS, Vol. 35(4): 235-247; 1984)

11. A COMPARISON OF THE COSINE CORRELATION AND THE MODIFIED PROBABILISTIC MODEL
    W. Bruce Croft
    Computer and Information Science Dept., University of Massachusetts, Amherst, MA 01003

It has been pointed out that the comparison between the performance of the cosine correlation and the modified probabilistic model was incomplete. In particular, the term weights used for the cosine correlation were term frequencies within the document text. Salton has for some time used a term weight known as 'tf.idf' in his retrieval experiments with the cosine correlation. This weight consists of the within-document term frequency (sometimes normalized by the maximum frequency) multiplied by the inverse document frequency weight. Although the inverse document frequency weight can be regarded as a product of the retrieval process, it has also been used as part of the indexing process, in that the weight is assigned to the terms in the document representatives. In this note, we present the results of retrieval experiments with the cosine correlation and the tf.idf weights.
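[As an illustration only (not the authors' implementation), the tf.idf weighting just described can be sketched as below; the tokenization, the max-frequency normalization variant, and the log form of the inverse document frequency are assumptions. - Ed]

```python
import math

def tfidf_vectors(docs):
    """Build tf.idf term-weight vectors for a small document collection.

    tf  = within-document term frequency, normalized by the maximum
          frequency in that document (one assumed variant);
    idf = log(N / n_t), where n_t is the number of documents
          containing term t (an assumed form of the idf weight).
    """
    N = len(docs)
    tokenized = [doc.lower().split() for doc in docs]

    # Document frequency n_t for each term.
    df = {}
    for tokens in tokenized:
        for term in set(tokens):
            df[term] = df.get(term, 0) + 1

    vectors = []
    for tokens in tokenized:
        counts = {}
        for term in tokens:
            counts[term] = counts.get(term, 0) + 1
        max_tf = max(counts.values())
        vectors.append({t: (c / max_tf) * math.log(N / df[t])
                        for t, c in counts.items()})
    return vectors

def cosine(u, v):
    """Cosine correlation between two sparse term-weight vectors."""
    num = sum(w * v.get(t, 0.0) for t, w in u.items())
    den = (math.sqrt(sum(w * w for w in u.values())) *
           math.sqrt(sum(w * w for w in v.values())))
    return num / den if den else 0.0
```

A query would be weighted the same way and compared against each document vector with `cosine`, ranking documents by descending score.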
The comparison of these results to those obtained with the modified probabilistic model leads to some interesting conclusions about the cosine correlation. (Information Technology, Vol. 3, No. 2: 113-115; April 1984)

12. SCIENTIFIC INQUIRY: A MODEL FOR ONLINE SEARCHING
    Stephen P. Harter
    School of Library and Information Science, Indiana University, Bloomington, IN 47405

Scientific inquiry is proposed as a philosophical and behavioral model for online information retrieval. The nature of scientific research and the concepts of variable, hypothesis formulation and testing, operational definition, validity, reliability, assumption, and the cyclical nature of research are established. A case is made for the inevitability of end-user searching. It is argued that the model is of interest not only for its own sake, for the intellectual parallels that can be established between two apparently disparate human activities, but also as a useful framework for discussion and analysis of the online search process from an educational and evaluative viewpoint. (JASIS, Vol. 35(2): 110-117; 1984)

13. A DRILL AND PRACTICE PROGRAM FOR ONLINE RETRIEVAL
    Bert R. Boyce
    School of Library and Information Science, Louisiana State University, LA 70803
    David Martin, Barbara Francis, and Mary Ellen Sievert
    Department of Information Science, University of Missouri at Columbia, 110 Stewart Hall, Columbia, MO 65211

DAPPOR, a drill and practice program for online retrieval, provides reinforcement to students engaged in learning the basic command protocols of the major vendors of bibliographic databases. The DAPPOR evaluation program overcomes the difficult problems of determining the correctness of a user response in a highly flexible environment. The coding of answer definitions and the process of recursive reduction used by the evaluation program are described. (JASIS, Vol. 35(2): 129-134; 1984)

14. TWO PARTITIONING TYPE CLUSTERING ALGORITHMS
    Fazli Can and Esen A.
Ozkarahan
    Arizona State University, Tempe, AZ 85287

In this article, two partitioning type clustering algorithms are presented. Both algorithms use the same method for selecting cluster seeds; however, the assignment of documents to the seeds differs. The first algorithm uses a new concept called the "cover coefficient" and is a single-pass algorithm. The second uses a conventional measure for document assignment to the cluster seeds and is a multipass algorithm. The concept of clustering, a model for seed-oriented partitioning, the new centroid generation approach, and an illustration of both algorithms are also presented in the article. (JASIS, Vol. 35(5): 268-276; 1984)

15. ARTIFICIAL INTELLIGENCE: UNDERLYING ASSUMPTIONS AND BASIC OBJECTIVES
    Nick Cercone
    Computing Science Department, Simon Fraser University, Burnaby, British Columbia, Canada V5A 1S6
    Gordon McCalla
    Department of Computational Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada S7N 0W0

Artificial intelligence (AI) research has recently captured media interest, and it is fast becoming our newest "hot" technology. AI is an interdisciplinary field which derives from a multiplicity of roots. In this article we present our perspectives on the methodological assumptions underlying research efforts in AI. We also discuss the goals (design objectives) of AI across the spectrum of subareas it comprises. We conclude by discussing why there is increased interest in AI and whether current predictions of the future importance of AI are well founded. (JASIS, Vol. 35(5): 280-290; 1984)

16.
NATURAL LANGUAGE PROCESSING
    Ralph Grishman
    Department of Computer Science, New York University, 251 Mercer Street, New York, NY 10012

Natural language processing has two primary roles to play in the storage and retrieval of large bodies of information: providing a friendly, easily learned interface to information retrieval systems, and automatically structuring texts so that their information can be more easily processed and retrieved. This article outlines the organization of a natural language interface for data retrieval (a "question-answering system") and some of the approaches being taken to text structuring. It closes by describing a few of the research issues in computational linguistics and a possibility for using interactive natural language processing for information acquisition. (JASIS, Vol. 35(5): 291-296; 1984)

17. EXPERT SYSTEMS: A TUTORIAL
    N. Shahla Yaghmai
    School of Library and Information Science, University of Wisconsin-Milwaukee, P.O. Box 413, Milwaukee, WI 53102
    Jacqueline A. Maxin
    Computer Services, The H.W. Wilson Company, Bronx, NY 10452

Expert systems are intelligent computer applications that use data, a knowledge base, and a control mechanism to solve problems of sufficient difficulty that significant human expertise is necessary for their solution. Expert systems use artificial intelligence problem-solving and knowledge-representation techniques to combine human expert knowledge about a problem area with human expert methods of conceptualizing and reasoning about that problem area. As a result, it is expected that such systems can reach a level of performance comparable to that of a human expert in a specialized problem area. The high-level knowledge base and associated control mechanism of expert systems are in essence a model of the expertise of the best practitioners of the problem area in question; hence, human users are provided with expert opinions about problems in that area.
Expert systems do not pretend to give final or ultimate conclusions to displace human decision making; they are intended for consulting purposes only. (JASIS, Vol. 35(5): 297-305; 1984)

18. APPROACHES TO MACHINE LEARNING
    Pat Langley
    The Robotics Institute, Carnegie-Mellon University, Pittsburgh, PA 15213
    Jaime G. Carbonell
    Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA 15213

The field of machine learning strives to develop methods and techniques to automate the acquisition of new information, new skills, and new ways of organizing existing information. This article reviews the major approaches to machine learning in symbolic domains, illustrated with occasional paradigmatic examples. (JASIS, Vol. 35(5): 306-316; 1984)

19. ARTIFICIAL INTELLIGENCE: A SELECTED BIBLIOGRAPHY
    Compiled by Linda C. Smith
    Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801

The literature of artificial intelligence (AI) is scattered over many books, journals, conference proceedings, and technical reports. This selected annotated bibliography, arranged by type of material, can serve as an introduction to that literature. (JASIS, Vol. 35(5): 317-319; 1984)

20. AUTOMATIC SEARCH TERM VARIANT GENERATION
    K. Sparck Jones and J. I. Tait
    Computer Laboratory, University of Cambridge

The paper describes research designed to improve automatic pre-coordinate term indexing by applying powerful general-purpose language analysis techniques to identify term sources in requests, and to generate variant expressions of the concepts involved for document text searching. (Journal of Documentation, Vol. 40, No. 1, March 1984, pp. 50-66)

21. HIERARCHIC AGGLOMERATIVE CLUSTERING METHODS FOR AUTOMATIC DOCUMENT CLASSIFICATION
    Alan Griffiths, Lesley A.
Robinson and Peter Willett
    Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, UK

This paper considers the classifications produced by applying the single linkage, complete linkage, group average, and Ward clustering methods to the Keen and Cranfield document test collections. Experiments were carried out to study the structure of the hierarchies produced by the different methods, the extent to which the methods distort the input similarity matrices during the generation of a classification, and the retrieval effectiveness obtainable in cluster-based retrieval. The results suggest that the single linkage method, which has been used extensively in previous work on document clustering, is not the most effective procedure of those tested, although it should be emphasized that the experiments used only small document test collections. (Journal of Documentation, Vol. 40, No. 3, September 1984, pp. 175-205)

22. PROBABILISTIC AUTOMATIC INDEXING BY LEARNING FROM HUMAN INDEXERS
    S. E. Robertson
    Department of Information Science, City University, Northampton Square, London EC1V 0HB
    P. Harding
    Inspec, Station House, Nightingale Road, Hitchin, Hertfordshire SG5 1RJ

A probabilistic model previously used in relevance feedback is adapted for use in automatic indexing of documents (in the sense of imitating human indexers). The model fits with previous work in this area (the 'adhesion coefficient' method), in effect merely suggesting a different way of arriving at the adhesion coefficients. Methods for the application of the model are proposed. The independence assumptions used in the model are interpreted, and the possibility of a dependence model is discussed. (Journal of Documentation, Vol. 40, No. 4, December 1984, pp. 264-270)

------------------------------

END OF IRList Digest
********************