IRList Digest           Tuesday, 25 August 1987      Volume 3 : Issue 32

Today's Topics:
   Announcement - Abstracts from next ACM SIGIR Forum (part 4 of 4)

News addresses are
   ARPANET: fox%vtopus.cs.vt.edu@relay.cs.net
   BITNET:  foxea@vtvax3.bitnet
   CSNET:   fox@vt
   UUCPNET: fox@vtopus.uucp

----------------------------------------------------------------------

Date: Mon, 10 Aug 87 15:17:43 CDT
From: nancy@usl-vb.usl.edu (Nancy )
Subject: Abstracts from next ACM SIGIR Forum - sent by Raghavan

ABSTRACTS (part 4 of 4)

30. ENHANCEMENT OF TEXT REPRESENTATIONS USING RELATED DOCUMENT TITLES

G. Salton
Department of Computer Science
Cornell University
Ithaca, NY 14853
and
Y. Zhang
Institute of Computer Technology
China Academy of Railway Sciences
Beijing, China

Various attempts have been made over the years to construct enhanced
document representations by using thesauruses of related terms, term
association maps, or knowledge frameworks that can be used to extract
appropriate terms and concepts. None of the proposed methods for the
improvement of document representation has proved to be generally
useful when applied to a variety of different retrieval environments.
Some recent work by Kwok suggests that document indexing may be
enhanced by using title words taken from bibliographically related
items. An evaluation of the process shows that many useful content
words can be extracted from related document titles, as well as many
terms of doubtful value. Overall, the procedure is not sufficiently
reliable to warrant incorporation into automatic retrieval systems.

(INFORMATION PROCESSING & MANAGEMENT, Vol. 22, No. 5, pp. 385-394, 1986)

31. A DISCIPLINE-SPECIFIC JOURNAL SELECTION ALGORITHM

Chunpei He
Institute of Information of Agricultural Science and Technology
Chinese Academy of Agricultural Sciences
Beijing, People's Republic of China
and
Miranda Lee Pao
School of Library Science
The University of Michigan
Ann Arbor, Michigan 48109

An experiment was conducted to demonstrate the validity of a journal
selection and ranking algorithm designed for any discipline. The
characteristics of the journal generation procedure incorporate both
cited and citing journals so that basic scientific research journals
contributing to the research foundation of the discipline, as well as
journals in the discipline, might be identified. A Discipline Influence
Score was proposed as a journal weight which could reflect the relative
citation influence of each journal to the discipline under
consideration. Two evaluation studies showed that this method produced
many journals which were perceived as frequently used journals by a
group of American and Chinese professionals in veterinary medicine.
Journals with high Discipline Influence Scores were also selected by
experts in their compilations of basic recommended lists in this
discipline. In particular, the easy implementation of this journal
selection algorithm appears to be of practical use to resource-poor
libraries.

(INFORMATION PROCESSING & MANAGEMENT, Vol. 22, No. 5, pp. 405-416, 1986)

32. EMPIRICAL VALIDATION OF LOTKA'S LAW

Paul Travis Nicholls
School of Library and Information Science
University of Western Ontario
London, Canada N6G 1H1

Two modifications to the Pao procedure for testing Lotka's law are
proposed and applied to 15 samples drawn from the humanities, social
sciences, and sciences.

(INFORMATION PROCESSING & MANAGEMENT, Vol. 22, No. 5, pp. 417-419, 1986)

33. ON RELEVANCE WEIGHT ESTIMATION AND QUERY EXPANSION

S. E. Robertson
Department of Information Science
The City University
London, EC1V 0HB

A Bayesian argument is used to suggest modifications to the
Robertson/Sparck Jones relevance weighting formula, to accommodate the
addition to the query of terms taken from the relevant documents
identified during the search.

(JOURNAL OF DOCUMENTATION, Vol. 42, No. 3, pp. 182-188, 1986)

34. SUBJECT ACCESS IN ONLINE CATALOGS: A DESIGN MODEL

Marcia J. Bates
Graduate School of Library and Information Science
University of California at Los Angeles
Los Angeles, CA 90024

A model based on strikingly different philosophical assumptions from
those currently popular is proposed for the design of online subject
catalog access. Three design principles are presented and discussed:
uncertainty (subject indexing is indeterminate and probabilistic beyond
a certain point), variety (by Ashby's law of requisite variety, variety
of searcher query must equal variety of document indexing), and
complexity (the search process, particularly during the entry and
orientation phases, is subtler and more complex, on several grounds,
than current models assume). Design features presented are an access
phase, including entry and orientation, a hunting phase, and a
selection phase. An end-user thesaurus and a front-end system mind are
presented as examples of online catalog system components to improve
searcher success during entry and orientation. The proposed model is
``wrapped around'' existing Library of Congress subject-heading
indexing in such a way as to enhance access greatly without requiring
reindexing. It is argued that both for cost reasons and in principle
this is a superior approach to other design philosophies.

(JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, Vol. 37,
No. 6, pp. 357-376, 1986)

35. WHY ARE ONLINE CATALOGS HARD TO USE? LESSONS LEARNED FROM
    INFORMATION-RETRIEVAL STUDIES

Christine L. Borgman
Graduate School of Library and Information Science
University of California
Los Angeles, CA 90024

Research in user behavior on online catalogs is in its early stages,
but preliminary findings suggest that users encounter many of the same
problems identified in behavioral studies of other types of
bibliographic retrieval systems. Much can be learned from comparing the
results of user behavior studies on these two types of systems.
Research on user problems with both the mechanical aspects and the
conceptual aspects of system use is reviewed, with the conclusion that
more similarity exists across types of systems in conceptual than in
mechanical problems. Also discussed are potential sources of the
problems, due either to individual characteristics or to system
variables. A series of research questions is proposed and a number of
potential interim solutions are suggested for alleviating some of the
problems encountered by users of information systems.

(JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, Vol. 37,
No. 6, pp. 387-400, 1986)

36. PARALLEL FREE-TEXT SEARCH ON THE CONNECTION MACHINE SYSTEM

Craig Stanfill and Brewster Kahle
Thinking Machines Corporation
245 First Street
Cambridge, MA 02142

A new implementation of free-text search using a new parallel
computer - the Connection Machine - makes possible the application of
exhaustive methods not previously feasible for large databases.

(COMMUNICATIONS OF THE ACM, Vol. 29, No. 12, pp. 1229-1239, 1986)

37. A PROBLEM IN INFORMATION RETRIEVAL WITH FUZZY SETS

Duncan A. Buell
Department of Computer Science
Louisiana State University
Baton Rouge, LA 70803

In this note, attempts to invent a system of weighted fuzzy queries, in
which the weights would correspond to the relative importance of each
term to the query as a whole, are evaluated.

(JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, Vol. 36,
No. 6, pp. 398-401, 1985)

38. ANOTHER LOOK AT AUTOMATIC TEXT-RETRIEVAL SYSTEMS

Gerard Salton
Department of Computer Science
Cornell University
Ithaca, NY 14853

An automatic text-retrieval system is designed to search a file of
natural-language documents and retrieve certain stored items in
response to queries submitted by a user. Evidence from available
studies comparing manual and automatic text-retrieval systems does not
support the conclusion that intellectual content analysis produces
better results than comparable automatic systems.

(COMMUNICATIONS OF THE ACM, Vol. 29, No. 7, pp. 648-656, 1986)

39. SELF-ASSESSMENT PROCEDURE XV - FILE PROCESSING

Martin K. Solomon and Riva Wenig Bickel
Department of Computer and Information Systems
Florida Atlantic University
Boca Raton, FL 33431

This is the fifteenth self-assessment procedure. All the previous ones
are listed on the next page. The first thirteen are available from the
ACM* in a single loose-leaf binder to which later procedures may be
added. This procedure deals with various aspects of file processing,
including low level considerations and the design, use, and performance
of access methods and file structures. A subtheme of the procedure,
chosen for its illustration of a number of important points, is the
contrast between file processing in the ``large scale IBM-like
environment'' and in the ``Unix-like environment.''

* Order Number 203804.

(COMMUNICATIONS OF THE ACM, Vol. 29, No. 8, pp. 745-750, 1986)

40. MAN-MACHINE INTERFACES: CAN THEY GUESS WHAT YOU WANT?

Robert F. Simmons
University of Texas at Austin
Computer Science Department
Austin, TX 78712

Help systems and menus make operating systems more friendly and useful,
but can natural language interfaces guess what the user means but does
not specify?

(IEEE EXPERT, Vol. 1, No. 1, pp. 86-94, 1986)

------------------------------

END OF IRList Digest
********************