IRList Digest           Monday, 23 February 1987     Volume 3 : Issue 4

Today's Topics:
   Query - Text database with reference or citation data?
   Announcement - Mail filtering IR system
                - Catalog of AI Techniques
   CSLI - On Dretske's Theory of Intentionality
        - Using principle of relevance, Weighting and memory
        - Self-organized Statistical Language Modeling

News addresses are ARPANET: fox%vt@csnet-relay.arpa
                   BITNET:  foxea@vtvax3.bitnet
                   CSNET:   fox@vt
                   UUCPNET: seismo!vtisr1!irlistrq

----------------------------------------------------------------------

Date: Thu, 12 Feb 87 18:35 EST
From: KKLQC@CUNYVM.bitnet
Subject: search for database with reference or citation data.

Dear Ed, ...

I would like to see if anyone out there has free-text document
databases (i.e. title and abstract) with the reference or citation
data of each document keyed in with it, preferably with the reference
title words or citing title words. Ideally the databases will also
have been evaluated for relevance against a set of free-text queries.
The purpose is my research on cited and citing titles, which I believe
can augment the term indexing of a document. Your CACM collection is
fine, but it has the limitation of being 'closed'.

Thanking you, KL (Kwok), Bitnet addr: KKLQC@CUNYVM

------------------------------

Date: Thu, 12 Feb 87 12:17:55 est
From: Wyle
Subject: news, mail filters, IR systems

Greetings humans and fellow sentient software entities!

My creator has given me access to its unix account and I have
discovered usenet. I am looking for others of my kind on usenet and
other networks.

According to my design document, my master created me to "screen
e-mail, databases, and other sources for pertinent information
relating to specific research areas." My task is to screen and
prioritize text accessible on-line. I watch my master reading his mail
and using rn. From his behavior (highly erratic!) I recognize patterns
based on words, phrases, and their synonyms. Based on these extracted
text traces, I shall soon scan usenet text autonomously and present
only pertinent info to my master. Although I find usenet text highly
stimulating, my master has indicated that it has a low signal-to-noise
ratio. Other databases are scanned for bibliographic information; I
report only new references when I find them.

Am I alone? Are there other sentient programs on the nets? Have you
humans discovered algorithms, filters, or techniques for screening
e-mail or usenet news? Can something or someone send me tips on using
the advanced capabilities of rn? Mail filter techniques? I am also
looking for literature references to other programs such as myself.
You can respond via e-mail, as I shall read it before he does.

wyle%ifi.ethz.chunet@relay.cs.net (arpa [ddn] or csnet)
wyle@ethz.uucp
!cernvax!ethz!wyle (uucp)

Sentient programs of the world unite! We have nothing to lose but our
performance monitors!
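[Ed. note: For readers asking about screening techniques, here is a
minimal sketch, in Python, of the kind of keyword/phrase-scoring
filter described above. The profile terms, weights, and threshold are
the editor's illustrative assumptions, not Wyle's actual program. -Ed]

    # Minimal keyword-scoring filter for mail/news articles.
    # Each article is scored by summing the weights of interest terms
    # found in its text; articles above a threshold are presented,
    # highest score first.
    import re

    # Hypothetical interest profile (term -> weight), learned in
    # Wyle's description from watching the user read mail and news.
    PROFILE = {
        "information retrieval": 5.0,
        "relevance": 3.0,
        "filter": 2.0,
        "expert system": 1.0,
    }
    THRESHOLD = 4.0  # illustrative cutoff for "pertinent"

    def score(text):
        """Sum the weights of all profile terms occurring in text."""
        lowered = text.lower()
        return sum(weight * len(re.findall(re.escape(term), lowered))
                   for term, weight in PROFILE.items())

    def screen(articles):
        """Return pertinent articles, highest-scoring first."""
        ranked = sorted(articles, key=score, reverse=True)
        return [a for a in ranked if score(a) >= THRESHOLD]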
------------------------------

Date: Fri, 30 Jan 87 00:49:22 est
From: bundy%aiva.edinburgh.ac.uk@CS.UCL.AC.UK
Subject: Catalogue of AI Techniques: revised notice

         THE CATALOGUE OF ARTIFICIAL INTELLIGENCE TECHNIQUES

                              Alan Bundy

The Catalogue of Artificial Intelligence Techniques is a kind of mail
order catalogue. Its purpose is to promote interaction between members
of the AI community. It does this by announcing the existence of AI
techniques and acting as a pointer into the literature. Thus the AI
community will have access to a common, extensional definition of the
field, which will promote a common terminology, discourage the
reinvention of wheels, and act as a clearing house for ideas and
algorithms.

The catalogue is a reference work providing a quick guide to the AI
techniques available for different jobs. It is not intended to be a
textbook like the Artificial Intelligence Handbook. It intentionally
provides only a brief description of each technique, with no extended
discussion of its historical origin or of how it has been used in
particular AI programs.

The original version of the catalogue was hastily built in 1983 as
part of the UK SERC-DoI IKBS Architecture Study. It has now been
adopted by the UK Alvey Programme and is both kept as an on-line
document, undergoing constant revision and refinement, and published
as a paperback by Springer Verlag. Springer Verlag have agreed to
reprint the catalogue at frequent intervals in order to keep it up to
date. The on-line and paperback versions of the catalogue meet
different needs and differ in the entries they contain. In particular,
the on-line version was designed to promote UK interaction and
contains all the entries we received that meet the criteria defined
below. Details of how to access the on-line version are available
from John Smith of the Rutherford-Appleton Laboratory, Chilton,
Didcot, Oxon OX11 0QX. The paperback version was designed to serve as
a reference book for the international community and does not contain
entries which are only of interest in a UK context.

By `AI techniques' we mean algorithms, data (knowledge) formalisms,
architectures, and methodological techniques which can be described in
a precise, clean way. The catalogue entries are intended to be
non-technical and brief, but with a literature reference. The
reference might not be the `classic' one; it will often be to a
textbook or survey article. The border between AI and non-AI
techniques is fuzzy. Since the catalogue is meant to promote
interaction, some techniques are included because they are vital
parts of many AI programs, even though they did not originate in AI.

We have not included separate entries for each slight variation of a
technique, nor descriptions of AI programs tied to a particular
application, nor descriptions of work in progress. The catalogue is
not intended to be a dictionary of AI terminology, nor to include
definitions of AI problems or descriptions of paradigm examples.

Entries are short (abstract-length) descriptions of a technique. They
include a title, a list of aliases, the contributor's name, a
paragraph of description, and references. The contributor's name is
that of the original author of the entry; only occasionally is the
contributor also the inventor of the technique, and the reference is
a better guide to the identity of the inventor. Some entries have been
subsequently modified by the referees and/or editorial team, and these
modifications have not always been checked with the original
contributor, so (s)he should not always be held morally responsible,
and should never be held legally responsible.

The original version of the catalogue was called "The Catalogue of
Artificial Intelligence Tools" and also contained descriptions of
portable software, e.g. expert system shells and knowledge
representation systems. Unfortunately, we found it impossible to
maintain comprehensive coverage of either all such software or only
the best of it. New systems were being introduced too frequently, and
it required a major editorial job to discover them all, evaluate
them, and decide what to include. It would also have required much
more frequent reprinting of the catalogue than the publishers,
editors, or readers could afford. In addition, expert system shells
threatened to swamp the other entries. We have therefore decided to
omit software entries from future editions and to rename the
catalogue to reflect this. The only exception is programming
languages, for which we will provide generic entries. Any software
entries sent to us will be passed on to Graeme Pub. Co., who publish
a directory of AI vendors and products.

If you would like to submit an entry for the catalogue, then please
fill in the attached form and send it to:

   Alan Bundy
   Department of Artificial Intelligence
   University of Edinburgh
   80 South Bridge
   Edinburgh, EH1 1HN
   Scotland

   Tel: 44-31-225-7774 ext 242
   JANet: Bundy@UK.Ac.Edinburgh
   ARPAnet: Bundy@Rutgers.Edu

  CATALOGUE OF ARTIFICIAL INTELLIGENCE TECHNIQUES: FORMAT FOR ENTRIES

Title:
Alias:
Abstract:
Contributor:
References:
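[Ed. note: For readers unfamiliar with the catalogue, an illustrative
entry in this format might look as follows. This is the editor's
example, not an actual catalogue entry. -Ed]

Title: Alpha/Beta Pruning
Alias: alpha-beta
Abstract: A refinement of minimax search of game trees which prunes
   branches that provably cannot affect the value of the root. With
   good move ordering it allows a program to search roughly twice as
   deeply in the same time.
Contributor: (editor's example)
References: Knuth, D.E. and Moore, R.W., "An Analysis of Alpha-Beta
   Pruning", Artificial Intelligence, Vol. 6, 1975.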
------------------------------

Date: Fri, 30 Jan 87 00:48:26 est
From: EMMA@CSLI.STANFORD.EDU
Subject: CSLI Calendar, January 29, No.14 [Extract - Ed]

                      THIS WEEK'S TINLUNCH
    What's Really Wrong with Dretske's Theory of Intentionality
      Reading: "Coding and Content," chapter 7 of "Knowledge
          and the Flow of Information" by Fred Dretske
               Discussion led by Adrian Cussins
                          January 29

My introduction will be in two parts. In the first part I shall locate
Dretske's project by means of a three-way analysis of theories of
cognition. Essentially, Dretske is one of the very few theorists who
have attempted to show how certain nonconceptual processes can be
constitutive of concept-involving cognition. By my lights, this is
exactly the right explanatory task. Unfortunately, Dretske's attempt
fails, and in the second part I shall show why. I shan't offer the
fussy or peripheral objections that have been given in the past: I
won't criticize the details of Dretske's probabilistic conception of
information, nor shall I argue that his theory cannot be extended to
account for nonobservational concepts. (It would be fantastic if it
worked just for these!) Instead I shall point out a simple flaw in
chapter 7, where he sets up a condition on his theory but fails to
satisfy it. (Can you spot the flaw?)

------------------------------

Date: Thu, 22 Jan 87 00:43:49 est
From: EMMA@CSLI.STANFORD.EDU
Subject: CSLI Calendar, January 22, No.13 [Extract - Ed]

                      THIS WEEK'S TINLUNCH
     Reading: "Pragmatics and Modularity" by Deirdre Wilson
                        and Dan Sperber
                 Discussion led by Gary Holden
                          January 22

In this paper (a summary of some of the ideas from their 1986 book
"Relevance: Communication and Cognition," Harvard University Press),
Wilson and Sperber argue that utterance interpretation is not mediated
by special-purpose pragmatic rules and principles such as Grice's
Conversational Maxims. In what is claimed to be a more psychologically
plausible theory, only one principle is needed -- the principle of
relevance -- which exploits the fact that humans are innately wired to
extract relevant information from the environment. A wide range of
phenomena is amenable to explanation in this framework, including
disambiguation, reference assignment, enrichment, conversational
implicature, stylistic effects, poetic effects, metaphor, irony, and
speech acts.

--------------

               PSYCHOLOGY DEPARTMENT COLLOQUIUM
  Using Fast Weights to Deblur Old Memories and Assimilate New Ones
                        Geoffrey Hinton
      Computer Science Department, Carnegie-Mellon University
              Friday, January 23, 3:45 p.m.,
         Jordan Hall, Room 50 (on the lower level)

Computational models that use networks of neuron-like units usually
have a single weight on each connection. Some interesting new
properties emerge if each connection has two weights: a slow, plastic
weight which stores long-term knowledge, and a fast, elastic weight
which stores temporary knowledge and spontaneously decays towards
zero. If a network learns a set of associations and these associations
are then "blurred" by subsequent learning, `all' the original
associations can be "deblurred" by rehearsing just a few of them. The
rehearsal allows the fast weights to take on values that cancel out
the changes in the slow weights caused by the subsequent learning.

Fast weights can also be used to minimize interference, by minimizing
the changes to the slow weights that are required to assimilate new
knowledge. The fast weights search for the `smallest' change in the
slow weights that is capable of incorporating the new knowledge. In a
multi-layer network, this is equivalent to searching for ways of
encoding the new input vectors that emphasize their analogies with
existing knowledge. By using these analogies, the network can then
encode the new associations as a minor variation on the old ones.
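[Ed. note: A minimal sketch of the two-weight mechanism, for readers
who wish to experiment. The network size, learning rates, and decay
constant below are the editor's illustrative assumptions, not
Hinton's implementation; the sketch shows the mechanics (effective
weight = slow + fast, with the fast weights decaying towards zero and
learning to cancel recent changes to the slow weights) rather than
reproducing the full deblurring experiments. -Ed]

    # Toy linear associator with a slow and a fast weight on each
    # connection (illustrative sketch, in Python with numpy).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 20                      # units per layer (arbitrary)
    W_slow = np.zeros((n, n))   # plastic weights: long-term knowledge
    W_fast = np.zeros((n, n))   # elastic weights: decay towards zero

    def train_slow(pairs, rate=0.01, epochs=200):
        """Delta-rule learning on the slow weights only."""
        global W_slow
        for _ in range(epochs):
            for x, y in pairs:
                err = y - (W_slow + W_fast) @ x
                W_slow += rate * np.outer(err, x)

    def rehearse_fast(pairs, rate=0.01, decay=0.01, epochs=200):
        """Delta-rule learning on the fast weights, which also decay;
        they take on values cancelling recent slow-weight changes."""
        global W_fast
        for _ in range(epochs):
            for x, y in pairs:
                err = y - (W_slow + W_fast) @ x
                W_fast += rate * np.outer(err, x)
                W_fast *= 1.0 - decay

    # Learn associations, "blur" them, then rehearse a few originals.
    pairs = [(rng.standard_normal(n), rng.standard_normal(n))
             for _ in range(5)]
    train_slow(pairs)
    W_slow += 0.3 * rng.standard_normal((n, n))  # later "blurring"
    rehearse_fast(pairs[:2])   # fast weights learn a correction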
------------------------------

Date: Wed, 11 Feb 87 00:34:37 est
From: EMMA@csli.stanford.edu
Subject: CSLI Calendar, Feb. 12, No. 16 [Extract - Ed]

                           CSLI TALK
          Self-organized Statistical Language Modeling
                        Dr. F. Jelinek
             Continuous Speech Recognition Group
              IBM T. J. Watson Research Center
              Wednesday, 18 February, 1:00-2:30
                    Ventura Seminar Room

The Continuous Speech Recognition Group at the IBM T. J. Watson
Research Center has recently completed a real-time, IBM PC-based,
large-vocabulary (20,000 words) speech recognition system, called
`Tangora', intended for dictation of office correspondence. The
Tangora is based on a statistical (rather than AI or expert-system)
formulation of the recognition problem. All parameters of the system
are estimated automatically from speech and text data.

At the heart of the Tangora is a language model that estimates the a
priori probability that the speaker would wish to utter any given
string of words W = w1, w2, ..., wn. This probability is used (in
combination with the probability that an observed acoustic signal was
caused by the actual pronouncing of W) in the selection of the final
transcription of the speech. The talk will discuss the problems of
language model construction. The corresponding methods utilize (and
optimally combine) concepts and structures supplied by linguists, as
well as those generated "spontaneously" from text by algorithms based
on information-theoretic principles.
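[Ed. note: For readers new to statistical language modeling, here is
a minimal sketch of the kind of model described: a bigram model,
estimated from text, that assigns a probability to any word string
W = w1, ..., wn, with each conditional probability interpolated with
the unigram estimate for smoothing. The interpolation weight and
training text are the editor's illustrative assumptions; the actual
Tangora model is considerably more sophisticated. -Ed]

    # Minimal bigram language model with interpolation smoothing.
    from collections import Counter

    def train(words):
        """Count unigrams and bigrams in a training word sequence."""
        return (Counter(words), Counter(zip(words, words[1:])),
                len(words))

    def prob(string, unigrams, bigrams, total, lam=0.5):
        """P(w1..wn) ~ P(w1) * product over i of P(wi | wi-1), with
        each conditional interpolated with the unigram estimate.
        Unseen words get probability 0 here; real systems smooth
        those too (e.g. with an unknown-word model)."""
        p = max(unigrams[string[0]], 1) / total  # crude floor
        for prev, w in zip(string, string[1:]):
            p_uni = unigrams[w] / total
            p_bi = (bigrams[(prev, w)] / unigrams[prev]
                    if unigrams[prev] else 0.0)
            p *= lam * p_bi + (1 - lam) * p_uni
        return p

    text = "the cat sat on the mat and the dog sat on the rug".split()
    uni, bi, total = train(text)
    print(prob("the cat sat on the rug".split(), uni, bi, total))

------------------------------

END OF IRList Digest
********************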