Digital Libraries and Virtual Universities

Edward A. Fox
Department of Computer Science
Virginia Tech
Blacksburg, VA 24061-0106, USA
fox@vt.edu
http://fox.cs.vt.edu

Invited presentation for
"Information research for
designing and planning virtual universities"
Seminar at
Centro Universitario de Investigaciones Bibliotecológicas
Universidad Nacional Autónoma de México
Cd. Universitaria, México, D.F.
(Library and Information Research Center,
National University of Mexico)
August 11-15, 1997

1. Introduction

Universities need libraries. Virtual universities need digital libraries. It follows that, if we are to develop virtual universities successfully, we must understand digital libraries and learn from experience in applying them to improve education.

1.1. Digital Libraries

In 1965, J.C.R. Licklider explored libraries of the future that would help deal with the information explosion through the use of computer and communication technologies [LICK65]. By 1997, digital libraries were becoming a practical reality [LESK97]. As electronic publishing technologies evolve, authors can create works that will go directly into digital libraries, rapidly adding quality works to the ever-growing collection of materials available through the WWW.

In the period 1991-1993, interest grew in the U.S. regarding electronic or digital libraries [FOXE93c]. Early work set the stage for later studies by considering interface issues, economic models, application of protocols like Z39.50 for interoperability and federated searching, and access to published journals [FOXE93a]. Investigators in the public and private sectors began to identify requirements, propose architectures, and develop systems [GLAD94]. By 1994 it became clear that a great deal of research was required for such systems to be intelligent and useful [FOXE94a], and the U.S. led other nations into extensive funding through the NSF/DARPA/NASA Digital Library Initiative (see http://dli.grainger.uiuc.edu/national.htm). Scores of projects and studies proceeded worldwide [FOXE95a].

1.2. Overview of Two Projects

At Virginia Tech, there has been long-standing interest in information retrieval, library automation, multimedia, electronic publishing, and, since 1991, digital libraries. Some of this work is based in the NSF-supported Information Access Laboratory [FOXE96a].

Starting in 1987, Virginia Tech became involved in exploration of electronic theses and dissertations, working with SoftQuad in 1988 to develop a Document Type Definition to facilitate SGML encoding. Pilot efforts continued [DALA93], eventually leading to a large worldwide initiative discussed in Section 2 below. If large numbers of theses and dissertations become accessible electronically, they should play a strong supporting role for distance education, especially at the graduate level, and help extend the possibilities of virtual universities.

In 1991, Virginia Tech received support from NSF to pilot test digital library concepts in the computer science area, with assistance from ACM. By 1997, leading computing associations had begun offering their publications through digital libraries: ACM at http://www.acm.org/dl/ and IEEE CS at http://computer.org/epub/, with many others having similar plans. In the realm of technical reports, computer science efforts [FOXE95b] were harmonized in 1995, leading to the widely used Networked Computer Science Technical Report Library at http://www.ncstrl.org. In 1993, additional NSF support was provided to Virginia Tech to apply digital library technology to improve computer science education; this is discussed in Section 3 below [FOXE95c, FOXE96b].

2. Networked Digital Library of Theses and Dissertations

In 1996, development began on the Networked Digital Library of Theses and Dissertations (NDLTD - http://www.ndltd.org), funded by Southeastern Universities Research Association, US Dept. of Education, IBM, Adobe, Microsoft, and others. In addition to the extensive online documentation, there is an article in D-Lib Magazine [FOXE96c] at http://www.dlib.org/dlib/september96/theses/09fox.html, as well as a videotape, CD-ROM, and other supplemental information.

Key efforts have focused on training students, working out policies and approval mechanisms so that student and faculty interests are represented, working with publishers and publisher associations to harmonize the evolution of complementary genres for scholarly publication, automating the submission and workflow processes for handling theses and dissertations, and implementing solutions for federated digital library access.

From these efforts we have learned:

  1. universities can form, and are forming, the foundation of a distributed service that will aid (graduate and other) education;
  2. universities have significant infrastructure to make this possible, though there will continue to be need to maintain and expand that infrastructure;
  3. making this work even more effectively requires solving problems that might be classified as relating to building virtual digital libraries.

Since there are online talks about the project at http://www.ndltd.org/talks/index.htm, and since it is clear that having hundreds of thousands of electronic theses and dissertations will be of great value for virtual universities, this paper shifts to the second project area, about computer science.

3. Interactive Learning with a Digital Library in Computer Science

In 1993, principal investigators N. Dwight Barnette, Edward A. Fox (director), H. Rex Hartson, J.A.N. Lee, and Clifford Shaffer began to transform computer science education at Virginia Tech by applying digital library and other related technologies, building on a rich campus infrastructure as well as the Blacksburg Electronic Village (http://www.bev.net) community networking effort.

3.1. Overview of Goals and Objectives

Key concepts of our project [FOXE95c] are to improve CS education by increasing interactivity and use of a digital library. The main objectives/accomplishments were to: expand the content and software (especially interfaces [FOXE93b, NOWE94, NOWE96, WAKE95]) initially developed with NSF support of our "Envision" digital library project, "A User-Centered Database from the Computer Science Literature" [HEAT95]; develop/apply algorithm visualization tools that are easy for instructors to use in supplementing courses, and feasible for students to work with as an aid to program development and debugging [YANG95, SHAF96a, SHAF96b]; incorporate use of specialized digital library systems like Netlib into related courses; add new courses related to human-computer interaction, multimedia, and a freshman level introduction to Networked Information; significantly change courses like "Computer Professionalism," to make use of interactivity (e.g., asynchronous online debates) and digital library support (e.g., adding to a large History collection); and apply the key concepts to improve other courses.

3.2. Current Status and Accomplishments

In 1991 Virginia Tech began working with ACM through support from NSF on a "User-Centered Database from the Computer Science Literature" [HEAT95]. In 1993, Virginia Tech expanded its work on digital libraries to launch the NSF EI project, partnering with Norfolk State University, which has developed extensive sets of laboratory manuals. Over 45 courses are available through WWW, leading to over 5M accesses since 1995. There are several gigabytes of ACM publications available. Several courses have all the on-line materials required for self-study available, and new programs are under development for distance learning and continuing education.

In the new multimedia course, there was a dramatic increase in bandwidth required for the 1996 offerings as compared to the 1995 ones, because of more images, digital audio, and digital video. Because Prof. Lee (editor of Annals of the History of Computing) has developed one of the largest repositories on computer history, with a unique image collection of the founders and early systems in our field, there is extensive additional traffic from throughout the nation. In 1996, with the help of NSF-funded digital video capture and editing facilities, audio annotations, digital video movies, and animations to show interactive applications were added.

One of the courses developed under this effort, and extended through support from SUCCEED, is CS1604, an introduction to networked information. A self-study version of this course was finalized later in 1996, and is expected to be widely used throughout the Southeast and beyond by those interested in a freshman or beginner-level orientation to Internet, digital libraries, collaboration technologies, etc. This version has numerous audio and movie files to help learners, an automated real-time feedback facility (using our SGML-based QUIZIT tool [TINO96]), and a variety of illustrations and demonstrations.

In one old and two new courses, we have adapted Keller's Personalized System of Instruction to our networked environment. Students proceed at their own pace, study on their own, get help through asynchronous communication with peers and instructors, and in general have much greater flexibility in learning. Many students prefer this type of course, and in the case of CS1604 we simply could not accommodate the demand any other way, in this time of scarce resources. However, students have requested that we add interim deadlines, since they tend to procrastinate and require help with time management; adding such deadlines seems to solve the major problem faced earlier.

3.3. Materials that Have Been Developed

One result of our effort is the prototype Envision system. Its interface, if ported to Java and connected to Z39.50, could be a very convenient means for accessing a variety of bibliographic collections, as well as richer digital libraries. A second result is the content converted from ACM. The most convenient portion is several hundred articles from CACM, available now through the Dienst system to those with permission. A third result is the software created to increase interactivity of learning: SWAN and QUIZIT. Finally, there are about 10,000 WWW pages of CS courseware.

3.4. Outreach

Ongoing collaboration with Norfolk State University (NSU) has led to an increase in the use of laboratories to aid learning of CS students at Virginia Tech, and adaptation of many of the Virginia Tech materials and tools for use at NSU. Another systematic extension has been facilitated by additional funding from NSF through the Southeastern University and College Coalition for Engineering EDucation (SUCCEED). The SUCCEED Coalition Grant "Using Computers and Networked Information: Distance Learning with Networked Multimedia" is expanding through the use of digital video/audio tutorials, an alternative VRML (Virtual Reality Modeling Language) interface, multiple graphic pathways, an interactive collaboration medium for synchronous communication, and an online interactive real-time testing component. Outreach work with a number of universities in the region is underway to help them apply this course to meet increased interest in this field, and to approach full "Information Literacy" more closely.

3.5. Evaluation Activities

Our evaluation involves typical traditional methods, e.g., pre- and post-tests, surveys, and focus groups. We performed usability studies of tools we developed or applied, and used formative evaluation methods to refine both our tools and courseware. Yet our project still requires additional approaches to evaluation. The investigators in our project are instructors who changed their allocation of time, behavior, pedagogy, course materials, and tools. To understand the effects of these changes, ethnographic practices are of great value, especially regarding use of asynchronous communication (i.e., online debates) [LAUG96]. Another shift in our evaluation has been to rely on network monitoring, logging, and analysis. Here we draw upon special tools for this purpose [ABRA95]. Part of this work has helped improve our quality of service through caching. The rest has helped us understand what students really do, what course materials are accessed, how use of multimedia affects network traffic, and how both remote and local accesses increase over time. There has been a gradual increase in both remote and total access counts, if we ignore the valleys occurring during mid-semester, summer, and end-of-year breaks.
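
As an illustration only, the kind of log analysis described above might be sketched as follows. This is not the tooling reported in [ABRA95]; it assumes that the server writes entries in the Common Log Format, that hosts appear as resolved names, and that a host counts as "local" when its name ends in a campus suffix. The suffix ".vt.edu", the sample host names, and the sample paths are all hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Common Log Format: host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)$'
)

# Assumption: hosts under this suffix are treated as local (on campus).
LOCAL_SUFFIX = ".vt.edu"


def parse_line(line):
    """Return (host, timestamp, status) for one log line, or None if malformed."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), when, int(match.group("status"))


def is_local(host):
    """Classify a host name as local when it falls under LOCAL_SUFFIX."""
    return host == LOCAL_SUFFIX[1:] or host.endswith(LOCAL_SUFFIX)


def summarize(lines):
    """Count successful requests per (month, 'local' or 'remote')."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines that do not match the assumed format
        host, when, status = parsed
        if status != 200:
            continue  # count only successfully served pages
        kind = "local" if is_local(host) else "remote"
        counts[(when.strftime("%Y-%m"), kind)] += 1
    return counts


def remote_share(counts, month):
    """Fraction of a month's counted requests that came from remote hosts."""
    remote = counts[(month, "remote")]
    total = remote + counts[(month, "local")]
    return remote / total if total else 0.0


# Hypothetical sample entries, invented for this sketch.
SAMPLE_LOG = [
    'lab1.cs.vt.edu - - [12/Feb/1996:10:15:02 -0500] "GET /cs1604/index.html HTTP/1.0" 200 2048',
    'dialup7.example.org - - [12/Feb/1996:11:01:40 -0500] "GET /cs1604/index.html HTTP/1.0" 200 2048',
    'lab2.cs.vt.edu - - [13/Feb/1996:09:30:00 -0500] "GET /history/images/a.gif HTTP/1.0" 200 51200',
    'lab2.cs.vt.edu - - [13/Feb/1996:09:31:00 -0500] "GET /missing.html HTTP/1.0" 404 -',
]

if __name__ == "__main__":
    monthly = summarize(SAMPLE_LOG)
    for (month, kind), n in sorted(monthly.items()):
        print(month, kind, n)
    print("remote share:", round(remote_share(monthly, "1996-02"), 3))
```

Grouping by month makes the valleys during breaks visible as low counts, while the remote share tracks how much of the traffic originates off campus. Counting only status 200 responses is a design choice of this sketch; a study of caching behavior would also want to count other statuses such as 304.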

3.6. Benefits Seen and Expected

In summary, we have developed tools, expanded our digital library systems and content, and built almost 5000 "pages" of WWW-accessible courseware, increasing the interactivity and quality of learning about computer science. Evaluation has shown that learning practices have changed, most students are happy with the emerging infrastructure and pedagogy, and there is steady growth in access to our server. Remote users now account for about one-third of the page requests, adding to the hundreds of students served locally. Our work has helped train hundreds of students, has aided the work of instructors interested in teaching courses for which we have developed useful materials, has developed tools (e.g., SWAN, QUIZIT) that can increase the interactivity of learning about computer science, and has helped with the construction of digital library systems and a content collection in CS (with ACM publications as well as technical reports).

4. Conclusions

The two digital library projects discussed serve as exemplars of how digital libraries can improve education. Both involve many sources of information and, because technical reports, theses, and dissertations are prepared and archived in a distributed fashion, both demonstrate how a virtual university service can evolve. We hope that these ongoing efforts themselves will be widely used, and that insights from these efforts will be helpful as similar projects arise.

5. References