Measuring the performance of a biomedical digital library: web site, e-journals and databases.

Paolo Gardois
Dipartimento di Scienze Pediatriche e dell'Adolescenza - Università di Torino
Piazza Polonia, 94 - 10126 Torino (Italy)
gardois@pediatria.unito.it




Abstract

Measuring the usage of electronic resources is vital both for the collection development strategy and to plan an efficient training program for users.
This work aims to provide both a case study and a framework for future developments in measuring the performance of a digital library. The starting point of the survey was the server log of the web site of the Pediatrics and Adolescent Medicine Library of the University of Turin. The log was analyzed for the periods September to December 1999, March to May 2000 and July 2000 to January 2001. Parameters taken into account include the number of hits, users and single IP addresses in the time period, the most requested pages, errors, and the percentage of users accessing the site from inside or outside the institution's building. In addition, using both Altavista and Fast Search I have tried to determine which web sites point to the library.
The library also delivers access to 160 e-journals (2000-10-30). For 94 of them usage reports are available. Where possible, I have tried to assess the number of total users, the hits on table of contents and abstracts, the number of full text downloads both in HTML and PDF format.
Users of the web site also access the following databases: Medline, CINAHL, Cochrane Database of Systematic Reviews, Best Evidence (all delivered by OVID). The usage of these resources has been analyzed for the period October 2000 to April 2001 with particular attention to: number of users for each resource, structure of the queries performed by users, search by subject, author, title, journal, and boolean operators.

-------ooo-------

General introduction

In the academic and scientific context, web-based digital libraries have grown in importance in recent years. Many libraries have understood the importance of delivering services via the web, but after a few years of experiments there is a strong need to move beyond the "beta versions" of libraries' web sites.
Consider the following situations:

the web has contributed to the development of a global marketplace for scientific information. Commercial providers, publishers and software houses now compete with libraries to deliver information in a disintermediated environment;

in the public but also in the private sector, budgets for acquiring books, journals and databases have decreased considerably during the last decade, creating a need to save money and to focus on fewer, more relevant resources;

users are confronted with many different tools for information retrieval, such as search engines, online catalogues of every kind and online databases. At the same time, the mass of information is growing more rapidly than ever before, which makes Ranganathan's law "save the time of the reader" more pressing than ever. Thus, users can easily judge an interface "too difficult to use", or a web resource "not interesting enough".

Libraries' websites, then, have to concentrate on quality - simply in order to survive. Offering basic information such as opening hours or colourful photographs of reading rooms is not enough anymore. And quality has two sides, in my opinion: global and local. Global: being part of cooperative projects such as union catalogues, indexing initiatives and consortia, and delivering access to resources whose importance is widely acknowledged. Local: concentrating strongly on local users, tailoring services to their needs, and trying to acquire new users - yes, even in the scientific environment, where users have usually been taken for granted until now.

Obviously, there is not only one strategy: marketing and advertising techniques, a more important role in users' education, integration with other departments working on the same issues (IT, for example) and with scientific research (university press, assistance to authors) are all good examples of directions worth following. There is a common need, though: receiving feedback from users, and building and restructuring the services offered with particular attention to this feedback.

A web environment offers particularly favourable opportunities to track users' behaviour. More or less every move of a user in a digital environment leaves traces. Mining and interpreting these data - something half way between Sherlock Holmes and Amazon's sales department - may be vital for the evaluation and the future development of a library's web site.
But there are also many difficulties. Raw data mean almost nothing unless placed in an appropriate context: one thing you learn from long, dull lines of log files is that to interpret them correctly you need to know your users and their environment far better than the laws of statistics. Besides, the data themselves are not always easy to understand, as we will see in the log analysis section.

This is why I think that the experience of measuring our digital library for more than one year might be interesting: with its limits and errors, as much as with the parameters and the strategy of analysis.
 

Web server log analysis

Introduction

I created the web site of the Library of the Pediatric and Adolescent Medicine Department in August 1999. In June 2000 the site underwent a major reorganization involving both its graphics and its architecture, although its content and target audience have not changed significantly.

Technical data, services offered and typology of users are reported in table 1:

Table 1
 

Title of the web site: Biblioteca del Dipartimento di scienze pediatriche e dell'adolescenza
URL: <http://www.pediatria.unito.it/biblio/>
Total directories: 9
Total files (HTML and image files only): 67
Services offered:
- information about the library
- access to e-journals, catalogues and databases
- biomedical internet resources described and indexed
- information about seminars and other training events for clinicians and researchers
- occasionally, web pages collecting teaching materials
Number of potential institutional users: 1500 clinicians, nurses, other staff and professionals working in the OIRM S. Anna Maternal and Child Care Hospital of Turin (Italy)

Figure 1. The structure of the web site


 

I have collected and analyzed data from the web server hosting my library's web site for the following periods:
 

Period Days
1999-09-20 - 1999-12-13 85
2000-03-09 - 2000-05-04 56
2000-07-03 - 2000-12-01 152
2000-12-07 - 2001-01-31 56

Here is an example of a line from the web server log:
 

212.210.172.4 - - [03/Aug/2000:15:31:55 +0200] "GET /biblio/ej/ejs.html HTTP/1.0" 200 35111

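The sequence shows: the IP address of the user; two hyphens standing for fields that were not recorded (identity and authenticated user name); the date and time of the request; the request itself (method, requested file and protocol version); the status code (200 means that the request was completed successfully); and the number of bytes transferred.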
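As an illustration, a log line in this layout can be split into its fields with a short script. The following is only a minimal Python sketch, not the procedure used in this study (which relied on the commercial packages listed below); the names LOG_PATTERN and parse_log_line are invented for the example, and the sample line is written on a single line.

```python
import re
from datetime import datetime

# Layout assumed here: host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the textual fields that are used numerically later on.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


sample = ('212.210.172.4 - - [03/Aug/2000:15:31:55 +0200] '
          '"GET /biblio/ej/ejs.html HTTP/1.0" 200 35111')
entry = parse_log_line(sample)
print(entry["host"], entry["path"], entry["status"], entry["size"])
```

Lines that do not follow the layout return None, so they can be counted separately instead of silently distorting the statistics.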

The referrer field was not enabled, so the log gives no information about other sites pointing to this web site. The log also records no user agent, so it shows neither the user's browser nor whether a spider or robot has requested the pages. The user agent would be very useful for limiting the analysis to hits from "real" users.

To analyze the server log I have used:
Web Trends Log Analyzer v5.0e (1999) for general statistics, most requested pages, visitors
Mach 5 FastStatsAnalyzer v2.78 (2000) for geographical location of users and technical statistics.

Overview of the site: general statistics

Table 2 summarizes the most important data about the usage of the web site over the whole period considered, comparing it with the initial period (1999-09-20 - 1999-12-13) and the final period (2000-11-01 - 2001-01-31).

Table 2 - Generated On Monday April 30 2001 - 21:02:11

                                                      Whole period   09/20/1999 - 12/13/1999   11/01/2000 - 01/31/2001
Days                                                  350            85                        92
Hits
  Entire Site (Successful)                            33004          2321                      13620
  Average Per Day                                     94             27                        148
  Home Page                                           6312           552                       2406
Page Views
  Page Views                                          25970          1710                      11012
  Average Per Day                                     74             20                        119
Visitor Sessions
  Visitor Sessions                                    11798          595                       4895
  Average Per Day                                     34             7                         53
  Average Visitor Session Length (min:sec)            10:03          08:01                     10:03
Visitors
  Identifiers (different IP numbers or domain names)  5008           192                       2283
  Visitors Who Visited Once                           4277           141                       1942
  Visitors Who Visited More Than Once                 731 (15%)      51 (26%)                  341 (15%)

The figures show a web site that is used much more heavily 13 months after the first measurement: the average number of hits per day has grown from 27 to 148. The average length of visitor sessions, however, has increased only slightly (from 08:01 to 10:03). To increase the time users spend on the site, it would probably be necessary to add more content and a richer organization.

The most requested pages and files

Graph 1

Not surprisingly, the most requested page is the home page, followed by the lists of e-journals and links to databases and other Internet resources. Basic information such as opening hours and address is requested much less often.

Additional statistics, not reported here, show that the home page is also the top entry page (35% of total) and the top exit page (24% of total).

Since the structure of the web site is very simple and not very deep (see tree in figure 1), the most accessed directories are the ones containing the most requested files (such as the home page or the pages discussed in the previous paragraph).

Users: a profile: who, from where, when

All we can see from log files are domain names or IP addresses contacting our web server.
It is therefore impossible to determine exactly how many people have really visited the site in a given period. Indeed, we don't know:

  • how many people are sitting in front of a computer browsing our site 
  • if a computer is used always and only by the same person, or if it is a shared computer (for example, one situated in a library or at an Internet point).

    In addition, anyone who accesses the Internet through an Internet service provider (ISP) is usually assigned a different IP address at each connection (dynamic IP): this is why different IPs do not always mean different users.
    Besides, proxy servers hide the different IPs of single users.

    Another problem is posed by "spiders" and robots, used by search engines and web directories to scan and automatically index web sites. In order to exclude spiders' hits from the total number of hits on your web server, it is necessary to enable the recognition of the user agent (usually declared by most common robots) on the server itself.

    With these caveats in mind, let us move on to some "positive definitions".
    A very important distinction is between hits and users. Every hit corresponds to a single line in the server log: it is a single request, directed to a web page, an image or a downloadable file.
    A user, on the other hand, is defined by a single IP address or domain name accessing the site, and thus performing sometimes more than one hit. In this study - following a widespread convention - a visitor session is considered ended when the user hasn't performed any requests for more than 30 minutes.
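The distinction between hits, visitors and visitor sessions can be made concrete with a small script. This is a minimal Python sketch of the 30-minute convention described above, not the algorithm of the log analysis packages used in this study; the function name count_sessions and the sample addresses (taken from the range reserved for documentation) are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)


def count_sessions(hits):
    """hits: iterable of (host, datetime) pairs, one per log line.

    Returns (number of hits, number of distinct hosts, number of sessions).
    """
    by_host = defaultdict(list)
    for host, when in hits:
        by_host[host].append(when)

    sessions = 0
    for times in by_host.values():
        times.sort()
        sessions += 1  # the first request of a host opens a session
        for previous, current in zip(times, times[1:]):
            # A silence longer than the timeout closes the session.
            if current - previous > SESSION_TIMEOUT:
                sessions += 1

    n_hits = sum(len(times) for times in by_host.values())
    return n_hits, len(by_host), sessions


# Hypothetical requests: the first host returns after a 50-minute pause.
sample_hits = [
    ("192.0.2.10", datetime(2000, 8, 3, 9, 0)),
    ("192.0.2.10", datetime(2000, 8, 3, 9, 10)),
    ("192.0.2.10", datetime(2000, 8, 3, 10, 0)),
    ("192.0.2.20", datetime(2000, 8, 3, 9, 5)),
]
n_hits, n_visitors, n_sessions = count_sessions(sample_hits)
print(n_hits, n_visitors, n_sessions)
```

In this sample four hits come from two visitors in three sessions, which shows why the three figures must never be confused.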

    The duration of the visit is a very important parameter to study the interest of the user towards the web site. Also the percentage of users returning to the site helps us to establish if the site is used regularly or not: in general, if a user appreciates the service, he or she will come back often.

    Graph 2

    This graph shows that the majority of users (85,40%) did not come back to the site after their first visit. New strategies are therefore needed to attract users with more distinctive content that cannot easily be found elsewhere.
    This index should be considered carefully when studying the loyalty of a particular audience, especially when preparing or evaluating marketing strategies.
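The percentage quoted above follows directly from the visitor counts for the whole period in Table 2. The short Python sketch below only reproduces that arithmetic; the helper name percentage is introduced for the example.

```python
def percentage(part, total):
    """Share of `part` in `total`, expressed as a percentage."""
    return 100.0 * part / total


# Whole-period figures from Table 2.
visitors_total = 5008
visited_once = 4277
visited_more_than_once = 731

once_pct = percentage(visited_once, visitors_total)
returning_pct = percentage(visited_more_than_once, visitors_total)
print(f"visited once: {once_pct:.2f}%  returned: {returning_pct:.2f}%")
```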

    Graph 3

    Graph 3 shows the importance of another parameter: the length of a visit. The average length for this web site is 10:03 minutes. However, 77% of users spent less than one minute on the site: the time to read the opening hours of the library, or to click away to the faculty's OPAC or the Medline database, or... to decide that this is not the site they were looking for.

    Another important goal, then, is to reduce the percentage of users who spend less than two minutes on the site. "One click users" need not be discouraged, and a short visit may also mean that the user has quickly found the information he or she was looking for. Increasing the number of users who spend more time on the site, however, is an absolute priority.

    Graph 4

    A complementary parameter is shown in graph 4: the number of pages viewed is proportional to the duration of the visits. In 63% of visits, users downloaded only one web page from the site.

    Finally, two questions: which users have visited the site most extensively, and where do they come from?

    Table 3
     

     
                                  Whole period   09/20/1999 - 12/13/1999   11/01/2000 - 01/31/2001
    Days                          350            85                        92
    Hits from internal visitors   37%            62%                       30%
    Hits from external visitors   63%            38%                       70%

    One of the many possible analysis strategies is to determine how many hits come from inside or outside the library's institution. Table 3 shows that in the first period most of the hits came from within the institution, while in the final period the situation is reversed: more than two thirds of the hits (70%) come from outside the institution. These are not necessarily new users, though: institutional users may simply be accessing the site from home as well, for example.
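A split between internal and external hits can be obtained by testing each address against the institution's address block. The Python sketch below is only an illustration under stated assumptions: the real address range of the institution is not given in this paper, so a block reserved for documentation stands in for it, and the names INTERNAL_NETWORKS, is_internal and internal_share are invented for the example.

```python
import ipaddress

# Hypothetical address block standing in for the institution's network.
INTERNAL_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_internal(host):
    """True if `host` is a numeric IP address inside one of the internal networks."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not a numeric address (for example an already resolved domain name).
        return False
    return any(address in network for network in INTERNAL_NETWORKS)


def internal_share(hosts):
    """Percentage of hits whose host field is an internal address."""
    hosts = list(hosts)
    internal = sum(1 for host in hosts if is_internal(host))
    return 100.0 * internal / len(hosts)


# One entry per hit: two internal hits, two external ones (all hypothetical).
sample_hosts = ["192.0.2.15", "192.0.2.15", "198.51.100.7", "203.0.113.9"]
share = internal_share(sample_hosts)
print(f"internal hits: {share:.0f}%")
```

Note that this counts hits, not people: an institutional user connecting from home through an ISP would be counted as external.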
    Obviously, it is possible to obtain many other significant statistical data using the list of visitors generated by a log analysis software.
    Another very obvious one is shown in graph 5:

    Graph 5

    To obtain such an analysis, the log analysis software must be connected to the Internet so that it can perform a reverse DNS lookup. This operation makes it possible to obtain domain names from numeric IP addresses. It should be noted that not every suffix corresponds to a geographical location: while .edu usually refers to North American universities, suffixes such as .com or .net can correspond to a server located virtually anywhere.
    Since the language of the web site is Italian and the target audience is mostly Italian, it is no surprise that about 75% of users come from Italy. It is worth noting, though, that more than 5% of hits come from foreign countries.
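The reverse lookup and the grouping by suffix can be sketched in a few lines of Python. This is not what the log analysis packages do internally, only an illustration; reverse_lookup needs a network connection and is therefore not called in the example, and the host names other than pediatria.unito.it are hypothetical.

```python
import socket
from collections import Counter


def reverse_lookup(ip):
    """Return the host name registered for an IP address, or None if none resolves."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def top_level_domain(hostname):
    """Return the last label of a host name, e.g. 'it' for 'pediatria.unito.it'."""
    if hostname is None:
        return "unresolved"
    return hostname.rstrip(".").rsplit(".", 1)[-1].lower()


# Host names as a reverse lookup might return them (None = lookup failed).
resolved = ["pediatria.unito.it", "host.example.com", "lib.example.edu", None]
by_suffix = Counter(top_level_domain(name) for name in resolved)
print(by_suffix.most_common())
```

A country suffix such as .it is a reasonable hint of location; .com and .net entries should be reported separately rather than assigned to a country.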

    Technical analysis

    These data are more interesting to the server administrator than to the librarian webmaster. They include number of visitor sessions by day of the week and by hour of the day: in order to prevent problems with the server, it is important to know when the traffic reaches considerable peaks. The same is true also for the quantity of data (bandwidth) transferred in certain days or hours. A text-based web site, without heavy graphics, helps reduce these problems.
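A traffic profile of this kind is simply a count of requests grouped by hour and by weekday. The following minimal Python sketch illustrates the idea with hypothetical timestamps; the function name traffic_profile is an assumption of the example.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def traffic_profile(timestamps):
    """Count requests per hour of the day and per day of the week."""
    timestamps = list(timestamps)
    by_hour = Counter(t.hour for t in timestamps)
    by_weekday = Counter(WEEKDAYS[t.weekday()] for t in timestamps)
    return by_hour, by_weekday


# Hypothetical request times: two on a Thursday afternoon, one on a Friday morning.
sample_times = [
    datetime(2000, 8, 3, 15, 31, 55),
    datetime(2000, 8, 3, 15, 40, 0),
    datetime(2000, 8, 4, 9, 0, 0),
]
by_hour, by_weekday = traffic_profile(sample_times)
print(by_hour.most_common(1), by_weekday.most_common(1))
```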
    More relevant to the librarian are the data shown in table 4:

    Table 4
     

    Total Hits in the whole period 38703
    Successful Hits 33004
    Failed Hits 5699
    Failed Hits as Percent 15%

    Almost 15% of hits have resulted in an error message for the user. The great majority of errors are code 404 (page not found): this means either that the URL in a link (internal or external to the site) is not correct, or that the user hasn't typed in the correct URL in the browser, or that the requested page has been moved without leaving a notice that the change had taken place.
    Since these messages are very annoying and - if repeated - can discourage the user from returning to the site, they must be reduced by careful checking of links and by systematically replacing moved pages with automatic URL redirection or, at least, with a page displaying the message "the page has moved to the new URL <...>".
    Another frequently logged code is 206, "Partial content" (only part of the requested file was delivered). Strictly speaking, 206 is a success status that a server returns when a client requests only a range of a file, but in this context it can indicate an interrupted download, caused either by an excessive length of the file (the user doesn't want to wait and moves to another page before the download is complete) or by network errors. A very obvious way to minimize the problem is to create shorter pages.
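The failure rate in Table 4, and a list of the addresses that most often produce a 404, can be derived from the parsed log. The Python sketch below reproduces the percentage from the figures in Table 4; the list of requests is hypothetical (apart from /biblio/ej/ejs.html, the paths are invented), as are the function names.

```python
from collections import Counter


def failure_rate(successful, failed):
    """Failed hits as a percentage of all hits."""
    return 100.0 * failed / (successful + failed)


def most_frequent_missing(requests, limit=10):
    """requests: iterable of (path, status). Most requested paths that returned 404."""
    missing = Counter(path for path, status in requests if status == 404)
    return missing.most_common(limit)


# Figures from Table 4.
rate = failure_rate(successful=33004, failed=5699)
print(f"failed hits: {rate:.1f}%")

# Hypothetical requests illustrating the ranking of missing pages.
sample_requests = [
    ("/biblio/ej/ejs.html", 200),
    ("/biblio/old.html", 404),
    ("/biblio/old.html", 404),
    ("/biblio/typo.htm", 404),
]
missing_pages = most_frequent_missing(sample_requests)
print(missing_pages)
```

A ranking of this kind tells the webmaster which moved pages deserve a redirection first.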

    Who points to our site? Using Altavista and Fast Search

    The best way to establish if the users of a site reach it from a link in another site or from a search engine is to enable the referrer function in the server log.
    A complementary means is to use the advanced search function of some search engines to find indexed web pages that contain links to your web site.

    Here are the results:

    Search with Altavista advanced search <http://it.altavista.com/s?spage=searchadv.htm> performed 2001-04-28 19.33:
    search string: "link:pediatria.unito.it/biblio AND NOT host:pediatria.unito.it/biblio" : 8 pages found

    Search with Fast Search advanced search <http://www.bos2.alltheweb.com/cgi-bin/advsearch> performed 2001-04-28 19.40:
    search string "Must include pediatria.unito.it/biblio/ in the link to the URL, Must not include pediatria.unito.it/biblio/ in the URL": 12 pages found.

    The use of such tools, though, usually doesn't give very clear results, since we don't know how many pages they index and to which geographical location these pages belong.

    E-journals

    Introduction

    The library's web site delivers access to 160 e-journals (2000-10-31), both through direct access to publishers sites and through packages such as EBSCO online and Ovid. The only data available come from Ovid and EBSCO gateways, since most of the publishers don't disclose to librarians the figures of online access for their users. As a result, 94 journals (45 through EBSCO online, 49 through Ovid) - 58.75% of total - were available for the analysis.
    Since the statistical analysis packages differ considerably, the parameters taken into account are quite different. This argues for a more uniform format for the usage data that publishers supply for online journals.

    Data from OVID

    The Pediatric and Adolescent Medicine Library of the University of Turin accesses 49 online full text journals (SGML format) through the Ovid package, grouped in the Core Biomedical Collection, Nursing Collection and Nursing Collection II.
    The journals are available for all the computers in the network of the University's Pediatrics Department and Gynecology Department, located in the OIRM S. Anna hospital for maternal and child care in Turin.
    Thus, I was able to analyze the usage of the journals both by pediatricians and by gynecologists in the time period between 2000-10-25 and 2001-04-29.

    The number of potential users amounts to approximately 1000 people: clinicians, researchers, nurses and administrative staff.

    To analyze the usage data I have used Ovid Statistics Viewer, the statistical analysis package available with Ovid package local version 7.8.

    Table 5
     

    Title available since Hits on full text Hits on full text per month
    Lancet (The) 1993 117 19,5
    New England Journal of Medicine (The) 1993 76 12,7
    Pediatrics 1993 50 8,3
    Annals of Internal Medicine (The) 1993 43 7,2
    JAMA: Journal of the American Medical Association (The) 1993 43 7,2
    American Journal of Obstetrics & Gynecology (The) 1993 39 6,5
    BMJ 1993 38 6,3
    Nursing Standard  1995 34 5,7
    American Journal of Medicine (The) 1993 32 5,3
    MCN: American Journal of Maternal/Child Nursing  1996 25 4,2
    Journal of Obstetric, Gynecologic, and Neonatal Nursing (JOGNN)  1996 22 3,7
    Science 1993 21 3,5
    CMAJ  1993 19 3,2
    Journal of Advanced Nursing  1995 19 3,2
    American Journal of Nursing  1996 17 2,8
    AORN Journal  1995 15 2,5
    Circulation 1993 9 1,5
    Journal of Intravenous Nursing  1996 7 1,2
    American Journal of Surgery (The) 1993 6 1,0
    Cancer Nursing  1996 6 1,0
    Journal of Bone and Joint Surgery (American Volume) (The)  1993 6 1,0
    RN 1995 5 0,8
    AJIC (American Journal of Infection Control)  1995 4 0,7
    Archives of General Psychiatry (The) 1993 4 0,7
    Journal of Clinical Nursing  1995 4 0,7
    Journal of Clinical Investigation (The) 1993 3 0,5
    Nurse Educator  1996 3 0,5
    Nursing Case Management  1996 3 0,5
    Nursing Research  1996 3 0,5
    Advances in Nursing Science  1995 2 0,3
    Clinical Nurse Specialist 1996 2 0,3
    Heart & Lung: Journal of Acute & Critical Care  1995 2 0,3
    Image: Journal of Nursing Scholarship  1995 2 0,3
    Journal of Emergency Nursing  1995 2 0,3
    Nursing & Health Care Perspectives (May/June 1997-forward) continues N & HC: Perspectives on Community  1995 2 0,3
    Gastroenterology Nursing 1996 1 0,2
    Journal of Nursing Scholarship 1995 1 0,2
    Lippincott's Primary Care Practice (starts March/April 1997)  1997 1 0,2
    N & HC: Perspectives on Community (January/February 1995-March/April 1997) continued in Nursing & Health Care Perspectives  1995 1 0,2
    Outcomes Management for Nursing Practice (starts October/December 1997)  1997 1 0,2
    AACN Clinical Issues: Advanced Practice in Acute & Critical Care 1996 0 0,0
    AWHONN Lifelines (starts February 1997)  1997 0 0,0
    Computers in Nursing  1996 0 0,0
    Dermatology Nursing  1995 0 0,0
    Journal for Nurses in Staff Development (formerly Journal of Nursing Staff Development)  1996 0 0,0
    Journal of Nursing Administration (JONA)  1996 0 0,0
    Lippincott's Case Management 1996 0 0,0
    Nurse Researcher  1995 0 0,0
    Nursing Management  1995 0 0,0
    Total 690 115

    Full text articles in these journals were accessed 690 times in 6 months. The first ten journals are medical journals, with only two exceptions. It should also be noted that some journals from the list are freely available on the Internet (BMJ) and others (Lancet, New England Journal of Medicine, Pediatrics) were also available via the library's web site from the publisher's site.

    Particular attention should be paid to the fact that 9 journals out of 49 (18,4%) were not accessed at all, and 27 out of 49 (55,1%) were accessed fewer than 5 times. The overall usage of these resources was also very low: 690 accesses for 1000 potential users in 6 months, a rate of 0,115 accesses per potential user per month.
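These indicators can be recomputed from the full text hits listed in Table 5. The Python sketch below only repeats that arithmetic; the figure of 1000 potential users is the estimate given above, and the variable names are chosen for the example.

```python
# Full text hits per journal, in the order of Table 5 (49 titles).
hits = [117, 76, 50, 43, 43, 39, 38, 34, 32, 25, 22, 21, 19, 19, 17, 15,
        9, 7, 6, 6, 6, 5, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2,
        1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

POTENTIAL_USERS = 1000  # approximate estimate given in the text
MONTHS = 6

total_hits = sum(hits)
never_accessed = sum(1 for h in hits if h == 0)
under_five = sum(1 for h in hits if h < 5)

never_pct = 100.0 * never_accessed / len(hits)
under_five_pct = 100.0 * under_five / len(hits)
rate = total_hits / (POTENTIAL_USERS * MONTHS)

print(f"{never_accessed} titles never accessed ({never_pct:.1f}%)")
print(f"{under_five} titles accessed fewer than 5 times ({under_five_pct:.1f}%)")
print(f"{rate:.3f} accesses per potential user per month")
```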

    Journals usage data have been divided by network (gynecological vs. pediatric) and by subject (medical vs. nursing).

    Let's consider now the usage per network and per subject category:

    Table 6
     

    Gynecology network
    Subject Total full text hits % on gynecology network full text hits
    Medicine 347 75
    Nursing 114 25
    Network total 461  
    Pediatrics network
    Subject Total full text hits % on pediatrics network full text hits
    Medicine 159 70
    Nursing 70 30
    Network total 229  

    E-journals were consulted significantly more in the gynecology network (66,81% of total full text hits).

    In both networks medicine journals were more read than nursing journals. These data are rearranged in table 7:

    Table 7
     

    Journals by subject Total full text hits % of total Average per month
    Medicine 506 73 84
    Nursing 184 27 31
    Total 690 100 115
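The rearrangement from Table 6 to Table 7 is a regrouping of the same four figures. The Python sketch below shows the step explicitly, using only the hits reported in Table 6; the variable names are chosen for the example.

```python
# Full text hits by network and subject, from Table 6.
full_text_hits = {
    "gynecology": {"medicine": 347, "nursing": 114},
    "pediatrics": {"medicine": 159, "nursing": 70},
}
MONTHS = 6

# Totals per network (rows of Table 6).
network_totals = {network: sum(subjects.values())
                  for network, subjects in full_text_hits.items()}
grand_total = sum(network_totals.values())

# Totals per subject (rows of Table 7).
subject_totals = {}
for subjects in full_text_hits.values():
    for subject, count in subjects.items():
        subject_totals[subject] = subject_totals.get(subject, 0) + count

gynecology_share = 100.0 * network_totals["gynecology"] / grand_total
for subject, count in subject_totals.items():
    print(subject, count,
          f"{100.0 * count / grand_total:.1f}%",
          f"{count / MONTHS:.1f} per month")
print(f"gynecology network: {gynecology_share:.2f}% of all full text hits")
```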

    A thorough evaluation of these data has yet to be carried out.
    Exact data on the potential users and a demographic profile supplying data about age, knowledge of English and cultural background are necessary, together with data on usage, to prepare an efficient training strategy.

    Data about EBSCO online

    45 journals are available via EBSCO online <http://www.ebsco.com/online/>. The articles are available in PDF or Real Page format. Usage data were analyzed on 2001-02-07 using the EBSCO Online usage report. Usage reports were available from 2000-03-05 to 2001-01-19 for one group and from 2000-02-22 to 2001-01-19 for another.

    Since the difference is slight (9 days out of 331) the journals have been merged into a single list only for statistical purposes.

    The number of potential users is estimated at approximately 1500, though no exact figure is available.

    These titles were available from the list of e-journals on the library's web site and were accessible either by IP recognition from computers inside the institution or by login and password from outside the institution.

    Thus, it is important to note that these titles are more widely accessible than the Ovid ones, which are accessible only via IP recognition.

    Parameters used to analyse the use of these journals are more detailed as far as the single journal usage is concerned. It is possible to measure how many hits were on the journal, and how many on the tables of contents, abstracts and full text articles.

    Table 8 shows the titles sorted by number of full text article hits.
     

    Table 8
     

           
    (Table 8 is provided as a separate file in RTF format.)
       
           

     

    Conclusions

    The only comparable data (full text article hits) show that journals accessed through EBSCO Online were used significantly more (211 hits per month vs. 115). Considering the larger audience for EBSCO journals, though, the number of full text hits per user per month is only slightly higher (0,140 vs. 0,115).
    With regard to subject grouping the difference increases: Ovid medical journals average 84 hits per month, Ovid nursing journals 31.

    To finish, just a note about the difficulty and the necessity of comparing paper and electronic journals.
    Web site statistics offer the possibility of tracking a user's reading behaviour in a way unimaginable in the "paper world": no librarian could ever have dreamed of knowing how many tables of contents or abstracts a user would read, the length of reading sessions and so on.

    On the other hand, paper journals usage statistics tend to be less precise and detailed. Nonetheless, they must be compared in order to plan an efficient transition to a digital library from a hybrid one.
     

    Ovid Databases

    Overview

    Ovid databases are accessible to the same users and in the same way as the Ovid full text journals. Data on their usage were obtained through the same tool, Ovid Statistics Viewer. Here is the list of databases available:
     

    • CancerLit
    • CINAHL
    • EBM Reviews - Best Evidence
    • EBM Reviews - Cochrane Database of Systematic Reviews
    • Medline
    • Pre-Medline

    The two EBM Reviews are full text databases (to be precise, Best Evidence offers "enhanced abstracts"), the others bibliographic ones. Medline, Pre-Medline and CancerLit are also available free of charge on the Internet, and this might have affected their access through Ovid.
    The log files were analyzed in the same time period as the Ovid full text journals (2000-10-25 - 2001-04-29).

    Database usage in a time period

    The following table summarizes the main parameters considered for the overall usage of Ovid databases.

    Table 9
     

      Per day Per month
    Sessions 5,5 171
    Minutes of usage 133 4149
    Sets 36 1112
    Bibliographic records viewed 30 948
    Total full text viewed 1,2 39
    Total items viewed 32 987

    The same data are also available broken down by database. The highest figures in each column are highlighted.

    Table 10
     

           
     
    (Table 10 is provided as a separate file in RTF format.)
     
       
           

    User's behaviour: patterns of searches

    In order to plan an efficient training program to improve users' ability to search databases, it is vital to know their needs and search habits.

    Some hints on this subject can be obtained by analyzing the types of searches performed by users on our Ovid databases.

    The first columns of Table 11 show the number of search sets (single searches performed by a user during a session), the average number of search sets per session (which indicates the intensity of use of a database during a single session) and the search sets for a single database as a percentage of the total search sets for all databases.

    Medline turns out to be the most used database in general both in terms of sessions (684 out of 1025) and of total search sets (4746 out of 6674). The most intensively used database during a single session, though, is CINAHL (7,9 average sets per session, vs. 6,9 for Medline).
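The intensity of use quoted for Medline follows from the two pairs of figures given above. The Python sketch below only repeats that arithmetic; the function name sets_per_session is introduced for the example.

```python
def sets_per_session(search_sets, sessions):
    """Average number of search sets performed in one session."""
    return search_sets / sessions


# Figures quoted in the text: Medline vs. all Ovid databases.
medline_sessions, all_sessions = 684, 1025
medline_sets, all_sets = 4746, 6674

medline_intensity = sets_per_session(medline_sets, medline_sessions)
medline_session_share = 100.0 * medline_sessions / all_sessions
medline_set_share = 100.0 * medline_sets / all_sets

print(f"Medline: {medline_intensity:.1f} sets per session")
print(f"Medline: {medline_session_share:.1f}% of sessions, "
      f"{medline_set_share:.1f}% of search sets")
```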

    Author, journal name and title searches are generally performed only by few users, with the exception of Pre-medline, where author searches are 14% of the searches performed in the database. A good percentage of users, instead, performs subject searches (43% for all databases, with a figure of 54% in CINAHL and 53% in Medline). This good percentage is also a consequence of the fact that Ovid performs an automatic mapping of a search string to MeSH terms in Medline and to CINAHL subject headings in CINAHL.

    The use of search operators is widespread for the best known ones: AND (23% of searches) and OR (30% of searches). Moreover, the percentage of searches using AND or OR is underestimated because - following strictly the path proposed by Ovid - users have to perform two separate searches and then combine them in a third separate search, selecting either OR or AND from a menu. This menu, however, does not offer other operators such as NOT, ADJ or FREQ. This should account for the fact that only very few searches use the operator NOT (0,21% of total search sets) and almost none use ADJ or FREQ.
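Where the text of the searches is available, the share of search sets containing each operator can be tallied mechanically. The following Python sketch is only an illustration: the search strings are invented, the format in which Ovid Statistics Viewer exposes searches is not reproduced here, and the names OPERATORS and operator_usage are assumptions of the example.

```python
import re
from collections import Counter

OPERATORS = ("AND", "OR", "NOT", "ADJ", "FREQ")


def operator_usage(search_sets):
    """Percentage of search sets that contain each operator as a separate word."""
    counts = Counter()
    for text in search_sets:
        # Runs of letters only, so that a form such as "adj2" is read as ADJ.
        words = set(re.findall(r"[A-Za-z]+", text.upper()))
        for operator in OPERATORS:
            if operator in words:
                counts[operator] += 1
    total = len(search_sets)
    return {operator: 100.0 * counts[operator] / total for operator in OPERATORS}


# Invented search sets; "1 and 2" combines two earlier sets by number.
sample_sets = ["asthma", "1 and 2", "croup or bronchiolitis", "fever not seizure"]
usage = operator_usage(sample_sets)
print(usage)
```

A count of this kind treats every occurrence of the word as an operator, so the result should be read as an upper bound.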

    Table 11
     

           
    (Table 11 is provided as a separate file in RTF format.)
       
           

    Table 12
     

           
    (Table 12 is provided as a separate file in RTF format.)
       
           

    Conclusions

    A more thorough knowledge of the full potential of the Ovid databases is required, both for librarians and for users. As in every system, the most interesting features are not always the most prominent - sometimes they are hidden somewhere or require a string to be typed in directly.

    As for users, a systematic training program should be built in order to teach the different possibilities offered by the various types of searches and operators.

    Obviously, figures and statistics don't tell everything. Another important preliminary step could be a questionnaire to assess whether users usually find what they want (or in what percentage of cases). The fact that a particular search was performed doesn't tell us whether it provided the user with the correct result. Also, a high number of search sets per session could mean either that a user has successfully searched for several different subjects or that he or she has run into difficulties and tried different search strategies for the same goal.

    A direct feedback from users is thus necessary: figures alone are not enough. 
     

    General Conclusions

    This study is only a first experiment, with many limits both in method and in conclusions. No absolute truth can come out of it, but it can at least offer some general hints on how to begin searching for evidence about how users approach digital libraries.

    Consequently, I would like to focus on two main issues that could be interesting for further research on the subject:

    Firstly, the need for internationally acknowledged guidelines for the analysis of this kind of data. In particular, a set of basic parameters is strongly needed, together with an agreement on what each parameter means - let's think for example of the elusive meaning of the word "visitor". The aim is not a rigid structure, but a set of tools to improve the comparability of analyses performed in different contexts. Besides, the set of parameters should be flexible enough to be adapted to the particular needs of local communities of users.

    Secondly, a more uniform format for data. Pre-packaged software for log analysis suffers from heavy limitations. What I have found for the analysis of usage data for e-journals and databases are very rigid tools with almost no room for personalization and - what is more - no direct access to raw data. This last point is vital, though: more precise results can be obtained only by studying the server logs directly and configuring them appropriately from the beginning.

    Both issues surely need a wide international debate, as J. Luther also notes in her excellent White Paper on Electronic journal Usage Statistics (see bibliography). I hope my study can contribute to this debate.
     

    References

    1. Benjamin I, Goldwein JW, Rubin SC, McKenna WG. OncoLink: a cancer information resource for gynecologic oncologists and the public on the Internet. Gynecol. Oncol. 1996; 60:8-15.
    2. Cummings G, Sher A, Hayward R, Rowe BH. Survey of computer use in emergency medicine staff and residents. Acad. Emerg. Med. 2001; 8: 473-a.
    3. D'Alessandro DM, Kreiter CD. Improving usage of pediatric information on the Internet: the Virtual Children's Hospital. Pediatrics 1999; 104: e55.
    4. D'Alessandro MP, D'Alessandro DM, Galvin JR, Erkonen WE. Evaluating overall usage of a digital health sciences library. Bull. Med. Libr. Assoc. 1998; 86: 602-9.
    5. D'Alessandro MP, Galvin JR. SPR online: creating, maintaining, and distributing a virtual professional society on the Internet. Radiographics 1998; 18: 189-94.
    6. Davis MJ, Wythe J, Rozum JS, Gore RW. Use of World Wide Web server and browser software to support a first-year medical physiology course. Am. J. Physiol. 1997; 272: S1-14.
    7. Dev P, Rindfleisch TC, Kush SJ, Stringer JR. An analysis of technology usage for streaming digital video in support of a preclinical curriculum. Proc. AMIA. Symp. 2000: 180-4.
    8. Luther, J. White Paper on Electronic journal Usage Statistics. Washington, DC : Council on Library and Information Resources, 2000. Also available at: http://www.clir.org/pubs/reports/pub94/contents.html
    9. Marshall WW, Haley RW. Use of a secure Internet Web site for collaborative medical research. JAMA 2000; 284: 1843-9.
    10. McVeigh S, Williams R, Pangaro L. An Internet home page to log students' patient contacts. Acad. Med. 1997; 72: 417-8.
    11. Peterson MW, Galvin JR, Dayton C, D'Alessandro MP. Realizing the promise: delivering pulmonary continuing medical education over the Internet. Chest 1999; 115: 1429-36.
    12. Quade G, Burde B, Zenker S, Goldschmidt A. [CancerNet online--a contribution to improving oncologic management]. Zentralbl. Gynakol. 2000; 122: 646-50.
    13. Roberts JR, Spooner SA. Pediatric Internet resources. Creation and growth of the PEDINFO index. Arch. Pediatr. Adolesc. Med. 1997; 151: 592-7.
    14. Spallek G, Kammann J. [The German Ophthalmology Society in the internet. Resources and interest]. Ophthalmologe 1999; 96: 398-404.
    15. Strasberg HR, Hubbs PR, Rindfleisch TC, Melmon KL. Analysis of information needs of users of the Stanford Health Information Network for Education. Proc. AMIA. Symp. 1999; 965-9.
    16. Zaidel M, Hopper K, Iyriboz T. Interactive web-based radiology teaching file. J. Digit. Imaging 1999; 12: 203-4.

    Credits

    The author wishes to thank C. Tortorelli, E. Gatto and G. Boiero for their invaluable help.