5 DSpace repository usage statistics questions answered

February 3, 2020


What is the most valuable piece of research at my institution? Who is the most productive member of faculty? These are multifaceted questions that can't be answered by download and pageview figures alone. However, usage stats can support or refute a number of hypotheses.

Should we look at DSpace SOLR statistics or Google Analytics?

You benefit from looking at multiple sources of usage data. Both validating findings and evaluating the results of technical improvements become more robust when you cross-reference different data sources.
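As a rough illustration of such cross-referencing, the Python sketch below compares monthly counts from two sources and flags the months where they diverge strongly. The function names, the dictionary layout keyed by month, and the 0.5 threshold are all assumptions made for this example; they are not part of DSpace or Google Analytics, and exporting the counts from either system is left out.

```python
def relative_difference(a, b):
    """Symmetric relative difference between two counts (0.0 means identical)."""
    if a == 0 and b == 0:
        return 0.0
    return abs(a - b) / ((a + b) / 2)


def flag_discrepancies(solr_counts, ga_counts, threshold=0.5):
    """Return the periods where the two sources disagree by more than threshold.

    Both arguments are hypothetical dicts mapping a period label
    (e.g. "2020-01") to a count exported from the respective source.
    """
    flagged = {}
    for period in sorted(set(solr_counts) | set(ga_counts)):
        solr = solr_counts.get(period, 0)
        ga = ga_counts.get(period, 0)
        diff = relative_difference(solr, ga)
        if diff > threshold:
            flagged[period] = (solr, ga, round(diff, 2))
    return flagged
```

A flagged month does not say which source is right. It only points to a period worth investigating, for example for a robot spike or a tracking outage.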

While Google Analytics offers one of the most versatile administrator dashboards in the industry, it was conceived for, and remains focused on, driving "conversions" on commercially oriented websites. Out of the box, it is unable to aggregate downloads of specific files by their corresponding item page, author, DSpace collection or community.

As an administrator using Google Analytics, you can't access the actual "raw" usage data, such as IP addresses or other details associated with a single pageview. The practices that Google does or doesn't apply to filter out robot or crawler traffic remain undisclosed to this day.

In comparison, DSpace SOLR statistics store the detailed usage events. For detecting robot traffic, Atmire and others use the COUNTER Robot user agent definitions, openly shared on https://github.com/atmire/COUNTER-Robots. This doesn't, by definition, make DSpace repositories more effective at identifying robots.
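To make the idea of user-agent-based robot detection concrete, here is a minimal Python sketch that matches the user agent of each stored usage event against a list of regular expressions. The five patterns are a hypothetical sample written for this example, not the actual COUNTER list, and the event layout with a "userAgent" key is an assumption about how exported usage events might look.

```python
import re

# Hypothetical sample patterns for illustration only; a real setup would
# load the maintained definitions instead of hard-coding a short list.
SAMPLE_ROBOT_PATTERNS = ["bot", "crawl", "spider", "^curl", "^Wget"]


def compile_patterns(patterns):
    """Compile each pattern as a case-insensitive regular expression."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]


def is_robot(user_agent, compiled):
    """True if any pattern matches somewhere in the user-agent string."""
    return any(rx.search(user_agent or "") for rx in compiled)


def filter_human_events(events, compiled):
    """Keep only events whose user agent matches none of the patterns.

    Each event is assumed to be a dict with a "userAgent" key.
    """
    return [e for e in events if not is_robot(e.get("userAgent"), compiled)]
```

Pattern matching of this kind only catches robots that identify themselves through their user agent. A crawler that presents a browser-like user agent passes through unnoticed.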

However, neither Google nor any other party on the web has a foolproof "gold standard" for detecting and eliminating robot traffic, because the state of the art in developing these robots keeps evolving as well.

All in all, even if you are 100% sure that a certain download came from a human user, that doesn't tell you to what extent the user actually read or used the material. In that sense, download counts have only a very limited ability to replace citations or other metrics as a measure of human usage. A file download remains a limited proxy for human usage and scientific impact.

Need help with your baseline or statistics-related issues?

For years, Atmire has worked with institutions around the globe on repository usage statistics. Contact us today to learn more about our Content and Usage Analysis module for DSpace, or to receive technical assistance with usage statistics issues.

If you are not using DSpace as a repository solution yet, usage statistics is only one of the areas where DSpace excels. Contact us for more details.
