Written evidence submitted by the Association of Learned and Professional Society Publishers (ALPSP)

Summary

1. ALPSP broadly welcomes the Government’s response to the Hargreaves Review of Intellectual Property. We are, however, perplexed by the proposal for a "text and data mining" exception, as evidence that such an exception is needed is lacking.

2. ALPSP is focussing the response to the BIS Inquiry on the proposed "text and data mining" exception. We will refer to this as content mining.

3. It is far too early to anticipate how content mining will develop. It is entirely possible that it will replace human reading and become the normal exploitation of the work. An exception would therefore be contrary to the Berne Convention.

4. It is unclear what evidence there is to suggest there is a problem with content mining that requires an exception. A recently published study found that over 90% of publishers who had received requests for content mining were happy to oblige. Given the small number and infrequent nature of requests, the majority are handled on a case-by-case basis. Assumptions should not be made where a publisher may have refused a request – dialogue with publishers is required.

5. A survey of ALPSP members (small and medium publishers) indicates that, of those who have received requests for text and data mining, all have accommodated those requests. Those who have not received requests (currently the majority) are understandably anxious about what effect a potential exception will have on their business.

6. The potential for a negative effect of such an exception on the UK Growth Agenda should be given careful consideration.

Introduction

7. ALPSP broadly welcomes the majority of the recommendations laid out in the Hargreaves Review and accepted by the Government. We recognise the importance of, and are already engaged in, projects to resolve issues surrounding Orphan Works (for example, the ARROW Project [1]). We also look forward to participating in discussions surrounding the development of a "Digital Copyright Exchange" (DCE), which will be an important tool both in a diligent search for owners of potential Orphan Works and in enforcement against deliberate acts of copyright infringement.

8. As part-owners of the Publishers Licensing Society, ALPSP supports the British Copyright Council proposals on Principles of Good Practice for Collective Management Organisations. We support properly managed extended collective licensing schemes, where appropriate, that will further simplify the clearance of copyright permissions and provide remuneration to creators and other rightsholders.

9. We particularly welcome improvements in SME access to the IP system, including access to lower cost IP legal and commercial advice.

10. ALPSP submitted a response to the Hargreaves Review and therefore we will focus this response on an aspect not already covered, namely the proposed "text and data mining" copyright exception.

11. Digital Opportunity provided no evidence of the requirement for, nor an impact assessment of the effects of, such an exception. Indeed, it is likely to be far too early to be able to carry out a full and proper impact assessment, given the infancy of this market.

12. There is a common misconception that the Malaria Research example reiterated in Digital Opportunity would be solved by a "text and data mining" exception. It would not. This will be solved by the orphan works and Digital Copyright Exchange proposals, in conjunction with the already receptive attitudes of publishers to text mining requests (evidence for the latter in paragraphs 17-30).

13. Exactly what text and data mining is, and what its potential may be, is far from clear. In addition to the primary example of finding previously unnoticed patterns in text and data, there is another potential use for this technology, referred to in the National Centre for Text and Data Mining’s submission to the Hargreaves Review (Digital Opportunity, Supporting Document T, p7). It is very possible, given the current climate of technological developments, that "text and data mining" will replace the normal human reading and assessment of published works. This means that it will become the normal exploitation of the work, and an exception would therefore be contrary to the Berne Convention.

14. Text and data mining are not well-defined at this stage. Do they mean the same thing? Or does text mining refer to published article content and data mining refer to the raw data that is held by researchers rather than publishers? For the remainder of this document, we shall refer to ‘content mining’ to mean the ‘content’ as produced by publishers, whether as a replacement for human reading or to find patterns. This is content that the publisher has invested in, adding value to enhance and optimise its trust, discovery, usage and preservation.

15. Content mining is an emerging field. It is so far from reaching its potential that it is difficult to assess the full effect it will have on the scholarly publishing market.

16. It is premature to suggest that this "market" has failed when it is still an emerging, embryonic market.

Evidence

17. A survey of ALPSP members has demonstrated that the majority of them (SMEs) have never received a request for text or data mining of their content. How, then, can publishers be accused of blocking access to "medical data" when access for content mining has yet to be requested? This is further evidence that this is an emerging market which has not had time to establish itself. How, then, can it be said to have failed?

18. The majority of ALPSP publishers are currently unable to ascertain the positive and negative effects that content mining may have on their business. It is too early to estimate this. It is too early to carry out a meaningful impact assessment.

19. ALPSP publishers who have yet to receive mining requests are obviously worried about resource issues in enabling appropriate access; they do not yet understand what it would entail. They are concerned about traffic overloading their sites and servers, whether they will have to produce content in a standard format (potentially requiring changes to their workflows), and how they will fund it.

20. The market is still too young to have embraced all publishers; only a fraction of the overall market is currently involved.

21. Given the potential for content mining to become the primary exploitation of the work, an exception would mean that SMEs would be denied the potential to expand their business, directly contradicting the UK Government’s policy for Growth.

22. The few ALPSP members that have received requests to mine their content have all been receptive and have facilitated the request. Content mining is becoming a negotiated inclusion in licences with pharmaceutical companies; those who have paid to access the content in which publishers have invested have their requests to mine it facilitated.

23. Where is the evidence that content mining is being blocked by publishers? A recent study has demonstrated quite the opposite [2] (attached to this submission). The study discovered that requests for content mining were, at the present time, found to be infrequent (reflecting the response from ALPSP members).

24. The study found that over 90% of publishers "tend to treat mining requests from third parties in a liberal way, certainly so for mining requests with a research purpose".

25. 28% of publishers surveyed already routinely allow content mining without restrictions as part of their Open Access policies.

26. Publishers are understandably less able to positively respond to requests for content mining where the purpose will compete with the original content. Where content mining is refused, it would be prudent to establish the reason for this, rather than to simply assume that publishers are being obstructive.

27. It is clear from this study that publishers are perfectly willing to license subscription-based or Gold OA material for content mining (i.e. where access to the content they have invested in is appropriately remunerated).

28. Content mining requires downloading of content to the user’s own systems. It is unreasonable to expect all publishers to have the bandwidth and server capacity to allow mining on their own sites. This means that content has to be reproduced and repurposed, and thus would require a licence. As already mentioned, publishers are receptive to requests for such licences. Indeed, some (larger) publishers are already providing their content in an offline format where necessary.

29. It should also be noted that 100% of article abstracts are already readily available for content mining.

30. The potential of an exception also raises the issue of where the content is published. Would the exception apply only in the UK, to UK-published material? This raises two very important points. (1) How do users distinguish between UK-published and non-UK-published material? We are already being told that the UK copyright framework needs to be simplified for users; this would simply increase complexity unnecessarily. (2) Would this be an incentive for publishers to move their publishing away from the UK, thus undermining the UK’s growth agenda?

Conclusions

31. This evidence clearly shows that publishers are not, contrary to what others may have suggested, blocking access to content they have invested in, in response to reasonable requests for mining. As such, we fail to understand why an exception is necessary.

32. The proposed "text and data mining" exception appears to be a knee-jerk reaction, which is not currently backed by evidence of its necessity. Indeed, the market is not yet even fully established.

33. Evidence shows that publishers who have received content mining requests understand its importance and the exciting possibilities it creates. The evidence available to date shows that publishers are positively responsive to content mining requests and are willing to facilitate them.

34. Publishers are more than happy to enter into dialogue with those who think there is a problem and to discuss how it can be resolved for the benefit of all. We wait to be invited.

Dr Audrey McCulloch

Executive Director, UK

On behalf of ALPSP membership

September 2011


[1] http://www.arrow-net.eu/

[2] Journal Article Mining. A research study into Practices, Policies, Plans and Promises. http://www.publishingresearch.net/documents/PRCSmitJAMreport20June2011VersionofRecord.pdf

Prepared 19th September 2011