Reader Comments
referencing raw data
Posted by jcbradley on 26 Jul 2007 at 20:11 GMT
If I understand correctly, in this article you are looking at article to article citations, right? Are you able to look at people citing raw data directly? Are all the data sets in the repositories you describe associated with published articles?
RE: referencing raw data
Cameron_Neylon replied to jcbradley on 31 Jul 2007 at 09:26 GMT
It is presumably very difficult to track objectively how the data was used, i.e. whether the authors of the 'extra' citations actually looked at or used the raw data or relied only on the text of the paper. Personally, I would cite the paper in any case if I had used the data, as a way of acknowledging and defining the source, even if I hadn't used the text of the publication in any way.
This is largely because there is no obvious identifier for the data (although this is less of an issue for microarray data). I guess there is a relationship here with Creative Commons licenses. A registered license could serve as an identifier. Can a license have a DOI?
I would also be interested in whether this can be extended to other fields.
RE: referencing raw data
hpiwowar replied to jcbradley on 22 Aug 2007 at 15:57 GMT
Yes, the counts are of article-to-article citations. All of the data sets were associated with published articles.
The analysis in this paper does not include citations in the form of "We used the data from Accession Number X in Database Y" or "We used the microarray data located at http://XYZ" if they do not also include an article citation. Nor does it include citations that are listed in a table or figure rather than the references section, as is sometimes done in meta-analyses.
The impact of these exclusions depends on your point of view. It does make for a conservative estimate of re-use (and, needless to say, one that is much easier to assemble, since it does not need full text or natural language processing).
Do these examples cover what you meant by citing raw data directly?
If not, I'm interested in what you meant. Could you please elaborate?