ALL Metrics
-
Views
-
Downloads
Get PDF
Get XML
Cite
Export
Track
Research Article
Revised

Whose sample is it anyway? Widespread misannotation of samples in transcriptomics studies

[version 2; peer review: 2 approved, 1 approved with reservations]
PUBLISHED 30 Sep 2016
Author details Author details
OPEN PEER REVIEW
REVIEWER STATUS

This article is included in the Preclinical Reproducibility and Robustness gateway.

This article is included in the INCF gateway.

Abstract

Concern about the reproducibility and reliability of biomedical research has been rising. An understudied issue is the prevalence of sample mislabeling, one impact of which would be invalid comparisons. We studied this issue in a corpus of human transcriptomics studies by comparing the provided annotations of sex to the expression levels of sex-specific genes. We identified apparent mislabeled samples in 46% of the datasets studied, yielding a 99% confidence lower-bound estimate for all studies of 33%. In a separate analysis of a set of datasets concerning a single cohort of subjects, 2/4 had mislabeled samples, indicating laboratory mix-ups rather than data recording errors. While the number of mixed-up samples per study was generally small, because our method can only identify a subset of potential mix-ups, our estimate is conservative for the breadth of the problem. Our findings emphasize the need for more stringent sample tracking, and that re-users of published data must be alert to the possibility of annotation and labelling errors.

Keywords

Transcriptomics, misannotation, reproducibility, data quality, gene expression, mislabeling

Revised Amendments from Version 1

As noted by Hans van Bokhoven, in the original version the the number of studies with mislabelled samples was incorrectly indicated in the text as "31" instead of "32". This is now corrected.

See the authors' detailed response to the review by Leonard P. Freedman
See the authors' detailed response to the review by Levi Waldron
See the authors' detailed response to the review by Hans van Bokhoven

Introduction

Recent years have seen an increase in concern about the quality of scientific research, along with efforts to improve reliability and reproducibility1,2. These issues are highly relevant to genomics studies, which deal with complex and often weak signals measured genome-wide. In transcriptomics studies (our focus here), mRNA is extracted from samples and processed using microarrays or RNA-seq, followed by statistical analysis to identify patterns of interest (e.g. differential expression). Much work has been done to raise awareness of technical issues in such studies such as RNA quality3 and batch effects4 and many investigators are aware of the need to address them5. Alongside, a great effort was put into establishing guidelines for annotation standards of expression data into public repositories6.

A key step in many scientific experiments, which has received less attention, is the importance of maintaining an accurate correspondence between the experimental conditions or sources of the samples and the eventual data. Simply put, for the analysis to be valid, the samples must not be mixed up. If mix-ups are present but undetected, the conclusions of the analysis might be affected and pollute the literature, as well as create a lurking problem for those who re-use the data.

The obviousness of the need to avoid mix-ups suggests that investigators should be well aware of the risk, and take steps to reduce it, such as careful bookkeeping (e.g., permanent sample tube labels matched to data files). However, we recently became concerned that mix-ups might not be rare. Our concerns came to a head when we reanalyzed four publically available datasets of Parkinson’s disease subjects7. As part of our quality checks of the data, we examined expression levels of sex-specific genes (genes expressed only in males or in females), and compared these with the corresponding subject sex meta-data annotations from each of the papers. To our surprise, we found discordance between the sex predicted based on expression levels of sex-specific genes and the manuscript-annotated sex in two out of the four datasets7 (Supplementary Figure S1). This finding, and other anecdotal observations, led us to examine this issue more broadly.

Sex-specific genes are well-suited for this purpose. In genetics studies, genotypes of the sex chromosome are routinely used to identify mislabeled samples8,9, moreover, sex check is a built-in option for some of the dedicated software10. Given that genetic abnormalities resulting in disagreement between genotypic and phenotypic sex are rare11, any disagreements are very likely to stem from errors and may also be indicative of other dataset quality issues. Using such genes for quality checks of transcriptome data is not widespread practice, but it is well known that several X- and Y-linked genes show sex-specific patterns of expression. A limitation of this approach is that mix-ups that do not yield conflicting sex labels (e.g., swapping two female samples) cannot be detected. But at the very least the sex-specific-gene-based approach can provide a lower bound for the amount of mix-ups and if any are detected it should trigger a reassessment of the tracking of all samples in the study.

In this study, we focused on publically available human expression profiling experiments that included individuals of both sexes. To our surprise, we found strong evidence of mix-ups in nearly half of them. Importantly, for the vast majority of the studies we were able to validate that the disagreement between metadata- and gene-based sex is prevalent in the original manuscript. This indicates that the disagreements are not a result of erroneous sex description during data submission to public repository. An additional 10% of the studies have samples of ambiguous gene-based sex, suggesting the possibility of samples being mistakenly combined or other quality problems. While it is possible that a small number of the cases we identify are due to sex chromosome abnormalities, we regard the most likely explanation for most to be laboratory mix-ups or errors in the meta-data annotations. Our findings suggest a widespread quality control issue in transcriptomics studies.

Methods

Except where mentioned, data analysis was performed using the R/Bioconductor environment12,13. Source code for the analysis is available in a Github repository (https://github.com/min110/mislabeled.samples.identification). The archived version of the code at the time of publication can be accessed through Zenodo mislabeled.samples.identification: doi:10.5281/zenodo.60313.

We identified datasets containing sex information as experimental factors by searching the Gemma database14. Out of an initial 121 datasets we focused on 79 studies run on the Affymetrix HG-U1333Plus_2 and HG-U133A platforms as they have the same sex marker genes (GEO platform identifiers GPL570 and GPL96 respectively). The annotations in Gemma, which originate from GEO sample descriptions augmented with manual annotation, were re-checked against GEO, resulting in the correction of errors for 14 samples. Datasets that contained samples of only one sex, represented data from sex-specific tissues (e.g. ovary or testicle) or contained numerous missing values were excluded (nine datasets). A final set of 70 studies (a total of 4160 samples) met the criteria. Table 1 summarizes the data included and full details of each study are in Supplementary Table S1. Whenever possible, data were reanalyzed from .CEL files. The signals were summarized using RMA method from the Affymetrix “power tools” (http://media.affymetrix.com/partners_programs/programs/developer/tools/powertools.affx), log2 transformed and quantile normalized as part of the general Gemma pre-processing pipeline.

Table 1. Summary of discrepancies between the gene expression-based and annotated sex in human microarray datasets.

Unclassified samples are samples with disagreement between their classification using k-means clustering and the median expression of the sex specific probesets. Datasets were considered as “correctly annotated” only if they did not contain mismatched or unclassified samples. Eight of the datasets contained both mismatched samples and unclassified samples.

All DatasetsNon-cancer
Datasets
Cancer
Datasets
All SamplesNon-cancer
Samples
Cancer
Samples
Correctly annotated31 (44%)29 (53%)2 (13%)4043 (97%)2868 (98%)1175 (96%)
Mismatched32 (46%)24 (44%)8 (53%)83 (2%)58 (1.97%)25 (2.04%)
Unclassified15 (21%)7 (13%)8 (53%)34 (0.8%)11 (0.4%)23 (1.9%)
Total705515416029371223

Probeset selection: The male-specific genes KDM5D and RPS4Y1 are represented by a single probeset on both platforms included in our analysis. XIST is represented by two probesets on the GPL96 platform and by seven probesets on the GPL570 platform. With the exception of the 221728_x_at probeset, XIST probesets were highly correlated with each other, and negatively correlated with the KDM5D and RPS4Y1 expression in all of the datasets analyzed (Supplementary Figure S3). The poor-performing XIST probeset (221728_x_at) was excluded from further analysis. The final set was four probesets for GPL96 and eight probesets for GPL570.

Assigning gene-based (biological) sex to samples: The expression data for the selected sex markers were extracted from the normalized data for each dataset. For each of these small expression matrices, we applied standard k-means clustering (using the “kmeans” function from the “stats” package in R15 to classify the samples into two clusters. We assigned the two clusters as “male” or “female”, based on the centroid values of each of the probesets: specifically, the cluster with higher values of the XIST probesets centroids and a lower value of KDM5D and RPS4Y1 centroids was assigned as a “female” cluster. To identify samples with ambiguous sex, we calculated the difference between the median expression level of the XIST probesets and the median expression level of the KDM5D and RPS4Y1 probesets. We compared this difference with the cluster-based gender, and validated that the difference is positive for samples assigned as females and negative for samples assigned as males. We excluded 34 samples that showed disagreement in this comparison since they could not provide a conclusive result for the gene—expression-based sex. We note that 12 (35%) of these would have been assigned to a cluster contradicting their annotated sex if we had retained them.

Manual validation of the discrepancy between the gene-based sex and the meta-data-based sex: For all the cases where a discrepancy was found between the gene-expression-based sex and the meta-data-based sex, we manually examined the original studies to check if the mismatch was due to incorrect annotation of the sample during the data upload to GEO, or was present in the original paper. Since most of the manuscripts only contain summary statistics of the demographic data (13/32, Supplementary Table S2), direct sample-by-sample validation was not possible for most studies. For these studies we used the highest resolution level of group summary statistics, provided in the publication to validate that the data in the paper corroborate the data in GEO. In addition, for all of the datasets with mismatched samples, we manually evaluated the expression values of the relevant probesets using the GEO2R tool on the GEO website.

Confidence interval estimate for population proportion of studies with misannotated samples: We used the properties of the binomial distribution to compute the confidence interval for the population estimate of affected data sets using the “qbinom” function in R.

Analysis of Stanley Foundation datasets: CEL files and sample metadata were downloaded directly from the Stanley Medical Research Institute genomic database (https://www.stanleygenomics.org/stanley/). CEL files were pre-processed, quantile normalized and log2 transformed using the rma function from the “affy” package in R Bioconductor12,13.

Results

We identified a corpus of 70 human gene expression studies that had sample sex annotation (4160 samples in total) run on two platforms. We developed a simple robust method for classifying samples by sex based on three sex specific genes – XIST, RPS4Y1 and KDM5D. XIST (X-inactive specific transcript) is expressed from the inactive X chromosome and acts to silence its expression and thus, is only expressed in female subjects. KDM5D (Lysine (K)-Specific Demethylase 5D) and RPS4Y1 (Ribosomal Protein S4, Y-Linked 1) are both located on the Y chromosome, and thus are only expressed in male subjects. Although additional sex-specific genes exist, we determined that KDM5D, RPS4Y1 and XIST are the only sex-specific genes consistently showing high expression levels in the associated sex in all tissues. Our method assigns a predicted sex based on gene expression to each sample, which we refer to as “gene-based sex” (see Methods). We also performed a second analysis to identify samples where a gene-based sex could not be confidently assigned. Such samples might reflect technical problems, but could also be due to true biological effects; for example, XIST expression is altered in some cancers and in early stages of development16. We then compared gene-based sex to the sex according to the provided sample annotations (“meta-data-based sex”) for the 70 studies, seeking samples with disagreements. Figure 1 shows examples of studies with no discrepant samples (1A) and with discrepancies (1B). Similar plots for all datasets analyzed are shown in Supplementary Figure S2. All calls of discrepant or ambiguous sex were followed by manual confirmation.

828675e2-950e-457f-aab2-e19d488c9738_figure1.gif

Figure 1. Representative plots showing expression levels of sex-specific probesets.

Expression level of probesets representing the XIST (red), KDM5D (black) and RPS4Y1 (blue) genes. “MetaFemale” and “MetaMale” indicate the meta-data annotated sex of the samples and their total number in brackets. The “M” and “F” along the X axis indicates the gene-based sex of the samples, as determined by k-means clustering. Log2-transformed expression levels are plotted. (a) Representative dataset with no mismatched samples. (b) Representative dataset with two mismatched samples (highlighted with grey bars). Gene-based sex that contradicts the annotated sex of the sample is highlighted in bold at bottom.

828675e2-950e-457f-aab2-e19d488c9738_figure2.gif

Figure 2. Gene-based and metadata-based sex in four datasets of similar subjects from Stanley Array collection.

The heatmap represents z-transformed expression values of KDM5D, RPS4Y1 and XIST probeset in four datasets of microarray data from Stanley Array Collection cohort of subjects. The datasets are designated - Study1 AltarA, Study3 Bahn, Study5 Dobrin, Study7 Kato, in correspondence to their names on the Stanley collection site. Each column represents a subject and each raw represents a probeset. The four studies are represented on the left color bar on the side of the heatmap. The gene names corresponding to each probeset are shown by the right color bar on the side of the heatmap. Three of the studies – AltarA, Bahn and Kato were performed on the GPL96 platform on which XIST is represented by two probesets. The Dobrin dataset is on the GPL570 platform containing additional 5 XIST probesets, one of which was removed from the analysis. The annotated sex of each subject (metadata gender) is represented by the top color bar (females – pink, males – purple). Missing samples (samples that were excluded from the original studies) are shown in grey. Arrows point to the mismatched samples.

We found samples with a discrepancy between the meta-data sex information and the gene-based sex in 32/70 (46%) of the datasets (ambiguous samples excluded; summarized in Table 1; details in Supplementary Table S2). Although datasets containing mismatched samples were more prevalent among cancer datasets (53% vs 44%, cancer vs. non-cancer, respectively), the proportion of mismatched samples was similar in cancer and non-cancer samples (2.04% vs 1.97%; Table 1). This discrepancy might be explained by on average higher number of samples in cancer datasets from our corpus (Supplementary Table S1). As expected, the proportion of samples with ambiguous gene-based sex was much higher in cancer as compared to non-cancer samples: 23/1223 (1.97%) in cancer vs. 11/2937 (0.4%; Table 1). In total, 34 samples were flagged as ambiguous, though we note that 12/34 (35%) would have been signed to the discrepant sex by our method. Ambiguous samples were found in 15/70 (21%) of the studies (eight of which also contained mismatched samples).

Because the sex annotations we used to this point were obtained from the sample descriptions in GEO, there was a possibility that the discrepancies we identified were due to mistakes introduced during the communication of the data from the submitter to GEO. If this was the case, the results in the original publication (29/32 of the affected studies had an associated publication) would be unaffected, though users of the GEO data would still be affected. To check this possibility, we went back to the 29 original publications to see if the sex labels provided in the paper matched those in GEO (detailed in Supplementary Table S2). This check was not always possible because many publications did not provide detailed meta-data in the paper or Supplementary materials; GEO provides the only record. In 12/29 cases, sufficient detail was provided for us to confirm that the discrepant sex labels were present in the publication, and in all of them there was agreement between the meta-data in the publication and the meta-data in GEO. In 13 cases only summaries were given in the publication (e.g. “10 males and nine females in group X”). In 10 of these 13 studies, the summary counts in the publication agree with GEO. In the other three, both GEO and gene-based totals disagree with the publication-based totals. In other words, there seems to have been miscommunication with GEO in addition to a sex annotation discrepancy in the original study report. Finally, for four datasets meta-data was not provided or ambiguously described in the paper. We failed to find any unambiguous case in which we would infer the only problem was a miscommunication with GEO.

The analysis presented cannot distinguish between actual sample mix-ups (e.g., tube swaps) and errors in the meta-data (incorrect recording of the subject’s sex). Fortuitously, we identified data sets where it can be determined that at least in some cases, samples were probably physically mixed up. In addition to the 70 datasets used above, we analyzed four datasets that all used human brain RNA from the same collection of subjects (Stanley Medical Research Institute, Array Collection, https://www.stanleygenomics.org/stanley/). In this case the meta-data is common across the four laboratories since they are all analyzing the same individuals (though not all studies analyzed all the individuals). If the meta-data is incorrect, then all of the studies should show discrepancies for the same samples. If the samples were mixed up in a particular laboratory (or by the sample provider at the time they were sent to the laboratory), each study would have different discrepancies. We found that out of the four available datasets with data corresponding to the same subjects, two datasets contained mismatched samples (a single mismatched sample was identified in the “AltarA” study, and five in the “Dobrin” study; Figure 2). Importantly, the mismatched subjects differed between the datasets and samples from the same subjects appeared as correctly annotated in the other datasets. This suggests that the mismatched cases are likely to represent mislabeled samples rather than mistakes taking place during the recording of the subjects’ sex.

We were surprised that nearly 50% of studies had at least one labeling error, and were concerned that this might be an overestimate by chance, due to sampling error. To address this we computed confidence intervals for our estimate of the fraction of affected studies, yielding a 95%-confident lower bound of 36% and a 99% lower bound of 33% (upper confidence bounds were 56% and 60% respectively). We also note that our independent observations of 2/4 datasets containing misannotations described in Santiago et al.17 and in 2/4 of the Stanley data set are in agreement with a relatively high estimate. Thus we project that, with 99% certainty, if all expression studies in GEO could be checked for mix-ups based on sex-specific genes, the fraction affected would be at least 33%.

Discussion

Using a simple approach to compare sample annotations for sex to expression patterns, we found that nearly 50% of datasets we checked contain at least one discrepancy. Our findings are also in general agreement with another study that examined this issue in a cohort of predominantly cancer datasets18, although in cancer there is an expectation of more ambiguity of sex marker expression16. In the case of the Stanley brain datasets, we could determine that the problem is likely to stem from laboratory mix-ups rather than an error in recording the subject’s sex. While our analysis is limited to a corpus of studies where sex information was available along with the presence of good markers on the microarray platform, our data suggest a widespread problem.

What is the impact of this issue? Viewed optimistically, a single mixed-up sample is not likely to dramatically affect the conclusions of a well-powered study. In addition, our analysis suggests a lower (99% confident) estimate of “only” 33% of studies with a sex mislabeling, which might provide a small amount of comfort to optimists – it could be worse. However, the sample mislabeling we identified might be the tip of the iceberg, because sex-specific genes can only reveal mixed-up samples with differing sex. We also suggest that sample mix-ups might correlate with other quality problems. Indeed, many of the misannotated datasets we found have additional issues such as undocumented batch effects, outlier samples, other apparent sample misannotations (not sex-related), and discordance in sample descriptions reported in different parts of the relevant publication (Supplementary Table S2). The presence of samples with ambiguous gene-based sex in non-cancer samples is suggestive of even more quality problems. This is because expression patterns of sex-specific genes could be treated as a positive quality control for the expression data as a whole, serving as indicators for the reliability of other gene signals. Deviations from the expected pattern might indicate samples were mixed together, or suggest problems with RNA quality.

Our conclusions are two-fold. First, there is an alarming degree of apparent mislabeling of samples in the transcriptomics literature. In at least the specific cases we identified, the trust in the reliability of the findings reported is certainly not improved. Second, because it is simple to check the expression patterns of sex markers, the tests we performed should become a routine part of all omics studies where sex can be inferred from the data.

Data and software availability

All data analyzed in this manuscript were previously published and can be accessed through the GEO repository (http://www.ncbi.nlm.nih.gov/gds) using accession numbers indicated in Table S1.

Stanley Medical Research Institute data can be accessed through https://www.stanleygenomics.org/stanley/.

Source code for the analysis is available in a Github repository (https://github.com/min110/mislabeled.samples.identification).

The archived version of the code at the time of publication can be accessed through Zenodo mislabeled.samples.identification: doi:10.5281/zenodo.6031319

Comments on this article Comments (1)

Version 2
VERSION 2 PUBLISHED 30 Sep 2016
Revised
  • Reader Comment 30 Sep 2021
    Anita Bandrowski, UCSD, San Diego, CA, USA
    30 Sep 2021
    Reader Comment
    Dear Paul,

    This is a very interesting read and I think that I now understand the data to some extent. It is amazing how a small error in one ... Continue reading
Author details Author details
Competing interests
Grant information
Copyright
Download
 
Export To
metrics
Views Downloads
F1000Research - -
PubMed Central
Data from PMC are received and updated monthly.
- -
Citations
CITE
how to cite this article
Toker L, Feng M and Pavlidis P. Whose sample is it anyway? Widespread misannotation of samples in transcriptomics studies [version 2; peer review: 2 approved, 1 approved with reservations] F1000Research 2016, 5:2103 (https://doi.org/10.12688/f1000research.9471.2)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
track
receive updates on this article
Track an article to receive email alerts on any updates to this article.

Open Peer Review

Current Reviewer Status: ?
Key to Reviewer Statuses VIEW
ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions
Version 2
VERSION 2
PUBLISHED 30 Sep 2016
Revised
Views
57
Cite
Reviewer Report 03 Oct 2016
Levi Waldron, School of Public Health, City University of New York, New York, NY, USA 
Approved with Reservations
VIEWS 57
The article is clear and to the point, and the abstract provides an adequate summary of the article, the methods are adequately described, conclusions are balanced and justified, and data are available in a usable format. I do have a ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Waldron L. Reviewer Report For: Whose sample is it anyway? Widespread misannotation of samples in transcriptomics studies [version 2; peer review: 2 approved, 1 approved with reservations]. F1000Research 2016, 5:2103 (https://doi.org/10.5256/f1000research.10437.r16396)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 29 Mar 2017
    Lilah Toker, Department of Psychiatry, University of British Columbia, Vancouver, V6T 2A1, Canada
    29 Mar 2017
    Author Response
    1. The decision to use sex genes was based on their unique properties – they are only expressed in subjects with a specific sex, and when expressed, their expression
    ... Continue reading
COMMENTS ON THIS REPORT
  • Author Response 29 Mar 2017
    Lilah Toker, Department of Psychiatry, University of British Columbia, Vancouver, V6T 2A1, Canada
    29 Mar 2017
    Author Response
    1. The decision to use sex genes was based on their unique properties – they are only expressed in subjects with a specific sex, and when expressed, their expression
    ... Continue reading
Version 1
VERSION 1
PUBLISHED 30 Aug 2016
Views
45
Cite
Reviewer Report 22 Sep 2016
Hans van Bokhoven, Department of Human Genetics & Department of Cognitive Neuroscinders Institute for Brain, Cognition and Behaviour ences, Radboud University Medical Center, Nijmegen, The Netherlands 
Approved
VIEWS 45
This article by Toker et al. reports a retrospective biostatistical analysis of the data reported in 70 RNA expression studies to identify the possible misannotation of samples used in these studies. Their analysis is based on the expression of gender-specific ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
van Bokhoven H. Reviewer Report For: Whose sample is it anyway? Widespread misannotation of samples in transcriptomics studies [version 2; peer review: 2 approved, 1 approved with reservations]. F1000Research 2016, 5:2103 (https://doi.org/10.5256/f1000research.10200.r16144)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 29 Mar 2017
    Lilah Toker, Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4, Canada
    29 Mar 2017
    Author Response
    1. We thank the reviewer for his comment. It was indeed a typo and it was corrected in version 2 of the manuscript.
       
    2. We thank the
    ... Continue reading
COMMENTS ON THIS REPORT
  • Author Response 29 Mar 2017
    Lilah Toker, Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4, Canada
    29 Mar 2017
    Author Response
    1. We thank the reviewer for his comment. It was indeed a typo and it was corrected in version 2 of the manuscript.
       
    2. We thank the
    ... Continue reading
Views
47
Cite
Reviewer Report 13 Sep 2016
Leonard P. Freedman, Global Biological Standards Institute, Washington, DC, USA, Washington, DC, USA 
Approved
VIEWS 47
This is an excellent paper highlighting the importance of sample annotation as a critical contributor to reproducible research. Using transcriptomics of sex-specific gene expression levels as an example, the authors do a careful analysis to illustrate the issue of mislabeling. ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Freedman LP. Reviewer Report For: Whose sample is it anyway? Widespread misannotation of samples in transcriptomics studies [version 2; peer review: 2 approved, 1 approved with reservations]. F1000Research 2016, 5:2103 (https://doi.org/10.5256/f1000research.10200.r15949)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 29 Mar 2017
    Lilah Toker, Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4, Canada
    29 Mar 2017
    Author Response
    We thank the reviewer for the kind words. The reason we believe this is ‘the tip of the iceberg’ is that we were only able to detect cases where the ... Continue reading
COMMENTS ON THIS REPORT
  • Author Response 29 Mar 2017
    Lilah Toker, Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4, Canada
    29 Mar 2017
    Author Response
    We thank the reviewer for the kind words. The reason we believe this is ‘the tip of the iceberg’ is that we were only able to detect cases where the ... Continue reading

Comments on this article Comments (1)

Version 2
VERSION 2 PUBLISHED 30 Sep 2016
Revised
  • Reader Comment 30 Sep 2021
    Anita Bandrowski, UCSD, San Diego, CA, USA
    30 Sep 2021
    Reader Comment
    Dear Paul,

    This is a very interesting read and I think that I now understand the data to some extent. It is amazing how a small error in one ... Continue reading
Alongside their report, reviewers assign a status to the article:
Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
Sign In
If you've forgotten your password, please enter your email address below and we'll send you instructions on how to reset your password.

The email address should be the one you originally registered with F1000.

Email address not valid, please try again

You registered with F1000 via Google, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Google account password, please click here.

You registered with F1000 via Facebook, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Facebook account password, please click here.

Code not correct, please try again
Email us for further assistance.
Server error, please try again.