Evaluate possible citation bias

Studies reporting significant associations between a complex phenotype and variation in a gene or other DNA element are common, but they frequently fail to yield robust results. A significant association found in one population by one lab may not be repeated in a different population or by a different research group. Thus, these types of studies require considering alternate hypotheses. Is the correlation between gene and phenotype significant because there is a real cause-effect relationship (what we hope for), or is it an artifact of experimental design? One possible explanation for why results from these studies are not reproducible is a limitation of experimental design: selection bias. Some genes are studied more than others, so an association may appear simply because more work has been done on that gene, not because there is a real relationship between the phenotype and the gene. There are two broad reasons why some genes receive more attention. One is historical: for example, probes were available, or the gene product was already well-studied in biochemistry. Alternatively, there may be good evidence to support a causal role.
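To see why the amount of work done on a gene matters, consider a gene with no real effect on the phenotype. The short sketch below computes the chance that at least one of n association tests comes out "significant" by chance alone, using 1 - (1 - alpha)^n. It rests on simplifying assumptions that are not part of the protocol: the tests are independent, each uses the same significance level (alpha = 0.05 here), and the study counts 1, 10, and 50 are arbitrary illustrations.

```python
def prob_at_least_one_false_positive(n_studies: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent tests of a gene with
    no real effect is declared significant at level alpha."""
    if n_studies < 0:
        raise ValueError("n_studies must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Each null test is non-significant with probability (1 - alpha);
    # all n are non-significant with probability (1 - alpha) ** n.
    return 1.0 - (1.0 - alpha) ** n_studies


# Illustrative study counts only (assumed, not taken from any database).
for n in (1, 10, 50):
    print(f"{n:>3} studies: {prob_at_least_one_false_positive(n):.3f}")
```

Under these assumptions, a gene tested many times is far more likely to pick up at least one spurious "hit" than a gene tested once, which is the pattern citation bias can produce.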

So, in addition to reporting your phenotype of interest (POI) and your gene of interest (GOI), I want you to investigate possible citation bias. Is your gene popular, or is it a rarely studied gene? We can gather this information from two sources:

HuGE Literature Finder: https://phgkb.cdc.gov/PHGKB/startPagePubLit.action

From the web site: The CDC Public Health Genomics and Precision Health Knowledge Base (PHGKB) is an online, continuously updated, searchable database of published scientific literature, CDC resources, and other materials that address the translation of genomics and precision health discoveries into improved health care and disease prevention.

Protocol:

1. GOI citations: Find the “Enter a search term” box and enter your gene name, e.g., HIF1A, then click the “Search” button (Fig. 4, red arrow).

Screenshot HuGE Literature Finder, Enter a search term

Figure 4. Screenshot HuGE Literature Finder

2. Select Gene from the “Filtered By:” drop-down menu to filter results (Fig. 5).

Screenshot choose Filter By: with "Gene" highlighted

Figure 5. Screenshot HuGE Literature Finder, select Filtered By: Gene

3. Scroll down the results page and find your gene (Fig. 6). Report the number of citations for your gene. For our example, HIF1A was cited 79 times in the CDC database.

Screenshot results HuGE, filtered by "Gene"

Figure 6. Screenshot results page from HuGE Literature Finder, select Filtered By: Gene

4. POI citations: Find the “Enter a search term” box and enter your phenotype, e.g., Blood pressure, then click the “Search” button (see Fig. 4).

5. Select Disease from the “Filtered By:” drop-down menu to filter results (Fig. 5).

6. On the results page (Fig. 7), find and report the number of disease terms associated with your phenotype. For our example, Blood pressure was indexed 401 times in the CDC database.

Screenshot results page HuGE search, Filtered By: Disease

Figure 7. Screenshot results page from HuGE Literature Finder, select Filtered By: Disease
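Because the database is continuously updated, the counts you report are only meaningful alongside the date of the search. As a minimal sketch of one way to record them, the snippet below stores each count with its search term, filter, and date. The class name `HugeCount` and its fields are hypothetical, introduced only for illustration; the counts are the example values from the protocol above, and the search date is simply taken as today's date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class HugeCount:
    """One count read off the HuGE Literature Finder results page
    (hypothetical record type, for illustration only)."""
    term: str          # search term, e.g. a gene or phenotype name
    filtered_by: str   # "Gene" or "Disease"
    count: int         # number reported on the results page
    searched_on: date  # date the search was run

    def report_line(self) -> str:
        return (f"{self.term} (Filtered By: {self.filtered_by}): "
                f"{self.count} as of {self.searched_on.isoformat()}")


# Example values from the protocol; replace with your own GOI and POI.
today = date.today()
for record in (HugeCount("HIF1A", "Gene", 79, today),
               HugeCount("Blood pressure", "Disease", 401, today)):
    print(record.report_line())
```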

 

Wikipedia Pageview: Described at https://en.wikipedia.org/wiki/Wikipedia:Pageview_statistics

For each Wikipedia page you can get statistics about how popular the page is, at least over the past 30 days.

Protocol:

1. GOI page views: Find the Wikipedia page for your gene, e.g., HIF1A, then find the Page information link (Fig. 8, red arrow).

Screenshot Wikipedia, Page information link

Figure 8. Screenshot Wikipedia page for HIF1A, red arrow points to Page information link

2. From the Information page, find “Page views in the past 30 days” (Fig. 9). For our example, HIF1A was viewed 4731 times in the past 30 days. Report the date of your search, e.g., 20 Feb 2021.

Screenshot Wikipedia Information page for HIF1A

Figure 9. Screenshot Wikipedia Information for HIF1A
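The same kind of number can also be retrieved programmatically. The sketch below is one possible approach, not part of the protocol above: it builds a request for the Wikimedia Pageviews REST API (per-article, daily granularity) and sums the daily views over a 30-day window ending on a chosen date. The function names, the User-Agent string, and the choice of the "user" agent filter are assumptions made for illustration. The total may not match the Page information figure exactly, since the two may count traffic differently.

```python
import json
import urllib.parse
import urllib.request
from datetime import date, timedelta

API = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article: str, end: date, days: int = 30,
                  project: str = "en.wikipedia") -> str:
    """Build the per-article request URL for a window of `days` ending on `end`."""
    start = end - timedelta(days=days - 1)
    title = urllib.parse.quote(article.replace(" ", "_"), safe="")
    return (f"{API}/{project}/all-access/user/{title}/daily/"
            f"{start:%Y%m%d}/{end:%Y%m%d}")


def total_views(payload: dict) -> int:
    """Sum the daily 'views' values in a Pageviews API response."""
    return sum(item["views"] for item in payload.get("items", []))


def fetch_total_views(article: str, end: date, days: int = 30) -> int:
    """Query the API (needs network access) and return the summed views."""
    request = urllib.request.Request(
        pageviews_url(article, end, days),
        # Placeholder User-Agent; identify your own script here.
        headers={"User-Agent": "citation-bias-exercise/0.1 (course use)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return total_views(json.load(response))


# Example call (not run here because it needs network access):
#   fetch_total_views("HIF1A", date(2021, 2, 20))
```

As with the manual protocol, report the end date of the window along with the total, since page views change from day to day.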