Keegan Hines

Sankey

This Sankey plot visualizes the publications from my department (the Center for Learning and Memory at UT) over the past couple of years, linking each PI to the journals they published in. The thickness of each link corresponds to the number of papers.

The interactive Sankey plot is built with d3 and Google Charts. Here's some Python code for pulling this information from the PubMed API.
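# urllib2 only exists in Python 2; on Python 3, alias urllib.request to the same name
try:
    import urllib2
except ImportError:
    import urllib.request as urllib2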
import xml.etree.ElementTree as ET

class Link:
    """One Sankey link: an (author, journal) pair and the number of papers for it."""
    def __init__(self,auth,jour):
        self.author=str(auth)
        self.journal=str(jour)
        self.count=1
        self.name= str(auth)+str(jour)
        
    def __str__(self):
        # Render the link as one array row: [ 'author', 'journal', count ],
        return "[ '%s', '%s', %d ]," % (self.author, self.journal, self.count)

    def addOne(self):
        self.count=self.count+1
        
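# Search PubMed for papers matching the center's name (at most 100 results)
query=urllib2.urlopen('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=university+of+texas+at+austin+center+for+learning+and+memory&RetMax=100').read()
queryTree=ET.fromstring(query)
# Look up the <IdList> element by name rather than by its position in the response
IDlist=queryTree.find('IdList')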

LinkList=[]
for ID in IDlist:
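    # Fetch the summary record for this PubMed ID
    w=urllib2.urlopen('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id='+str(ID.text)).read()
    root=ET.fromstring(w)
    # Select fields by name: the last author stands in for the PI, Source is the journal
    author=root.find("./DocSum/Item[@Name='LastAuthor']").text
    journal=root.find("./DocSum/Item[@Name='Source']").text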

    # Bump the count if this author/journal pair was already seen; otherwise start a new link
    names=[item.name for item in LinkList]
    key=str(author)+str(journal)
    if key in names:
        LinkList[names.index(key)].addOne()
    else:
        LinkList.append(Link(author,journal))
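
The counting step above can also be sketched with collections.Counter, which keys each link on the (author, journal) tuple instead of a concatenated string. This is only a rough alternative, not the code used for the plot: the function names count_links and to_rows and the sample pairs are made up for illustration, and the row layout simply mirrors the [ 'author', 'journal', count ] rows printed by Link.

```python
from collections import Counter


def count_links(pairs):
    """Count how many papers each (author, journal) pair has."""
    return Counter(pairs)


def to_rows(counts):
    """Turn the counts into [author, journal, count] rows, largest count first."""
    return [[author, journal, n] for (author, journal), n in counts.most_common()]


# Hypothetical sample data standing in for the pairs pulled from PubMed.
sample_pairs = [
    ("Author A", "Journal X"),
    ("Author B", "Journal Y"),
    ("Author A", "Journal X"),
    ("Author A", "Journal Y"),
]

rows = to_rows(count_links(sample_pairs))
for row in rows:
    print(row)
```

Keying on the tuple keeps the author and journal separate, so two different pairs can never collapse into the same concatenated name, and the lookup no longer rebuilds a list of names on every iteration.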