Intrepid Bioinformatics’ system offers a variety of features for managing your genotype and SNP data.  We organize data for meaningful use, so it’s not simply a matter of putting datasets onto a disk somewhere to be stored and backed up.  We’ve put together a system that lets users load NGS, SNP and genotype sequence data into a data management system and then “slice and dice” it in whatever way is appropriate to answer the questions their hypotheses raise.

Gene Search

The gene browser is available to all users – public and private – and is used to view the genomes of organisms uploaded to the system.  Public users are able to view several genomes made publicly available.  Private users can upload their own genomes and other genetic data and use the browser to view their data online.

Several search options are available to view your data.  The gene search allows for searching based on species, gene name, or Entrez ID number.  Other searches include accession number, genomic region, and sample searches, all of which incorporate your private and public data into the results.

Multiple assemblies/builds of genomes can be uploaded and viewed.  In this example, we will search for the prion protein gene (PRNP) in humans.  We’ve expanded the gene and will select build 36.3.


Selecting the build you desire will open up our Gene Browser where genomic data can be viewed.  The SNP track across the top displays splice isoforms that have been identified for this Prion protein in humans along with repeats, exons and introns.  Custom tracks that can be mapped to genomic coordinates can be added upon user request.  You can zoom into specific sections and click on them for additional details.

Below the cartoon is a text-based representation of every feature in the cartoon.  Where we have more information, you’ll see a nonzero number in the “Genotype Count” column.

Clicking on the first polymorphism where we have genotypes (446 of them) and expanding it displays the allele frequencies across the populations for which we have data.  Clicking on any sub-population bar shows the actual genotype data for that population; here, we view the data derived from the Illumina SNP chip for the Caucasian European population.
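
To make the per-population allele frequencies concrete, here is a minimal sketch of the arithmetic in Python.  It is not Intrepid’s implementation: the function name, the “A/B” genotype string format, the “OTHER” population label and all genotype values are assumptions made up for illustration.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} for diploid genotype strings such as "C/T".

    Missing calls are represented as None and skipped (an assumption of
    this sketch, not a documented convention of the system).
    """
    counts = Counter()
    for genotype in genotypes:
        if genotype is None:
            continue
        counts.update(genotype.split("/"))
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()} if total else {}

# Hypothetical genotype calls grouped by population (illustrative data only).
calls_by_population = {
    "CEU": ["C/C", "C/T", "T/T", "C/T"],
    "OTHER": ["C/C", "C/C", "C/T", None],
}

frequencies = {
    population: allele_frequencies(calls)
    for population, calls in calls_by_population.items()
}
```

Each frequency is simply the number of times an allele is observed divided by the total number of called alleles in that population, which is what each bar in a frequency chart summarizes.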

Enough information is available to know that the first 3 records are related as daughter, mother, father.  Clicking on the genotype details for the father (3rd row), we can see what more is known and what the genotype call was.

In this case we have companion NGS data for this individual and can, in real time, access the BAM file, find the position in the genome that corresponds to this polymorphism, and from the reads that map to that position, extract the alleles we see there.

What we see here is reassuring that the allele counts substantiate the notion that this is a C/T heterozygote for the individual.
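
The idea of extracting alleles from the reads that cover a position can be sketched as follows.  This is a simplified illustration, not the system’s code: reads are modeled as hypothetical (start, sequence) pairs with gapless alignments, whereas a real BAM reader must also handle CIGAR operations, base qualities and mapping qualities.  The 20% threshold in `looks_heterozygous` is an arbitrary assumption.

```python
from collections import Counter

def allele_counts_at(reads, position):
    """Count the bases that aligned reads show at a 1-based reference position.

    Each read is (start, sequence), where start is the 1-based reference
    coordinate of the read's first base.  Alignments are assumed gapless.
    """
    counts = Counter()
    for start, sequence in reads:
        offset = position - start
        if 0 <= offset < len(sequence):
            counts[sequence[offset]] += 1
    return counts

def looks_heterozygous(counts, min_fraction=0.2):
    """True if the two most common alleles each reach min_fraction of the reads."""
    total = sum(counts.values())
    top_two = counts.most_common(2)
    return len(top_two) == 2 and all(n / total >= min_fraction for _, n in top_two)

# Hypothetical reads around reference position 104 (illustrative data only).
reads = [
    (100, "ACGTC"),  # covers 100-104
    (102, "GTTAA"),  # covers 102-106
    (104, "CAAGG"),  # covers 104-108
    (101, "CGTTA"),  # covers 101-105
    (90, "ACGT"),    # covers 90-93, does not reach the position
]

counts = allele_counts_at(reads, 104)
supports_het_call = looks_heterozygous(counts)
```

When both alleles are well represented among the reads, as in this toy example, the counts support a heterozygous call.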

Polymorphism Search

Search can also be conducted by polymorphism.  For this example, we will type “rs2261567” in the search field and click search.

Expanding the 2nd row for position 61 displays a polymorphism where the minor allele frequency is quite large, and reversed in some populations.

Clicking on the CEU sub-population bar in the image above directs us to the genotypes for the polymorphism where we notice something interesting.  We know the relationship of the first three to be daughter, mother, father unequivocally because we have enough SNP chip data.

For this genotype assay, the genotypes do not obey Mendelian inheritance.  We can dig deeper by clicking the father’s genotype details.
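
A Mendelian consistency check for a trio can be sketched in a few lines.  The function name and the “A/B” genotype format are assumptions, and the daughter and mother genotypes below are hypothetical; the walkthrough only states the father’s call.

```python
def is_mendelian_consistent(child, mother, father):
    """True if the child's genotype can be formed from one maternal
    and one paternal allele (autosomal, diploid, no de novo mutation)."""
    c1, c2 = child.split("/")
    maternal = set(mother.split("/"))
    paternal = set(father.split("/"))
    return (c1 in maternal and c2 in paternal) or (c2 in maternal and c1 in paternal)

# Hypothetical trio: a T/T daughter cannot inherit a T from a C/C father.
with_called_genotype = is_mendelian_consistent("T/T", "C/T", "C/C")
# If the father is actually heterozygous, the trio becomes consistent.
with_het_father = is_mendelian_consistent("T/T", "C/T", "C/T")
```

A failed check like the first one does not say which of the three calls is wrong; it only flags the trio for closer inspection.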

We see a C/C call, but the allele counts derived from the NGS data on this individual show strong evidence that the father is heterozygous, which is what Mendelian inheritance requires.

Clicking on “View in IGV” allows us to drill down further and learn why the assay did not produce the correct genotype at this locus.  This will open up our modified version of the Broad Institute’s Integrative Genomics Viewer (http://www.broadinstitute.org/igv/).

We click “Load” and are taken to the corresponding position, where we see that the individual is heterozygous.  Three bp upstream of the position where we find the T’s, there is another T that differs from the reference allele, and 10 bp further upstream is a C that also differs from the reference.  This suggests that we’re seeing allelic dropout: the probe designed to genotype this position failed to detect one allele because that allele differs from the reference sequence against which the assay would have been designed.
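
As a rough illustration of the reasoning above, the sketch below lists non-reference bases just upstream of a target position, the kind of nearby variation that can interfere with probe binding.  Everything here is an assumption for illustration: the function name, the 25-base window, treating “upstream” as lower coordinates, and the toy sequences, which are not the actual PRNP region.

```python
def variants_near_target(reference, observed, target_index, window=25):
    """List (offset, ref_base, observed_base) for mismatches in the
    `window` bases immediately upstream of target_index (0-based).

    `reference` and `observed` are assumed to be equal-length strings
    covering the same coordinates.
    """
    start = max(0, target_index - window)
    return [
        (i - target_index, reference[i], observed[i])
        for i in range(start, target_index)
        if reference[i] != observed[i]
    ]

# Toy reference and one observed haplotype (illustrative data only).
reference = "AGGAGGAGGAGGAGGCAGGA"
haplotype = list(reference)
haplotype[15] = "T"  # the genotyped polymorphism itself
haplotype[12] = "T"  # 3 bp upstream of the target
haplotype[2] = "C"   # 10 bp further upstream
observed = "".join(haplotype)

nearby = variants_near_target(reference, observed, target_index=15)
```

Any mismatch reported inside the probe footprint is a candidate explanation for a dropped allele, which can then be confirmed visually in the viewer.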

Gene Browser

Selecting a build will launch the gene browser.

Gene browser

Additionally, complete polymorphism information is displayed, detailing position, feature type, alleles, and other relevant information.

Polymorphisms

Each polymorphism can also be expanded to show its confirmations.

Poly confirmations

Polymorphism details

File and Data Management

Uploading data can be done simply by dragging and dropping supported file types into the upload tool.

File upload

Uploaded files are stored in their original format for easy retrieval.  Additionally, these files can be shared with other users of the system, or even made publicly available to everyone if you wish.  Behind the scenes, files are parsed so they can be viewed in detail in the gene browser.

Share files

Security and Collaboration

Finally, Intrepid’s system lets you allow other users to view your data.  By default, all data is private to you.  However, you may choose to add collaborators, who can then view the data you make available to them.  Only the files you choose are shared, so you can have multiple collaborators, each seeing only what you have specifically designated for them.

Collaboration