Help: Data Analysis - Upload your own data

  1. Choose the type of analysis you want to perform from the Data Analysis menu (Pathway, Gene Ontology, Network, Interactor, TFBS, eQTL), then click the "Upload File" button to upload a tab-delimited file of protein/gene identifiers or accession numbers (human, murine or bovine gene/protein identifiers only).
    Alternatively, click the "Web Form" button and paste your tab-delimited data into the text box (max. 1000 lines).

    Note: There should be only one accession number per row. Probes that map to multiple genes should be removed.

    Accession numbers from the following databases are currently accepted:
    • Ensembl
    • RefSeq
    • Entrez
    • UniProt
    • InnateDB/AAP (gene IDs only)
    For eQTL analyses, you can also upload a list of SNPs using dbSNP identifiers.
    We strongly recommend using Ensembl identifiers, since they have a one-to-one mapping to InnateDB/AAP gene identifiers. Identifiers that map to multiple genes (e.g. some UniProt identifiers) will be ignored.
  2. Click on the column headers to specify which column in your data file contains the identifiers/accession numbers for each gene (and which database they come from). This is called the "Cross-reference ID".
    You can specify only one cross-reference ID column. Please note that when using identifiers from InnateDB/AAP, only gene IDs are allowed, not interaction IDs.
  3. Specify the Cross-reference database. This is the database where the identifiers in the cross-reference column come from.
  4. If you have included gene expression data, identify which columns contain the gene expression values and their associated p-values.
    You may also identify the column containing the probe IDs if you have included them in your file.
    Including quantitative data such as gene expression values is optional, but it is a very useful way to investigate such data in a pathway and interaction network context and to carry out subsequent analyses such as Pathway Over-representation Analysis. The expression values in your file are mapped to the molecules identified by the cross-reference IDs.
    Expression values must be given as signed fold changes: a value of +2 represents a 2-fold increase in expression and a value of -2 represents a 2-fold decrease in expression.
    You can specify values from up to ten different conditions or time-points. You can also specify a name for each condition.
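
As an illustration of the file format described in the steps above, the following Python sketch prepares tab-delimited upload data. It is only a sketch under stated assumptions: the input layout (a tuple of probe ID, list of mapped gene IDs, linear expression ratio and p-value), the function names and the column order are hypothetical choices made for this example and are not part of InnateDB, and the identifiers in the usage note are illustrative placeholders. It drops probes that map to more than one gene, keeps one accession number per row, and converts linear ratios to the signed fold-change format (+2 for a 2-fold increase, -2 for a 2-fold decrease).

```python
import csv
import io

# Limit stated in the help text for pasting data into the "Web Form" text box.
WEB_FORM_MAX_LINES = 1000


def ratio_to_signed_fold_change(ratio):
    """Convert a linear ratio (e.g. treated / control) to a signed fold change.

    A ratio of 2.0 becomes +2.0 and a ratio of 0.5 becomes -2.0.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def build_upload_rows(records):
    """Return one row per probe that maps to exactly one gene.

    `records` is an iterable of (probe_id, gene_ids, ratio, p_value) tuples;
    this in-memory layout is an assumption made for the example.
    """
    rows = []
    for probe_id, gene_ids, ratio, p_value in records:
        if len(gene_ids) != 1:
            # Probes mapping to no gene or to several genes are left out.
            continue
        fold_change = ratio_to_signed_fold_change(ratio)
        rows.append([gene_ids[0], probe_id, fold_change, p_value])
    return rows


def to_tab_delimited(rows):
    """Serialise rows as tab-delimited text, one accession number per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def fits_web_form(rows):
    """Check whether the data is short enough to paste into the web form."""
    return len(rows) <= WEB_FORM_MAX_LINES
```

For example, given the records `("probe_1", ["ENSG00000136244"], 4.0, 0.001)`, `("probe_2", ["ENSG00000232810", "ENSG00000204490"], 2.0, 0.01)` and `("probe_3", ["ENSG00000125538"], 0.25, 0.02)`, the second record is skipped because its probe maps to two genes, and the third is written with a fold change of -4.0. In the resulting file, the first column would be selected as the cross-reference ID, the second as the probe ID, the third as the expression value and the fourth as the p-value.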

Filter the Network Analysis results
You can choose to filter the results by using one of the following methods:

  • Do not filter the results
    This will return all interactions that involve genes/proteins in the uploaded list.
  • Only show interactions between uploaded molecules
    This will ONLY return interactions BETWEEN genes/proteins in the user-uploaded list. For example, if molecule A interacts with B and C but only A and B are in your file, the interaction between A and C will not appear in the returned results.
    This is very useful for constructing a network of interactions only between molecules in the uploaded list (e.g. differentially expressed genes).
  • Filter for interactions in pathway
    This option limits the interactions returned to a particular pathway. You can search for any of the pathways from all data sources by typing the name of the pathway in the text box and selecting one of the suggested choices.
  • Include orthologous interactions
    Checking this box will return interactions that have been inferred via orthology in other species (human, mouse & cow only).
  • Return InnateDB-curated interactions only
    This will limit the results returned to only interactions that have been annotated by the InnateDB curation team.
  • Only return InnateDB-curated interactions relevant to Allergies and Asthma
    This will limit the results returned to only interactions relevant to Allergies and Asthma that have been annotated by the InnateDB curation team.
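
The difference between the first two filter options above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not InnateDB's implementation: the function name, the `between_uploaded_only` parameter and the representation of an interaction as a pair of molecule names are assumptions made for this example.

```python
def filter_interactions(interactions, uploaded, between_uploaded_only=False):
    """Select interactions according to the uploaded molecule list.

    interactions: iterable of (molecule_a, molecule_b) pairs.
    uploaded: molecules present in the uploaded file.
    between_uploaded_only: if False, keep every interaction that involves
        at least one uploaded molecule ("Do not filter the results");
        if True, keep only interactions whose two participants are both
        uploaded ("Only show interactions between uploaded molecules").
    """
    uploaded = set(uploaded)
    kept = []
    for a, b in interactions:
        if between_uploaded_only:
            keep = a in uploaded and b in uploaded
        else:
            keep = a in uploaded or b in uploaded
        if keep:
            kept.append((a, b))
    return kept


# Molecule A interacts with B and C, but only A and B were uploaded.
interactions = [("A", "B"), ("A", "C"), ("C", "D")]
uploaded = ["A", "B"]

unfiltered = filter_interactions(interactions, uploaded)
between_only = filter_interactions(interactions, uploaded, between_uploaded_only=True)
```

Here `unfiltered` contains the A-B and A-C interactions, while `between_only` contains only A-B, matching the example given for the second option. The C-D interaction is absent from both because neither participant is in the uploaded list.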