Welcome back to Peking University MOOC Bioinformatics: Introduction and Methods.
This video is supplementary learning material 2: the KOBAS demo.
I'm Chen Xie from National Institute of Biological Sciences, Beijing.
Go to the KOBAS web page at kobas.cbi.pku.edu.cn. KOBAS has two main programs, Annotate and Identify.
Click Annotate to go to the program page. Annotate annotates input genes to pathways, diseases, and GO terms.
Let us try the Gene ID example.
The input can be a file on the KOBAS web server,
content in the clipboard below, or a file from local disk.
In this example, it is content in the clipboard.
You also need to choose the species to annotate against; in this example, it is human.
After filling in the location and name of the output file, click Run.
In the result, the first column represents input IDs, and the second and third columns represent mapped KEGG gene IDs and names.
Detailed information about the mapped pathways, diseases, and GO terms is shown after clicking details,
and clicking a Gene ID opens a web page with further information about that gene on the KEGG website.
The result can be downloaded to local disk.
Then we click Use this file as Identify's Input to perform the next analysis on the program page.
Identify performs enrichment analysis for pathways, diseases, and GO terms.
After clicking Show available databases for this sample file,
we can choose databases. Here, because of the running time limit, we choose only four databases.
The background for the statistical test can be the Annotate result for a background gene set that contains the input genes,
or the default background, which is the annotation of all genes in the whole genome. In this example, it is the default background.
You can choose the statistical method and the FDR correction method, and set the small term cutoff, in Options for statistics.
After filling in the location and name of the output file, click Run.
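To make the two statistical options more concrete, here is a minimal sketch of what an enrichment test and an FDR correction compute. It assumes a hypergeometric test and the Benjamini-Hochberg procedure, which are common choices for this kind of analysis; the demo does not say which methods are selected, and the function names below are hypothetical and are not part of KOBAS.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric p-value, P(X >= k).

    k: input genes annotated to the term
    n: total input genes
    K: background genes annotated to the term
    N: total background genes
    (Assumed test; the demo does not name the method used.)
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In this sketch, the choice of background determines K and N: a smaller, custom background changes both counts and therefore the p-value of every term.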
In the result, the first three columns are the name, database, and ID of a term;
the fourth column is the input number of that term, and the fifth column is the background number;
the sixth and seventh columns are the P-Value and Corrected P-Value, respectively.
Information about the term in its original database can be accessed by clicking the ID,
and the genes mapped to that term appear after clicking the input number.
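If you save the Identify result table, the seven columns described above can be filtered with a few lines of code. The following is a rough sketch only: it assumes the table has been saved as tab-separated text with the columns in the order just described, which the demo does not confirm, and the column names, the function name, and the placeholder rows are hypothetical.

```python
import csv
import io

# Column order as described in the demo; the names themselves are made up here.
COLUMNS = ["term", "database", "term_id", "input_number",
           "background_number", "p_value", "corrected_p_value"]

# Placeholder rows for illustration only, not real KOBAS output.
EXAMPLE = (
    "term_A\tdb_X\tid_1\t5\t100\t0.0001\t0.002\n"
    "term_B\tdb_X\tid_2\t2\t300\t0.2\t0.4\n"
)


def significant_terms(tsv_text, cutoff=0.05):
    """Return rows whose corrected p-value is at or below the cutoff."""
    hits = []
    for fields in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        # Skip blank, short, or comment lines.
        if len(fields) < len(COLUMNS) or fields[0].startswith("#"):
            continue
        row = dict(zip(COLUMNS, fields))
        try:
            corrected = float(row["corrected_p_value"])
        except ValueError:
            continue  # most likely a header row
        if corrected <= cutoff:
            hits.append(row)
    # Most significant terms first.
    hits.sort(key=lambda r: float(r["corrected_p_value"]))
    return hits
```

Calling `significant_terms(EXAMPLE)` keeps only the placeholder row whose corrected p-value is below 0.05.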
Registered users can save data and analysis history on the KOBAS web server.
Users can also download the KOBAS standalone command line version, and install and run it locally.
The following demo of the command line programs produces the same results as the web version above.