Major Bioinformatics Sites: KEGG

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database of pathways and related information. Visit the site at: http://www.genome.ad.jp/kegg/

You will find a page which summarizes the function of KEGG.

You will see that, in addition to a main table of contents, KEGG has several entry points.

Explore each link to find out what the database contains.

Now click the headings below to explore some of KEGG's features.

Click KEGG PATHWAY to access the pathway database. Here you will see pathways listed in 7 groups:

  1. Metabolism
  2. Genetic Information Processing
  3. Environmental Information Processing
  4. Cellular Processes
  5. Organismal Systems
  6. Human Diseases
  7. Drug Development

Under Metabolism, click Carbohydrate, and then Glycolysis / Gluconeogenesis. This takes you to https://www.genome.jp/pathway/map00010, where you will see the reference metabolic pathway map.

This page shows all the reactions of these pathways, together with their connections to related pathways. Clicking a related pathway opens a new page displaying it.

The boxes containing numbers such as 3.1.3.11 are Enzyme Commission (EC) numbers representing reactions carried out by enzymes.
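An EC number has four dot-separated fields that run from the broad enzyme class down to a serial number for the individual enzyme. The following is a minimal Python sketch of how such a number can be broken down. The function name describe_ec and the dictionary keys are illustrative choices, not part of KEGG.

```python
# Top-level Enzyme Commission classes (first field of an EC number).
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def describe_ec(ec_number):
    """Split an EC number such as '3.1.3.11' into its four levels.

    Hypothetical helper for illustration only.
    """
    parts = ec_number.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec_number!r}")
    return {
        "class": EC_CLASSES[int(parts[0])],   # broad reaction type
        "subclass": ".".join(parts[:2]),      # first two fields
        "sub_subclass": ".".join(parts[:3]),  # first three fields
        "serial": parts[3],                   # individual enzyme entry
    }


print(describe_ec("3.1.3.11"))
```

Applied to the example in the text, the first field (3) places the enzyme among the hydrolases.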

The reference pathway includes enzymes from all organisms. At the top is a button labelled 'Change Pathway Type' where you can select a specific organism. Choose Homo sapiens and click Go. The map changes to one where certain EC number boxes are coloured in green. These are the enzymes present in humans.

Now view the enzymes present in Mycoplasma genitalium - it doesn't matter which strain you select.
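Organism-specific maps in KEGG use the same five-digit pathway number as the reference map, with the "map" prefix replaced by an organism code (hsa for Homo sapiens). The sketch below builds such an identifier and a page address from it. The URL pattern is an assumption copied from the reference map address given above, and the helper names are hypothetical. Look up the code for your chosen M. genitalium strain in KEGG's organism list rather than guessing it.

```python
import re


def organism_pathway_id(reference_id, org_code):
    """Turn a reference map ID (e.g. 'map00010') into an organism-specific ID."""
    match = re.fullmatch(r"map(\d{5})", reference_id)
    if match is None:
        raise ValueError(f"not a reference pathway ID: {reference_id!r}")
    return f"{org_code}{match.group(1)}"


def pathway_url(pathway_id):
    # Assumed URL pattern, mirroring the reference map address in the text.
    return f"https://www.genome.jp/pathway/{pathway_id}"


human_id = organism_pathway_id("map00010", "hsa")
print(human_id)
print(pathway_url(human_id))
```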

Comparing the pathway maps for human and M. genitalium, record which species has more enzymes for glycolysis/gluconeogenesis.
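If you note down the coloured EC numbers from each map, the comparison is a matter of set arithmetic. The sketch below shows one way to do it in Python. The function name is hypothetical and the labels in the example are made-up placeholders, not values read from KEGG, so substitute the EC numbers you record yourself.

```python
def compare_enzymes(ec_a, ec_b):
    """Compare two collections of EC numbers recorded from two pathway maps."""
    a, b = set(ec_a), set(ec_b)
    return {
        "shared": sorted(a & b),   # present in both species
        "only_a": sorted(a - b),   # present only in the first species
        "only_b": sorted(b - a),   # present only in the second species
    }


# Placeholder labels standing in for EC numbers; not real KEGG data.
species_a = ["EC-1", "EC-2", "EC-3", "EC-4"]
species_b = ["EC-2", "EC-3"]

result = compare_enzymes(species_a, species_b)
print(len(species_a), "enzymes in A;", len(species_b), "enzymes in B")
print(result)
```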

Locate the reaction carried out by phosphofructokinase - this enzyme is present in M. genitalium.

If you don't know which this is, think about the name. What does a kinase do? In this enzyme, what substrate does it act on?

The reaction catalysed by the enzyme is indicated by a box with a number in it. This is the Enzyme Commission (EC) Number. The EC number boxes are clickable - you can check the name of the enzyme by clicking the box.

Record the EC number for phosphofructokinase.

Return to the reference pathway. All the EC number boxes will now have white backgrounds.

Click on the EC number box for phosphofructokinase (pfkA and pfkB).

This takes you to a page describing the enzyme, with links to more detailed related information. Everything in blue can be clicked, leading to further information including substrates, products, effectors, inhibitors, pathways, genes, diseases and even structures.

The page is split into several tables, each of which represents an orthologous family of genes. Orthologues are homologous genes that diverged as a result of a speciation event, so they generally represent the same gene in different species. However, convergent evolution can result in several different genes encoding proteins with the same activity. In the case of phosphofructokinase there are multiple tables, representing multiple orthologous families and indicating that multiple genes encode proteins with the same activity.

Under the genes category within each orthologue table you will find a list of species, each indicated by a 3-letter abbreviation. Use this to find which of the orthologous gene families (all of which encode an enzyme with phosphofructokinase activity) are present in E. coli (search for ECO:). You may need to click the '>> show all' link to expand the box(es) - this can take some time, so be patient.

Record the number of phosphofructokinase genes present in E. coli.


Next to the ECO: label, you will find link(s) to the gene(s). The letter(s) and number(s) represent an accession code in a sequence database. If you click one of these links, you will go to a page with details of that sequence including cross-links to other databases.
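If you copy a gene list line out of the page, the organism label and gene entries can be separated with a few lines of Python. This is a sketch only: it assumes the line has the form "ORG: id(name) id(name)", with the gene name in brackets being optional, and the sample line uses invented placeholder identifiers rather than real E. coli entries.

```python
def parse_genes_line(line):
    """Split a line like 'ORG: id1(nameA) id2' into (organism, [(id, name), ...]).

    Hypothetical helper; the line format is an assumption about the page layout.
    """
    org, _, rest = line.partition(":")
    genes = []
    for token in rest.split():
        gene_id, _, name = token.partition("(")
        # Strip the closing bracket; entries without a name get None.
        genes.append((gene_id, name.rstrip(")") or None))
    return org.strip(), genes


# Placeholder line, not copied from KEGG.
org, genes = parse_genes_line("XXX: g0001(abcA) g0002")
print(org, len(genes), genes)
```

Counting the entries returned for the ECO line gives the number of genes listed for E. coli in that table.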

The sequence database you accessed earlier is known as UniProt.

Record the UniProt accession code for the E. coli pfkA gene.

Record the number of amino acids in the E. coli pfkA protein.

Record (or calculate) the number of bases (nucleotides) in the open reading frame (ORF) which encodes the pfkA protein (including the stop codon).
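The calculation follows from the genetic code: each amino acid is encoded by one codon of three bases, and the stop codon adds one more codon that is not translated. A minimal Python sketch is shown below. The function name is hypothetical and the 100-residue protein is an invented example, not the pfkA answer.

```python
def orf_length_bases(n_amino_acids, include_stop=True):
    """Return the ORF length in bases for a protein of n_amino_acids residues."""
    codons = n_amino_acids + (1 if include_stop else 0)  # stop codon is one extra codon
    return codons * 3                                    # three bases per codon


# Invented example: a 100-residue protein.
print(orf_length_bases(100))                      # 303
print(orf_length_bases(100, include_stop=False))  # 300
```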
