TRANSFAC® Release 7.0 - Documentation

Site: Criteria

The first criterion for a site to be included in TRANSFAC® is protein binding, the second is function. Each site is assigned an unambiguous accession number and an identifier. The identifier is composed of an abbreviation for the species (e.g., HS for human), a code for the gene description, and a consecutive number for each entry referring to a particular gene. Thus, HS$BAC_02 refers to the 2nd entry for the human gene for beta-actin.
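The identifier layout described above can be illustrated with a minimal Python sketch. The function name `parse_site_identifier`, the `SiteIdentifier` type, and the regular expression are illustrative assumptions, not part of TRANSFAC®; the pattern is inferred only from the HS$BAC_02 example, so real identifiers may need a looser match.

```python
import re
from typing import NamedTuple


class SiteIdentifier(NamedTuple):
    species: str  # species abbreviation, e.g. "HS" for human
    gene: str     # code for the gene description, e.g. "BAC"
    number: int   # consecutive entry number for that gene


# Assumed layout: <species>$<gene code>_<number>, inferred from "HS$BAC_02".
_ID_PATTERN = re.compile(r"^([A-Za-z0-9]+)\$(.+)_(\d+)$")


def parse_site_identifier(identifier: str) -> SiteIdentifier:
    """Split a site identifier into species, gene code and entry number."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unexpected identifier layout: {identifier!r}")
    species, gene, number = match.groups()
    return SiteIdentifier(species, gene, int(number))


print(parse_site_identifier("HS$BAC_02"))
# SiteIdentifier(species='HS', gene='BAC', number=2)
```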

The description of a gene is the name of the gene itself or of its product, whichever is the more common term.

The positions have preferably been taken from DNase I footprinting studies, if available. The next preference is for chemical modifications, the last for gel retardation assays. When the positional information differs between the two DNA strands, the more upstream position has been taken for the 5' border and the more downstream position for the 3' border of the site. Unless stated otherwise in the S1 field, the position numbers refer to the transcription start site. Occasionally (or normally for yeast genes, because their cap sites are generally more heterogeneous), they may refer to the translation start codon, stated as 1:ATG. Other reference systems, such as defined restriction sites, may be indicated as well. If SF and ST are both 0, the references cited give no positions. If SF has a negative or positive value but ST is 0, the references give no precise boundaries for the site; it has instead been located "around position [SF]".
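The SF/ST conventions above can be summarized in a short Python sketch. The function name `describe_site_position` and the wording of the returned strings are hypothetical; only the three cases themselves come from the text. The reference point (S1 field) is left out for brevity.

```python
def describe_site_position(sf: int, st: int) -> str:
    """Interpret the SF (start) and ST (end) values of a site entry."""
    if sf == 0 and st == 0:
        # Neither value is set: the cited references give no positions.
        return "no positions given"
    if st == 0:
        # Only SF is set: the site is located approximately, without borders.
        return f"around position {sf}"
    # Both values are set: SF is the 5' border, ST the 3' border.
    # (SF = 0 with a non-zero ST is not covered by the text and also ends up here.)
    return f"from {sf} to {st}"


print(describe_site_position(0, 0))       # no positions given
print(describe_site_position(-120, 0))    # around position -120
print(describe_site_position(-120, -98))  # from -120 to -98
```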

The sequences shown have been taken from the literature. Some conflicts with sequences in the EMBL data library are mentioned in the comment field. When the site borders diverge between the two strands, only the overlapping sequence is given. When the authors emphasized a certain sequence motif within a sequence, it is written in capitals while the rest of the sequence is shown in lowercase letters.
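Because emphasized motifs are distinguished only by letter case, they can be recovered from a site sequence with a simple scan for runs of capitals. The following Python sketch assumes the sequence is available as a plain string; the function name `emphasized_motifs` and the example sequence are made up for illustration.

```python
import re
from typing import List, Tuple


def emphasized_motifs(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based offset, motif) for every run of capital letters."""
    return [(m.start(), m.group()) for m in re.finditer(r"[A-Z]+", sequence)]


# Hypothetical site sequence with one emphasized motif.
print(emphasized_motifs("gatcTGACTCAgcta"))
# [(4, 'TGACTCA')]
```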

Cross-references to the EMBL data library also give the positions of the TRANSFAC® site within the EMBL sequence, with negative numbers indicating the complementary strand.

The factor that binds to this sequence element is given with its accession number from the TRANSFAC® FACTOR table, (one of) its name(s) (see the FACTOR table for possible synonyms), and a "quality" value ranging from 1 to 6 that reflects the experimental reliability of a certain protein-DNA interaction. These values have the following meaning:

  1. functionally confirmed factor binding site

  2. binding of pure protein (purified or recombinant)

  3. immunologically characterized binding activity of a cellular extract

  4. binding activity characterized via a known binding sequence

  5. binding of uncharacterized extract protein to a bona fide element

  6. no quality assigned
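Assuming the six descriptions above correspond, in order, to the quality values 1 through 6, a lookup table is enough to translate a value into its meaning. The names `QUALITY_MEANING` and `describe_quality` in this Python sketch are illustrative and not part of TRANSFAC®.

```python
# Assumed mapping: quality values 1-6 in the order the descriptions are listed.
QUALITY_MEANING = {
    1: "functionally confirmed factor binding site",
    2: "binding of pure protein (purified or recombinant)",
    3: "immunologically characterized binding activity of a cellular extract",
    4: "binding activity characterized via a known binding sequence",
    5: "binding of uncharacterized extract protein to a bona fide element",
    6: "no quality assigned",
}


def describe_quality(value: int) -> str:
    """Translate a quality value into its description."""
    if value not in QUALITY_MEANING:
        raise ValueError(f"quality value must be between 1 and 6, got {value}")
    return QUALITY_MEANING[value]


print(describe_quality(2))
# binding of pure protein (purified or recombinant)
```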

The cellular protein source used to identify a particular site is included in the SITE table as well.


