PathoDB - A database on pathological forms of transcription factors and binding sites

Manuela Prüß (mpr@biobase.de)1, Thorsten Meinhardt (thorsten.meinhardt@novasoft.de)2, Edgar Wingender (ewi@biobase.de)1
1 Biobase Biological Databases GmbH, Mascheroder Weg 1b, D-38124 Braunschweig
2 Novasoft AG, Im Weiher 1 - 3, D-69121 Heidelberg

Introduction

The existing databases on transcription factors and binding sites (TRANSFAC, TRRD, Compel, TFD; see Heinemeyer et al., 1999, for an overview) focus mainly on the molecular and/or genetic aspects of the transcriptional machinery and the interaction of its elements. Besides lacking phenotypic information, the data compiled in these databases generally describe the so-called "normal", healthy condition of the respective organism. However, aberrations of transcriptional control due to mutations in its key elements generally cause severe impairments. To gain a more detailed insight into genotype-phenotype correlations, and thereby a deeper understanding of regulatory mechanisms, we established the relational database PathoDB, in which we collect pathological data and model the appropriate relations.

Content and structure of the database

The database encompasses detailed molecular information on mutated transcription factors and regulatory DNA elements (sites), as well as descriptive genotypic and phenotypic data.
In addition, diagnostic and, where available, therapeutic materials and methods are included. Particular attention is given to diagnostic methods for mutation detection, with detailed information on, for example, PCR conditions and sequencing.
The individual types of information are connected via multiple links. To provide access to data beyond PathoDB's primary field of interest, external databases (OMIM and HGMD for human mutations, MGI for mouse mutations) are linked to the genotype and phenotype entries. Internally, the database is closely linked to the TRANSFAC system to enable a broad range of analyses (links to sites and possibly interacting wild type factors, regulatory pathways, cross-comparison of mutated versus wild type entries, etc.).

[Table: Mutated factors and sites and the related phenotypes in PathoDB (current status)]

[Figure: structure of PathoDB, with the following labelled components]
Information on the mutated factor or the mutated binding site, its amino acid or nucleotide sequence, its features and functional properties.
Information on the genomic defect which underlies the mutated factor, and on the related gene.
Information on the clinical outcome which results from the mutated factor or site; in cases of disturbed gene regulation, the consequences are often developmental abnormalities or tumor growth.
Information on the diagnostic methods with which the mutated gene can be detected.
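
The kinds of information described in this section (mutated factor or site, underlying genomic defect, clinical outcome, diagnostic method) and the links between them could be modelled relationally along the following lines. This is only a minimal sketch using Python's built-in sqlite3 module: the actual PathoDB schema is not given here, so every table name, column name and inserted value below is an assumption chosen for illustration.

```python
import sqlite3

# All table and column names are hypothetical; the actual PathoDB schema
# is not published in this text.
SCHEMA = """
CREATE TABLE genotype (            -- genomic defect and related gene
    id     INTEGER PRIMARY KEY,
    gene   TEXT NOT NULL,
    defect TEXT NOT NULL
);
CREATE TABLE mu_factor (           -- mutated transcription factor
    id              INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    genotype_id     INTEGER REFERENCES genotype(id),
    transfac_factor TEXT           -- link to a wild type TRANSFAC entry
);
CREATE TABLE phenotype (           -- clinical outcome
    id      INTEGER PRIMARY KEY,
    disease TEXT NOT NULL,
    omim_id TEXT                   -- external link, e.g. to OMIM
);
CREATE TABLE mu_factor_phenotype ( -- many-to-many link table
    mu_factor_id INTEGER REFERENCES mu_factor(id),
    phenotype_id INTEGER REFERENCES phenotype(id),
    PRIMARY KEY (mu_factor_id, phenotype_id)
);
CREATE TABLE diagnostic_method (   -- e.g. PCR conditions, sequencing
    id          INTEGER PRIMARY KEY,
    genotype_id INTEGER REFERENCES genotype(id),
    method      TEXT NOT NULL,
    conditions  TEXT
);
"""


def build_demo_db():
    """Create an in-memory database with one placeholder entry per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Placeholder values only; they are not real PathoDB records.
    conn.execute("INSERT INTO genotype (id, gene, defect) VALUES (?, ?, ?)",
                 (1, "Pit-1 gene", "placeholder defect"))
    conn.execute("INSERT INTO mu_factor (id, name, genotype_id, transfac_factor) "
                 "VALUES (?, ?, ?, ?)", (1, "mutated Pit-1", 1, None))
    conn.execute("INSERT INTO phenotype (id, disease, omim_id) VALUES (?, ?, ?)",
                 (1, "dwarfism", None))
    conn.execute("INSERT INTO mu_factor_phenotype VALUES (?, ?)", (1, 1))
    conn.execute("INSERT INTO diagnostic_method (id, genotype_id, method, conditions) "
                 "VALUES (?, ?, ?, ?)", (1, 1, "PCR", "placeholder conditions"))
    conn.commit()
    return conn


def phenotypes_for_factor(conn, factor_name):
    """Follow the links from a mutated factor to its recorded phenotypes."""
    rows = conn.execute(
        """
        SELECT p.disease
        FROM mu_factor AS f
        JOIN mu_factor_phenotype AS fp ON fp.mu_factor_id = f.id
        JOIN phenotype AS p ON p.id = fp.phenotype_id
        WHERE f.name = ?
        ORDER BY p.disease
        """,
        (factor_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

A call such as `phenotypes_for_factor(build_demo_db(), "mutated Pit-1")` walks the factor-to-phenotype link; mutated sites could be handled by an analogous table pair.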

Status and Perspectives

The database has been established under a relational database management system (DBMS), and the functional development of its basic features is complete.
More than 10450 sets of data have been entered up to now. So far, PathoDB covers some thalassemias, caused by mutations in the regulatory regions of the different globin genes, and early developmental disturbances. The latter include forms of dwarfism caused by mutations or chromosomal rearrangements in the genes for the transcription factors Pit-1 or Prop-1, and neural crest differentiation defects caused by mutations of several Pax genes. A large number of mutated transcription factors that lead to different types of cancer are also covered. The mutated factors (MuFactor) and mutated sites (MuSite), as well as the resulting diseases, covered in PathoDB are listed in the table above.
The range of organisms admitted to PathoDB is planned to be as broad as in TRANSFAC. At the moment, however, only human and murine defects are considered, with special emphasis on human genetic diseases, which are likely to be of greatest interest for medical research, for example in the wide field of oncological diseases.
For initial WWW access we are going to compile an ASCII flat file version. Ultimately, we aim to make the relational PathoDB accessible online.
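
How relational entries might be written out as an ASCII flat file can be sketched as follows. The layout assumed here is an EMBL-style one, with a two-letter tag at the start of each line and `//` closing each record; the format of the planned PathoDB flat file is not specified in this text, and the tags (`AC`, `FA`, `DS`) and values in the sketch are hypothetical.

```python
def to_flat_file(records):
    """Render records as tagged ASCII lines, one '//' terminator per record.

    Each record is a list of (tag, value) pairs. The tags are hypothetical
    and do not reproduce any published PathoDB format.
    """
    lines = []
    for record in records:
        for tag, value in record:
            # Two-letter tag, two spaces, then the field value.
            lines.append(f"{tag}  {value}")
        lines.append("//")
    return "\n".join(lines) + "\n"


# Placeholder entry; not a real PathoDB record.
example = [[("AC", "placeholder-accession"),
            ("FA", "mutated Pit-1"),
            ("DS", "dwarfism")]]
flat_text = to_flat_file(example)
```

A line-oriented layout like this keeps each entry readable in a plain text browser view and simple to parse, at the cost of expressing links between entries only as identifiers.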

References

Heinemeyer, T., Chen, X., Karas, H., Kel, A. E., Kel, O. V., Liebich, I., Meinhardt, T., Reuter, I., Schacherer, F. and Wingender, E. (1999): Expanding the TRANSFAC database towards an expert system of regulatory molecular mechanisms. Nucleic Acids Res. 27, 318-322.