MySQL Administrator Tool (Workbench 6.0 CE), Eclipse, Apache Tomcat 9 server, Java 1.8.
The PDZ protein data and related information such as sequence, organism information, available structures, localization and related literature were obtained from publicly available Uniprot-KB/Swissprot (release 2017_20) database. Uniprot-KB/Swisprot being the largest and up-to-date manually curated database of proteins was the most important data source for PDZscape. The entire PDZ domain containing data was manually downloaded from Uniprot-KB as flat files and the other related information was extracted out in a CSV (Comma Separated Values) file using Perl regular expressions. This CSV file was further used for the creation of tables in a database using MySQL server. Various information on PDZscape proteins such as their Gene information-Gene IDs, structural information-PDB IDs , protein-protein interaction information- STRING IDs  and Pathway information - KEGG IDs were taken from their respective databases. The number and organizations of PDZ domains have also been included separately. Information on PDZ-interacting proteins was obtained from IntAct database  that is the largest molecular interaction network database, which includes all the interacting partners of a protein. Sequences of all PDZ-containing proteins were assimilated to form the source database of PDZ-BLAST (Basic Local alignment Search Tool). The stand-alone BLAST (2.6.0+) [12, 13] feature was downloaded from NCBI website and integrated in the database. For searching and homology scoring functions, default matrices of BLAST [12, 13] were used. Known mutations were extracted from PMDB and some of them were manually curated from literature. The phenotypic implications associated with each of these mutations, if any, have also been included in the PDZscape output. Extensive manual curation was also performed for obtaining information on the association of PDZ containing proteins with different pathological conditions. Currently, more than 300 entries have been curated with the known mutations and their association with various diseases. This information is available for Human PDZ proteins and will be periodically updated. Manual curation was performed using literature search on PubMed and filtering the papers based on proven disease associations from experimental studies. Literature reports for proteins, where only mutations have been reported without any information on associated diseases, have also been included in the database wherever relevant. It has been observed that disease-association in PDZ-containing proteins is not limited to mutations but depends on other factors as well that include over-expression, chromosomal deletion, inhibition etc. These diseases have also been reported in this database. The entire data takes into consideration all PDZ proteins and can be retrieved in ‘.csv’ format from download page.
To facilitate reverse search and find whether the protein of interest is a PDZ-interacting partner, a simple tool, based on sequence similarity and ID mapping has been developed in Java, which takes UniProt ID of a protein as an input and finds whether it is a PDZ-binding protein. In both protein and peptide mode, this tool scans the given sequence for presence of known PDZ-binding motif, which are stored as regular expressions, based on reports from the literature . PBPFinder is a simple tool that first scans the database of known PDZ-binding proteins with given sequence or Uniprot ID and database of known PDZ-binding motifs in order to report the possibility of given protein being a known or putative PDZ-binding protein.