The Fertility Relevance Probability score assigned to single genes is based on binary logistic regression analyses of two datasets (M and H, corresponding to the Murine Phenotypes and Human Phenotypes analyses), conducted in SPSS 23 (IBM) using the “Forward LR” method. The score ranges from 0 to 1. Higher values are intended to reflect greater diagnostic potential of a gene, its transcripts, and its proteins as markers of male fertility impairment.
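As a minimal sketch of why such a score lies between 0 and 1: a fitted binary logistic regression maps a weighted sum of predictor values through the logistic function. The function name, coefficients, intercept, and predictor values below are hypothetical placeholders, not the values fitted in SPSS for this score.

```python
import math

def logistic_probability(predictors, coefficients, intercept):
    """Probability from a fitted binary logistic regression for one gene.

    All arguments are illustrative assumptions; the published model's
    coefficients and retained predictors are not reproduced here.
    """
    if len(predictors) != len(coefficients):
        raise ValueError("one coefficient is needed per predictor")
    # Linear predictor: intercept plus weighted sum of predictor values.
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # Numerically stable logistic function; the result is always in [0, 1].
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)

# Hypothetical example: two predictors with made-up coefficients.
example_score = logistic_probability(
    predictors=[2.0, 2.0], coefficients=[1.0, -0.5], intercept=-1.0
)
print(example_score)  # linear predictor is 0, so the probability is 0.5
```

A linear predictor of 0 gives a probability of 0.5; increasingly positive or negative values push the score towards 1 or 0, respectively.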
In the Murine Phenotypes analysis, genes were categorized as either candidates or non-candidates according to the phenotypes of male knockout mice (Mouse Genome Informatics). In the Human Phenotypes analysis, categorization was based on the available literature (PubMed). The following variables were included in the logistic regressions:
- dN/dS: pairwise dN/dS estimates calculated for 1-to-1 orthologues of human and mouse as downloaded from Ensembl version 86
- Network parameters (node degree, closeness centrality, betweenness centrality): extracted from a human protein-protein interaction (PPI) network generated and analyzed with Cytoscape 3.4.0, using input data from IntAct, APID, MINT, DIP-IMEx, MatrixDB, and InnateDB-IMEx (all integrated by PSICQUIC) as well as BioGRID (version 3.4.141)
- Closeness to candidate genes in the PPI network: minimal shortest path to other candidate proteins and number of directly neighbouring candidate markers
- Expression fold change: calculated from project E-MTAB-2836, contrasting RNA expression in human testis with expression in human brain, heart, and ovary
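The "closeness to candidate genes" variables can be illustrated with a small sketch. It assumes an undirected, unweighted PPI network stored as an adjacency dictionary and uses a breadth-first search; the original analysis was performed in Cytoscape, so this is not the authors' pipeline, and the gene labels and function name are hypothetical.

```python
from collections import deque

def candidate_closeness(adjacency, gene, candidates):
    """Return (minimal shortest path to another candidate, candidate neighbours).

    adjacency: dict mapping each protein to a set of interaction partners
    (assumed undirected and unweighted). The distance is None when no other
    candidate is reachable from `gene`.
    """
    others = set(candidates) - {gene}
    # Number of directly neighbouring candidates.
    direct = sum(1 for n in set(adjacency.get(gene, ())) if n in others)
    # Breadth-first search: the first candidate reached is the nearest one.
    distance = {gene: 0}
    queue = deque([gene])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                if neighbour in others:
                    return distance[neighbour], direct
                queue.append(neighbour)
    return None, direct

# Hypothetical toy network: a chain A-B-D-E with an extra partner C of A.
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}
toy_candidates = {"D", "E"}

print(candidate_closeness(toy_network, "A", toy_candidates))  # (2, 0)
print(candidate_closeness(toy_network, "D", toy_candidates))  # (1, 1)
```

In this toy network, protein A has no candidate among its direct partners and reaches the nearest candidate (D) in two steps, whereas candidate D directly neighbours one other candidate (E).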
Reference: Thomas Greither, Julia Schumacher, Mario Dejung, Hermann M Behre, Hans Zischler, Falk Butter, Holger Herlyn (2020) Fertility relevance probability analysis shortlists genetic markers for male fertility impairment. Cytogenetic and Genome Research. DOI: 10.1159/000511117