PredictDB Data Repository



Welcome

Here you can find transcriptome prediction models for the PrediXcan family of methods: S-PrediXcan, MultiXcan and S-MultiXcan.

.db files are prediction models, usable by all methods. .txt.gz files are compilations of LD reference data for the summary-based methods (those with the S- prefix). S-PrediXcan should be run with the single-tissue LD reference files ("covariances") that match each model. S-MultiXcan uses single-tissue prediction models and a cross-tissue LD reference.
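Before running any of the methods, it can help to look inside the downloaded files. The sketch below is only an illustration and not part of the PrediXcan tools: it assumes that a .db file is a SQLite database and that a .txt.gz file is gzip-compressed text. The function names `describe_model_db` and `peek_ld_reference` are hypothetical helpers, and no particular table or column layout is assumed.

```python
import gzip
import sqlite3


def describe_model_db(db_path):
    """Return {table name: [column names]} for a SQLite file.

    Assumption: the prediction model (.db) is a SQLite database.
    """
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        return {
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
    finally:
        conn.close()


def peek_ld_reference(path, n_lines=3):
    """Return the first n_lines of a gzip-compressed text file.

    Assumption: the LD reference (.txt.gz) is plain text once decompressed.
    """
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for _, line in zip(range(n_lines), handle)]
```

Calling `describe_model_db` on a model and `peek_ld_reference` on its matching covariance file shows the layout of both files, which makes it easier to confirm that the pair belongs together before starting an analysis.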


GTEx v8 models on eQTL and sQTL

We have produced different families of prediction models for sQTL and eQTL, using several prediction strategies, on GTEx v8 release data.

We recommend the MASHR-based models below. Elastic Net-based models are a safe, robust alternative with decreased power.

MASHR-based models

Expression and splicing prediction models with LD reference data are available in this Zenodo repository.

Files:


Warning: these models are based on fine-mapped variants that may occasionally be absent in a typical GWAS, and are frequently absent in older GWAS. We have tools to address this, presented here. A tutorial is available here.
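A quick way to gauge how much this warning affects a given study is to measure what fraction of a model's variants appear in the GWAS. The following is a minimal diagnostic sketch, not the harmonization or imputation tooling referred to above. The function name `variant_coverage` and the variant identifiers are hypothetical, and the sketch assumes both variant lists already use the same identifier scheme.

```python
def variant_coverage(model_variants, gwas_variants):
    """Return (fraction of model variants found in the GWAS, sorted missing variants).

    Assumption: both inputs use the same variant identifier scheme.
    """
    model = set(model_variants)
    if not model:
        return 0.0, []
    present = model & set(gwas_variants)
    missing = sorted(model - present)
    return len(present) / len(model), missing


# Hypothetical identifiers, for illustration only.
coverage, missing = variant_coverage(
    ["rs1", "rs2", "rs3", "rs4"],
    ["rs1", "rs3", "rs9"],
)
print(coverage)  # 0.5
print(missing)   # ['rs2', 'rs4']
```

A low coverage value suggests that the GWAS should be harmonized and imputed before it is used with these models.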


Acknowledging these models: If you use these models in your research, please cite:

If you use S-MultiXcan, we ask you to cite:


Elastic Net

Expression and splicing prediction models with LD reference data are available in this Zenodo repository.

Files:


Acknowledging these models: If you use these models in your research, we ask you to cite:

If you use S-MultiXcan, we ask you to cite:


GTEx v7 Expression models

Expression prediction models with LD reference data are available in this Zenodo repository. The underlying algorithm is Elastic Net.


Additional support information and details are available in the Zenodo repository. You can find release notes at this link.


Acknowledging these models: If you use these models in your research, we ask you to cite:

If you use S-MultiXcan, we ask you to cite:


GTEx v6 Expression models

Expression prediction models with LD reference data are available in this Zenodo repository.


Additional support information and details are available in the Zenodo repository. Please check this link on the GTEx v7 models for a few caveats found in the GTEx v6 models.


Acknowledging these models: If you use these models in your research, we ask you to cite:

If you use S-MultiXcan, we ask you to cite:


Models from collaborators and other sources:

MESA models

Single-tissue expression prediction models with LD reference data are available in this Zenodo repository. The underlying algorithm is Elastic Net, trained on the MESA multi-ethnic cohort.

These models were presented in "Genetic architecture of gene expression traits across diverse populations", Mogil et al, 2018, PLOS Genetics. Please cite if you find these useful.

CommonMind consortium

Single-tissue expression prediction models with LD reference data are available in this GitHub repository. The underlying algorithm is Elastic Net.

These models were presented in "Gene expression imputation across multiple brain regions provides insights into schizophrenia risk", Huckins et al, 2019, Nature Genetics. Please cite if you find these useful.

EpiXcan Models

Expression prediction models with LD reference data are available on this website. The models were trained on data from the CommonMind Consortium, GTEx, and STARNET. The underlying algorithm is Elastic Net, informed by epigenetic data.

These models were presented in "Integrative transcriptome imputation reveals tissue-specific and shared biological mechanisms mediating susceptibility to complex traits", Zhang et al, 2019, Nature Communications. Please cite if you find these useful.

Acknowledgements

GTEx

The Genotype-Tissue Expression (GTEx; Sample size ) project was supported by the Common Fund of the Office of the Director of the National Institutes of Health. All GTEx data were downloaded from the database of Genotypes and Phenotypes (dbGaP).

DGN

Depression Genes and Networks (DGN; 922 whole-blood samples) data were provided by Dr. Douglas F. Levinson. We gratefully acknowledge that these resources were supported by National Institutes of Health/National Institute of Mental Health grants 5RC2MH089916 (PI: Douglas F. Levinson, M.D.; Coinvestigators: Myrna M. Weissman, Ph.D., James B. Potash, M.D., MPH, Daphne Koller, Ph.D., and Alexander E. Urban, Ph.D.) and 3R01MH090941 (Co-investigator: Daphne Koller, Ph.D.).

Funding

This work is supported by R01MH107666 (H.K.I.), K12 CA139160 (H.K.I.), R01 MH101820 (GTEx), P30 DK20595, P60 DK20595 (Diabetes Research and Training Center), P50 DA037844 (Rat Genomics), and P50 MH094267 (Conte).

Mailing List

Please join our Google Group for general discussion, notification of future changes to our tools, feature requests, etc.

Disclaimer

The models are provided "as is", with the hope that they may be of use, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the models or the use or other dealings in the models.