Monday, January 17, 2022

Re: Seminar in Oxford

Dear Mark

Many thanks.

I will be in touch nearer the time with the final arrangements.

Best wishes.

Sara

----- Original Message -----
From: "Mark Gerstein" <mark@gersteinlab.org>
To: "Sara Jolliffe" <Sara.Jolliffe@maths.ox.ac.uk>
Cc: "minary" <Peter.Minary@cs.ox.ac.uk>, "glabstracts mbglab" <glabstracts.mbglab@blogger.com>
Sent: Friday, 14 January, 2022 16:25:16
Subject: Re: Seminar in Oxford

Title:

Data Science Topics Related to Neurogenomics

Abstract:

My seminar will discuss various data-science issues related to
neurogenomics. First, I will focus on classic disorders of the brain,
which affect nearly a fifth of the world's population. Robust
phenotype-genotype associations have been established for several
psychiatric diseases (e.g., schizophrenia, bipolar disorder). However,
understanding their molecular causes is still a challenge. To address
this, the PsychENCODE consortium generated thousands of transcriptome
(bulk and single-cell) datasets from 1,866 individuals. Using these
data, we have developed interpretable machine learning approaches for
deciphering functional genomic elements and linkages in the brain and
psychiatric disorders. Specifically, we developed a deep-learning
model embedding the physical regulatory network to predict phenotype
from genotype. Our model uses a conditional Deep Boltzmann Machine
architecture and introduces lateral connectivity at the visible layer
to embed the biological structure learned from the regulatory network
and QTL linkages. Our model improves disease prediction (6X compared
to additive polygenic risk scores), highlights key genes for
disorders, and imputes missing transcriptome information from genotype
data alone. Next, I will look at the "data exhaust" from this activity
- that is, how one can find things in the genomic analyses other than
what was originally intended. I will focus on genomic privacy,
which is a main stumbling block in tackling problems in large-scale
neurogenomics. In particular, I will look at how the quantifications
of expression levels can reveal something about the subjects studied
and how one can take steps to sanitize the data and protect patient
anonymity. Finally, another stumbling block in neurogenomics is more
accurately and precisely phenotyping the individuals. I will discuss
some preliminary work we've done in digital phenotyping.
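
For readers unfamiliar with the baseline mentioned above, an additive
polygenic risk score is conventionally a weighted sum of allele dosages.
The following is a minimal sketch of that general idea only, not the
model described in the abstract; the function name additive_prs and the
toy dosages and effect sizes are hypothetical values chosen for
illustration.

```python
from typing import Sequence


def additive_prs(dosages: Sequence[float], effect_sizes: Sequence[float]) -> float:
    """Return the additive score: sum over variants of effect size * allele dosage."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(beta * g for beta, g in zip(effect_sizes, dosages))


# Hypothetical toy data: allele dosages (0, 1, or 2 copies) for four variants
# and made-up per-variant effect sizes.
dosages = [0, 1, 2, 1]
effect_sizes = [0.5, -0.25, 0.125, 1.0]
score = additive_prs(dosages, effect_sizes)
```

Because each variant contributes independently, a score of this kind
cannot capture interactions between variants or regulatory structure.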

Tuesday, October 5, 2021

Re: Speaker Invitation: Northwestern Pharmacology Symposium

ABSTRACT

In my talk, I will discuss interpretable machine learning models for
predicting the impact of genomic variants. These models focus on a
variety of different types of variants, from those in protein-coding
regions to those in non-coding regions, from those associated with
particular diseases, such as cancer or schizophrenia, to those having
a high impact in general, and from those involving single nucleotides
to those that are larger (structural variants). In particular, I will
describe physically based models for predicting cancer driver events,
simple statistical models for finding cancer non-coding drivers, and
interpretable deep learning models for mental disease. For the deep
learning models, I will show how the model's architecture relates to
the overall linear process of splicing or the comprehensive cellular
regulatory network. Finally, I will also highlight a general machine
learning approach for assessing the impact of structural variants.

Tuesday, June 15, 2021

Fwd: Next week seminar.

Title:

Approaches to Genomic Privacy: Leakage Measurement, Data Sanitization
& Blockchain Storage

Abstract:

Functional genomics experiments on human subjects present a privacy
conundrum. On the one hand, many of the conclusions we infer from
these experiments are not tied to the identity of individuals but
represent universal statements about biology and disease. On the other
hand, the raw sequencing reads or the phenotypic information inferred
from these experiments can leak information about patients' variants,
presenting privacy challenges in terms of data sharing. There is a
great desire to share data as broadly as possible. Therefore,
measuring the amount of variant information leaked in various
experiments is a key first step in protecting private information. We
propose metrics to quantify private information leakage in functional
genomics data, linking attacks to validate the proposed metrics, and
file formats that maximize the potential for data sharing while
protecting individuals' private information. Blockchain potentially
provides a way of securely storing and accessing genomic information.
We show how this can be done efficiently using various index
structures and how blockchain can be combined with our file formats
for sharing functional genomic information.
==

i0cdc

Thursday, January 14, 2021

Re: [EXT] Abstract for MDACC Hogg seminar series

Title:

Analyzing the non-coding part of the cancer genome

Abstract:

My talk will focus on leveraging thousands of functional genomics
datasets to annotate the cancer genome and perform data mining to
discover cancer-associated regulators and variations.

First, I will go over the ENCODE annotations related to the cancer
genome. I will introduce our computational efforts to perform
large-scale integration to accurately define distal and proximal
regulatory elements (i.e., the MatchedFilter tool). Then I will show
how this extended gene annotation allows us to place oncogenic
transformations in the context of a broad cell space; here, many
normal-to-tumor transitions move towards a stem-like state, while
oncogene knockdowns show an opposing trend.

Next, I will look at our comprehensive regulatory networks of
transcription factors and RNA-binding proteins (TFs and RBPs). I will
showcase their value by highlighting how SUB1, a previously
uncharacterized RBP, drives aberrant tumor expression and amplifies
the well-known oncogenic TF MYC.

Third, I will introduce a workflow to prioritize key elements and
variants. I will showcase the application of this prioritization to
somatic burdening, cancer differential expression, and GWAS (the
LARVA, MOAT & uORF tools). Targeted validations of the prioritized
regulators, elements, and variants demonstrate the value of our
annotation resource.

Finally, I will describe how ENCODE annotations can be applied to the
comprehensive PCAWG mutation dataset. The goal is to determine the
overall burdening of various elements in cancer genomes. I will show
how this correlates with patient survival time and tumor clonality. I
adapt an additive-effects model from complex-trait studies to show
that putative passengers' aggregated effect, including undetected weak
drivers, provides significant additional power (~12% additive
variance) for predicting cancerous phenotypes beyond identified driver
mutations. Furthermore, this framework allows one to estimate
potential weak-driver mutations in samples lacking any
well-characterized driver alterations.

URLs:

http://encodec.encodeproject.org
http://radar.gersteinlab.org
http://MatchedFilter.gersteinlab.org
http://pcawg.gersteinlab.org

==
i0mda21