Anastasia P.

Bioinformatician, biostatistician, mathematician

I am looking for research work in any form. I am ready to learn, dig through papers, and spend days or weeks getting to the bottom of something. If needed, I will work in a wet lab. I want to do analytical work, not just "programming for biology".

Preferred work format: full-time with a flexible schedule, since I am still studying, or half-time if on-site presence is required.

Resume last updated: 09.12.2021
Address: Saint Petersburg, Russian Federation
Email: hidden
Phone: hidden

Experience

Dobzhansky Center for Genome Bioinformatics, SPbU
Research Engineer
Oct 2020 - Apr 2021

Education

Peter the Great St. Petersburg Polytechnic University
Applied Mathematics and Bioinformatics
Sep 2021 - Jun 2023
Saint Petersburg Electrotechnical University "LETI" named after Ulyanov (Lenin)
Applied Mathematics and Computer Science
Sep 2017 - Jun 2021

What are your strengths?

  • Solid mathematical background: algebra, probability & statistics
  • R, Python, SAS
  • Linux
  • Data structures & algorithms
  • Fundamentals of Neural Networks & Machine Learning
  • Bioinformatics technologies (Next Generation Sequencing, Microarray Genotyping, Comparative hybridization, ChIP-Seq, Hi-C, PCR)
  • Pipeline creation with Snakemake
  • Molecular biology fundamentals
  • LaTeX

Tell us something more about yourself: publications, conferences, hobbies

My experience includes:

  • Participation in a project studying genomic integrity in cancer patients (09.2020 - 04.2021)

I was responsible for selecting suitable tools for structural variant detection (Delly, CNVnator, LUMPY, MELT, GRIDSS) and building the corresponding pipelines with Snakemake. I worked with alignments and variant calls using samtools and bcftools. My colleagues and I developed wavelet-based genomic instability metrics. I tried different coverage normalizations to see their impact on the proposed metrics. I also analyzed the Genome in a Bottle papers and the behavior of the metrics on their gold-standard data. In one of my latest tasks I analyzed coverage waveforms for patterns characteristic of certain types of genomic rearrangements, in order to build a database.

  • Teaching at St. Petersburg Electrotechnical University “LETI” (09.2019 - present)

So far I have taught three seminar courses for first- to third-year students: linear algebra, probability, and statistics. The one-semester statistics course covers rigorous point estimation (ML, REML), UMVUE theory and construction, hypothesis testing, distributions of quadratic forms, LRT, the Wald test, UMP tests, confidence ellipsoids, LM, GLM, GLMM, ANOVA, contingency table analysis, multiple testing issues, and the basics of epidemiological study designs.

  • Development of a new, mathematically rigorous method for identifying genomic regions associated with phenotypes, as my Bachelor’s final project. The tool is implemented in R (09.2020 - present)

I had to estimate the correlation matrix of a multivariate normal vector constructed from partitioned chi-squared statistics from a sequence of contingency tables. First, I had to find an orthogonal transformation under which the chi-squared statistic with (n-1)(m-1) df becomes a sum of squares of (n-1)(m-1) iid normal variables. For genetics applications, 2x2 and 3x2 tables were considered. A point estimate of the correlation matrix was obtained, and a method for ensuring that the matrix is positive semidefinite (PSD) was proposed.

  • Analysis of genotyping data in an alcohol dependence study (09.2020 - 05.2021)

Under the supervision of a biologist and a mathematician, I worked with genotyping data from a St. Petersburg cohort of alcohol-dependent subjects. Tasks started with QC and preprocessing, continued with GWAS analysis, and ended with interpreting the results and comparing them with known associations from publicly available GWAS databases (DOI:10.1016/j.euroneuro.2021.08.262).

  • Neural network construction & training

During a machine learning course I used Python and the Keras library to build and train neural networks. I worked with dense, convolutional, and recurrent NNs for the analysis of numeric and text data. Tasks included classification, regression, and text generation.