• Publications
  • Influence
Thousands of samples are needed to generate a robust gene list for predicting outcome in cancer.
TLDR
A previously undescribed mathematical method is introduced, probably approximately correct (PAC) sorting, for evaluating the robustness of reported predictive gene lists, and for several published data sets the number of samples that are needed to achieve any desired level of reproducibility.
Outcome signature genes in breast cancer: is there a unique set?
TLDR
It is shown that the resulting set of genes is not unique; it is strongly influenced by the subset of patients used for gene selection, and the correlations fluctuate strongly when measured over different subsets of patients.
Sorting points into neighborhoods (SPIN): data analysis and visualization by ordering distance matrices
TLDR
A novel unsupervised approach for the organization and visualization of multidimensional data that brings to light the continuous variation of heterogeneity--starting with homogeneous tumor samples and gradually increasing the amount of another tissue.
Outcome signature genes in breast cancer: is there a unique set?
TLDR
It is shown that in fact the resulting set of genes is not unique; it is strongly influenced by the subset of patients used for gene selection, and the correlations fluctuate strongly when measured over different subsets of patients.
Functional gene groups are concentrated within chromosomes, among chromosomes and in the nuclear space of the human genome
TLDR
The human genome shows substantial concentration of functional groups within chromosomes and across chromosomes in space, and members of the same functional group that reside on distinct chromosomes tend to co-localize in space.
Active Learning for BERT: An Empirical Study
TLDR
The results demonstrate that AL can boost BERT performance, especially in the most realistic scenario in which the initial set of labeled examples is created using keyword-based queries, resulting in a biased sample of the minority class.
Gene expression profiles of AML derived stem cells; similarity to hematopoietic stem cells
TLDR
It is shown that DAPT, an inhibitor of gamma-secretase, a protease that is involved in Jagged and Notch signaling, inhibits LSC growth in colony formation assays, supporting the suggestion that the LSC originated within the HSC progenitors.
Corpus Wide Argument Mining - a Working Solution
TLDR
This work presents a first end-to-end high-precision, corpus-wide argument mining system, made possible by combining sentence-level queries over an appropriate indexing of a very large corpus of newspaper articles, with an iterative annotation scheme.
Low autocorrelated multiphase sequences.
The interplay between the ground-state energy of the generalized Bernasconi model to multiphase, and the minimal value of the maximal autocorrelation function, C(max)=max(K)/C(K)/, K=1,...,N-1, is
TR9856: A Multi-word Term Relatedness Benchmark
TLDR
The new TR9856 dataset is proposed which focuses on multi-word terms and is significantly larger than existing datasets, and further handles term ambiguity by providing topical context for all term pairs.
...
1
2
3
4
...