This paper evaluates the variable selection performed by several machine-learning techniques on a myocardial infarction data set. The focus of this work is to determine which of 43 input variables are considered relevant for prediction of myocardial infarction. The algorithms investigated were logistic regression (with stepwise, forward, and backward...
Privacy concerns are among the major barriers to efficient secondary use of information and data on humans. Differential privacy is a relatively recent measure that has received much attention in machine learning as it quantifies individual risk using a strong, cryptographically motivated notion of privacy. At the core of differential privacy lies the...
The problem of disseminating a data set for machine learning while controlling the disclosure of data source identity is described using a commuting diagram of functions. This formalization is used to present and analyze an optimization problem balancing privacy and data utility requirements. The analysis points to the application of a generalization...
One of the fundamental rights of patients is to have their privacy protected by health care organizations, so that information that can be used to identify a particular individual is not used to reveal sensitive patient data such as diagnoses, reasons for ordering tests, test results, etc. A common practice is to remove sensitive data from databases that...
MOTIVATION: Interpretation of classification models derived from gene-expression data is usually not simple, yet it is an important aspect in the analytical process. We investigate the performance of small rule-based classifiers based on fuzzy logic in five datasets that are different in size, laboratory origin and biomedical domain. RESULTS: The...
Monitoring vital signs and locations of certain classes of ambulatory patients can be useful in overcrowded emergency departments and at disaster scenes, both on-site and during transportation. To be useful, such monitoring needs to be portable and low cost, and have minimal adverse impact on emergency personnel, e.g., by not raising an excessive number of...
Data originating from biomedical experiments has provided machine learning researchers with an important source of motivation for developing and evaluating new algorithms. A new wave of algorithmic development has been initiated with the publication of gene expression data derived from microarrays. Microarray data analysis is particularly challenging given...
Differential privacy is a cryptographically motivated definition of privacy that has gained considerable attention in the algorithms, machine-learning and data-mining communities. While there has been an explosion of work on differentially private machine learning algorithms, a major barrier to achieving end-to-end differential privacy in practical machine...