Adaptive n-ary Activation Functions for Probabilistic Boolean Logic

@article{Duersch2022AdaptiveNA,
  title={Adaptive n-ary Activation Functions for Probabilistic Boolean Logic},
  author={Jed A. Duersch and Thomas A. Catanach and Niladri Das},
  journal={ArXiv},
  year={2022},
  volume={abs/2203.08977}
}
Balancing model complexity against the information contained in observed data is the central challenge to learning. In order for complexity-efficient models to exist and be discoverable in high dimensions, we require a computational framework that relates a credible notion of complexity to simple parameter representations. Further, this framework must allow excess complexity to be gradually removed via gradient-based optimization. Our n-ary, or n-argument, activation functions fill this gap by…
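The truncated abstract does not spell out the construction, so the following is only a rough illustrative sketch of the general idea of probabilistic Boolean logic over n arguments: activations treated as independent Bernoulli probabilities, combined by soft AND/OR operators, with a learnable blend that gradient descent can push toward either pure operator. The function names and the mixing parameter alpha are assumptions for illustration, not the paper's actual parameterization.

```python
import numpy as np

def and_n(p):
    """Probabilistic n-ary AND: probability that all inputs are true,
    assuming the inputs are independent probabilities in [0, 1]."""
    return np.prod(p, axis=-1)

def or_n(p):
    """Probabilistic n-ary OR: one minus the probability that all inputs are false."""
    return 1.0 - np.prod(1.0 - p, axis=-1)

def adaptive_gate(p, alpha):
    """Hypothetical adaptive n-ary unit: a convex blend of AND and OR controlled
    by a learnable scalar alpha (the sigmoid keeps the blend weight in [0, 1]).
    Optimizing alpha lets the unit settle on either operator or a mixture."""
    w = 1.0 / (1.0 + np.exp(-alpha))
    return w * and_n(p) + (1.0 - w) * or_n(p)

# Example: a 3-argument unit applied to a small batch of probability vectors.
batch = np.array([[0.9, 0.8, 0.95],
                  [0.2, 0.7, 0.1]])
print(adaptive_gate(batch, alpha=0.0))  # equal mix of AND and OR
```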
