Corpus ID: 211205200

Do We Need Zero Training Loss After Achieving Zero Training Error?

@article{Ishida2020DoWN,
  title={Do We Need Zero Training Loss After Achieving Zero Training Error?},
  author={Takashi Ishida and Ikko Yamane and Tomoya Sakai and Gang Niu and Masashi Sugiyama},
  journal={ArXiv},
  year={2020},
  volume={abs/2002.08709}
}
  • Takashi Ishida, Ikko Yamane, Tomoya Sakai, Gang Niu, Masashi Sugiyama
  • Published 2020
  • Mathematics, Computer Science
  • ArXiv
  • Overparameterized deep networks have the capacity to memorize training data with zero training error. Even after memorization, the training loss continues to approach zero, making the model overconfident and degrading test performance. Since existing regularizers do not directly aim to avoid zero training loss, they often fail to maintain a moderate level of training loss, ending up with a loss that is too small or too large. We propose a direct solution called flooding that intentionally prevents…
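
The abstract describes flooding only at a high level: the training loss is kept from sinking to zero once it reaches a small value. As a concrete illustration, the minimal PyTorch sketch below wraps a standard cross-entropy loss in the flooded objective |loss - b| + b, where b is the flood level. The flood level value, model, and optimizer settings here are illustrative assumptions for this sketch, not the paper's experimental configuration.

```python
import torch
import torch.nn as nn

# Illustrative settings: the flood level value, model, and optimizer below are
# assumptions for this sketch, not the paper's experimental setup.
FLOOD_LEVEL = 0.05

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)


def flooded(loss: torch.Tensor, b: float) -> torch.Tensor:
    """Flooding objective |loss - b| + b: identical to the plain loss while the
    loss is above the flood level b, but the gradient flips sign once the loss
    drops below b, pushing it back up toward b instead of toward zero."""
    return (loss - b).abs() + b


def train_step(x: torch.Tensor, y: torch.Tensor) -> float:
    """One mini-batch update; x, y stand in for a batch from a data loader."""
    optimizer.zero_grad()
    loss = criterion(model(x), y)          # ordinary training loss
    flooded(loss, FLOOD_LEVEL).backward()  # backprop through the flooded loss
    optimizer.step()
    return loss.item()                     # log the unflooded loss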

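```

Note that only the gradient passes through the flooded objective; the value reported for monitoring is the ordinary loss, so training curves remain directly comparable to an unflooded run.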