kernel limitations
NNGP experiment on CIFAR
kernel methods on larger datasets
data samples needed: Damian's work with polynomials
Question: Is there any work showing that if a neural network is a universal approximator, then so are its NNGP and NTK? I think this says it for the NTK, at least for ReLU networks. This might also be related.
The thing about these kernel models is they can’t build up hierarchical features like neural networks do. This matters
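To make the "no hierarchical features" point concrete, here is a rough sketch of the NNGP kernel of a fully connected ReLU network, computed with the standard arc-cosine recursion. Everything named here is my own choice for illustration (the function name `relu_nngp_kernel`, the defaults `sigma_w2=2.0` and `sigma_b2=0.0`, the input scaling by dimension), and it assumes nonzero inputs. The thing to notice is that the kernel is a fixed function of the input Gram matrix: depth changes the function, but nothing in it is fit to data.

```python
import numpy as np

def relu_nngp_kernel(X, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    """NNGP kernel of a depth-`depth` fully connected ReLU network.

    Hypothetical helper for illustration. X has shape (n, d) with nonzero rows.
    sigma_w2 and sigma_b2 are the assumed weight and bias variances.
    """
    X = np.asarray(X, dtype=float)
    # Input-layer kernel: scaled inner products of the raw inputs.
    K = sigma_b2 + sigma_w2 * (X @ X.T) / X.shape[1]
    for _ in range(depth):
        std = np.sqrt(np.diag(K))
        norm = np.outer(std, std)
        # Clip to guard against rounding pushing the cosine outside [-1, 1].
        cos_theta = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_theta)
        # Arc-cosine (ReLU) update: expectation of relu(u) * relu(v)
        # under a zero-mean Gaussian with covariance given by K.
        K = sigma_b2 + sigma_w2 / (2.0 * np.pi) * norm * (
            np.sin(theta) + (np.pi - theta) * cos_theta
        )
    return K

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))
    K = relu_nngp_kernel(X, depth=3)
    # Rotating the inputs leaves the Gram matrix, and hence the kernel, unchanged.
    Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
    print(np.allclose(K, relu_nngp_kernel(X @ Q, depth=3)))
```

The rotation check is the useful part: the kernel only ever sees inner products of the inputs, so each "layer" just reshapes the same similarity measure rather than learning new intermediate features.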