A well-known technique for large-scale kernel methods is the Nyström approximation. Based on a subset of landmarks, it gives a low-rank approximation of the kernel matrix and is known to provide a form of implicit regularization. We will discuss the impact of sampling diverse landmarks for constructing the Nyström approximation in supervised and unsupervised problems. In particular, three methods will be considered: uniform sampling, leverage score sampling, and Determinantal Point Processes (DPPs). The implicit regularization due to the diversity of the landmarks will be made explicit by numerical simulations and analysed further, in the case of DPP sampling, by theoretical results.
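
As a rough illustration of the construction described above, the following sketch builds a Nyström approximation from a set of landmarks and compares two of the three sampling schemes. Everything concrete in it is an assumption rather than something stated in the abstract: the Gaussian kernel, the bandwidth, the synthetic data, the regularization value, the function names, and the use of ridge leverage scores (the diagonal of K(K + nλI)^{-1}) as one common definition of leverage scores. DPP sampling is left out because an exact sampler is more involved than fits here.

```python
import numpy as np


def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Y (assumed kernel)."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def nystrom(K, landmarks):
    """Nystrom approximation K_hat = C W^+ C^T built from the landmark indices."""
    C = K[:, landmarks]                    # n x m block of kernel columns
    W = K[np.ix_(landmarks, landmarks)]    # m x m landmark submatrix
    return C @ np.linalg.pinv(W, hermitian=True) @ C.T


def ridge_leverage_scores(K, reg):
    """Ridge leverage scores diag(K (K + n*reg*I)^{-1}) (one common definition)."""
    n = K.shape[0]
    # K and (K + n*reg*I)^{-1} commute, so solve() gives the same diagonal.
    return np.diag(np.linalg.solve(K + n * reg * np.eye(n), K))


def sample_uniform(n, m, rng):
    """m distinct landmark indices drawn uniformly at random."""
    return rng.choice(n, size=m, replace=False)


def sample_leverage(K, m, reg, rng):
    """m distinct landmark indices drawn proportionally to ridge leverage scores."""
    scores = ridge_leverage_scores(K, reg)
    return rng.choice(K.shape[0], size=m, replace=False, p=scores / scores.sum())


# Synthetic data: all sizes and parameters below are illustrative assumptions.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
K = gaussian_kernel(X, X, bandwidth=1.0)

for name, idx in [
    ("uniform", sample_uniform(200, 20, rng)),
    ("leverage", sample_leverage(K, 20, 1e-2, rng)),
]:
    K_hat = nystrom(K, idx)
    rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
    print(f"{name:>8} sampling: relative Frobenius error = {rel_err:.3f}")
```

The approximation reproduces the kernel exactly on the landmark block, and the residual K - K_hat is positive semidefinite, so the choice of landmarks only affects how much of the remaining spectrum is captured. A single random draw on toy data like this should not be read as evidence for or against any sampling scheme.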