Semi-supervised learning has recently received considerable attention in machine learning. In this paper, we propose a novel semi-supervised algorithm based on diffusion maps for dimensionality reduction, visualization and data representation. Unlike previous work, which uses only geometric information to construct the similarity metric, we introduce a distributional similarity metric that modifies the geometric relationship between samples. This metric is defined using each sample's posterior probability over the labels, which is learned through the Expectation–Maximization (EM) algorithm. The Euclidean distance between points on the intrinsic manifold learned by our method equals the label-dependent “diffusion distance” in the original space, that is, the diffusion distance modified by the distributional similarity metric. Our algorithm preserves the local manifold structure while also separating samples from different classes, thereby facilitating classification. Encouraging experimental results on handwritten digits, Yale faces, UCI data sets and the Weizmann data set show that the algorithm can significantly improve classification accuracy.
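To make the construction concrete, the following is a minimal sketch of a label-aware diffusion map; it is not the implementation used in this paper. Several choices are assumptions made only for illustration: the label posteriors `P` are taken as given (in the paper they are learned with EM), the distributional similarity is taken to be the Bhattacharyya coefficient between posterior rows, the geometric kernel is a Gaussian with a hypothetical bandwidth `sigma`, and the function names `label_aware_kernel` and `diffusion_map` are invented here.

```python
import numpy as np


def label_aware_kernel(X, P, sigma=1.0):
    """Gaussian affinity reweighted by a label-posterior similarity.

    X: (n, d) feature matrix. P: (n, c) rows of label posteriors.
    Assumption: the reweighting is the Bhattacharyya coefficient between
    posterior rows; the paper's exact metric may differ.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W_geo = np.exp(-sq_dists / (2.0 * sigma ** 2))
    S = np.sqrt(P) @ np.sqrt(P).T  # in [0, 1], equals 1 on the diagonal
    return W_geo * S


def diffusion_map(W, n_components=2, t=1):
    """Embed samples using the random walk M = D^{-1} W; t is an integer."""
    d = W.sum(axis=1)
    # Symmetric conjugate of M: same eigenvalues, orthonormal eigenvectors.
    A = W / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(A)
    order = np.argsort(eigvals)[::-1]  # eigh returns ascending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Right eigenvectors of M; the first one is constant and is skipped.
    psi = eigvecs / np.sqrt(d)[:, None]
    return psi[:, 1:n_components + 1] * eigvals[1:n_components + 1] ** t


# Toy data: 12 random points with random two-class posteriors.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
logits = rng.normal(size=(12, 2))
P = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

W = label_aware_kernel(X, P)
embedding = diffusion_map(W, n_components=2, t=2)
```

In this sketch, keeping all n - 1 nontrivial components makes the squared Euclidean distance between two embedded points equal to the squared diffusion distance at time `t`, under the convention that squared differences of transition probabilities are weighted by the inverse node degree. Truncating to fewer components gives an approximation of that distance.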