Automatic perception of facial expressions under scaling differences, pose variations and occlusions would greatly enhance natural human-robot interaction. This research proposes unsupervised automatic facial point detection integrated with regression-based intensity estimation for facial action units (AUs) and emotion clustering to address these challenges. The proposed facial point detector locates 54 facial points in images with occlusions, pose variations and scaling differences using Gabor filtering, BRISK (Binary Robust Invariant Scalable Keypoints), an Iterative Closest Point (ICP) algorithm and fuzzy c-means (FCM) clustering. In particular, to deal effectively with occluded images, ICP is first applied to generate neutral landmarks for the occluded facial elements. FCM is then used to infer the shape of the occluded facial region, taking prior knowledge of the non-occluded facial elements into account. Landmark correlation post-processing is subsequently applied to derive the best-fitting geometry for the occluded facial element, further adjusting the neutral landmarks generated by ICP and reconstructing the occluded facial region. We then estimate the intensities of 18 selected AUs using support vector regression and neural networks, respectively. FCM is subsequently employed to recognize the seven basic emotions as well as neutral expressions, and also shows great potential for detecting compound and previously unseen emotion classes. The overall system is integrated with a humanoid robot, enabling it to handle challenging real-life facial emotion recognition tasks.
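The abstract does not give implementation details of the FCM step that underpins both the occlusion reasoning and the emotion clustering. Purely as an illustration of the technique (not the authors' actual code; the function name, parameters and data here are hypothetical), a minimal NumPy sketch of fuzzy c-means might look like:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means sketch.

    X: (n_samples, n_features) data matrix.
    c: number of clusters; m: fuzzifier (m > 1).
    Returns (centers, U) where U[i, j] is the degree of
    membership of sample i in cluster j (rows sum to 1).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)            # normalize memberships
    for _ in range(max_iter):
        Um = U ** m
        # Cluster centers: membership-weighted means of the data.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Distances from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                    # guard against /0
        # Standard FCM membership update:
        # u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:        # converged
            U = U_new
            break
        U = U_new
    return centers, U

# Hypothetical usage on two synthetic 2-D clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(5.0, 0.3, (20, 2))])
centers, U = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)                        # hard labels if needed
```

The soft membership matrix U, rather than hard labels, is what makes FCM attractive here: an ambiguous or partially occluded face can carry non-trivial membership in several emotion clusters at once.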