Thirty-three snowpack models of varying complexity and purpose were evaluated across a wide range of hydrometeorological and forest canopy conditions at five Northern Hemisphere locations, for up to two winter snow seasons. Modeled estimates of snow water equivalent (SWE) or depth were compared to observations at forest and open sites at each location. Precipitation phase and duration of above-freezing air temperatures are shown to be major influences on divergence and convergence of modeled estimates of the subcanopy snowpack. When models are considered collectively at all locations, comparisons with observations show that it is harder to model SWE at forested sites than at open sites. There is no universal “best” model for all sites or locations, but comparison of the consistency of individual model performances relative to one another at different sites shows that there is less consistency at forest sites than at open sites, and even less consistency between forest and open sites in the same year. Good performance by a model at a forest site is therefore unlikely to imply good performance by the same model at an open site (and vice versa). Models calibrated at forest sites have lower errors than uncalibrated models at three out of four locations. However, the benefits of calibration do not carry over to subsequent years, and benefits gained by calibrating models for forest snow processes do not carry over to open conditions.