Investigating how emotional perception differs across residential environments, and the mechanisms behind those differences, is essential for improving quality of life. Using multimodal Weibo user-generated content (UGC; text and images) from the main urban area of Fuzhou, this study examines disparities in emotional perception, and the environmental factors that shape them, among residents of different residential settings. By integrating LSTM-CNN sentiment analysis, XGBoost, SHAP, and ArcMap, we reveal the nonlinear effects of residential environments on emotions and how these effects vary between settings. The findings indicate: ① Positive emotions account for 68% in formal residential areas versus 83% in urban villages. Formal areas exhibit distinct spatial clustering of emotions with gradual transitions between hotspots and coldspots, whereas urban villages show diffusion patterns, reflecting strong boundary effects and spatial heterogeneity. ② For urban village residents, the key indicators with positive effects are paving degree, color complexity, and openness, whereas residents of formal residential areas are positively influenced by paving degree, visual entropy, spatial vitality, and enclosure. Paving degree and openness show the most pronounced differences in impact between the two environments. ③ Excessively high green visibility may be associated with negative emotions. In urban villages, color complexity, visual entropy, and green visibility can mitigate the negative emotions induced by enclosed spaces. In formal areas, the interaction between spatial vitality and enclosure yields more positive outcomes. This study provides a data foundation for inventory renewal planning in complex residential settings, offers insights into differences in emotional perception and the environmental mechanisms behind them, and proposes sustainable planning recommendations for improving the quality of such environments.