Sunday, November 12, 2017

SharePoint alerts stop working when you add a new web front-end server to the farm


SharePoint alerts are a very useful user notification service. They rely heavily on the outgoing email configuration. Many online articles explain how to set up outgoing email and troubleshoot alert issues. Today, I would like to share a different scenario I ran into when alerts were not working properly.

We have a SharePoint farm, and alerts had been working properly for years. Over the past few months, we were in the process of adding a new web front-end server to replace the current one, not to do load balancing (that decision is another story).

It took some time to install and configure the new server and get it ready to join the farm. After I ran the configuration wizard, I noticed that some of the immediate alerts stopped working. However, I still received daily and weekly summary alerts, and emails sent by workflows were fine.

Obviously, the outgoing email service and configuration were working, otherwise I would not have received any email from SharePoint. Alerts had not stopped entirely; only immediate alerts were affected. Still, the timing suggested that the problem had something to do with the new server.

When I checked the server event logs, I did see errors saying "cannot connect to the SMTP mail server". At the time I saw these errors, the new server was not yet open to end users, and I was the only one reaching SharePoint through it, using a local hosts file entry. The incoming email service was running fine on the old web front-end server, and I had not installed the SMTP components on the new server. Only one service was running on the new server: the SharePoint Foundation Web Application service, which allows users to access SharePoint sites and content.
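A quick way to confirm an error like this is to try an SMTP connection from the server in question. The following is only a minimal sketch in Python, not something I ran at the time; the function name can_reach_smtp and the host mail.example.com mentioned afterwards are made up for illustration, and port 25 is an assumption about how the email host listens.

```python
import smtplib


def can_reach_smtp(host, port=25, timeout=5.0):
    """Try an SMTP connection from this machine and return (ok, detail).

    `host` and `port` are placeholders for your own email host.
    """
    try:
        # The constructor connects and raises if the TCP connection fails
        # or if the server does not greet with a 220 reply.
        with smtplib.SMTP(host, port, timeout=timeout) as smtp:
            code, _ = smtp.noop()  # harmless command; 250 means the server answered
            return (code == 250, f"NOOP reply code {code}")
    except (OSError, smtplib.SMTPException) as exc:
        # Covers refused connections, timeouts, DNS failures and SMTP errors.
        return (False, f"{type(exc).__name__}: {exc}")
```

Running something like can_reach_smtp("mail.example.com") on each server in the farm and comparing the results would show which servers can open a connection to the email host at all. It does not tell you whether the host will relay mail for that server.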

No changes had been made to the email, Exchange, or alerts configuration, and alerts had been running fine for years from the existing web front-end server. I knew the new server could not connect to the email host, but the old one still could, so why did alerts stop working?

After some research, I thought perhaps SharePoint needed every server on the same page, otherwise the service would stop working even though one of the servers in the farm was capable of doing the job. To test this possibility, I added the new server to the Exchange anonymous relay list. I had added the old web front-end server to that list when I established the farm. A few minutes after the new server was added, immediate alerts started working. The email host connection errors were gone and everything went back to normal.
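If you want to check the relay permission from a server before and after changing the relay list, one option is to start an SMTP transaction without actually sending a message. Again, this is only a sketch under assumptions: the function name relay_accepts and the addresses in the usage note are hypothetical, port 25 is assumed, and I am assuming the email host refuses a server that is not allowed to relay by rejecting the recipient, which may differ depending on how your host is configured.

```python
import smtplib


def relay_accepts(host, sender, recipient, port=25, timeout=5.0):
    """Ask the mail host whether it would accept mail from this machine.

    Sends EHLO/HELO, MAIL FROM and RCPT TO, then RSET, so no message is
    delivered. Returns (accepted, detail).
    """
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as smtp:
            smtp.ehlo_or_helo_if_needed()
            code, reply = smtp.mail(sender)
            if code != 250:
                text = reply.decode(errors="replace")
                return (False, f"MAIL FROM rejected: {code} {text}")
            code, reply = smtp.rcpt(recipient)
            smtp.rset()  # abandon the transaction; nothing is sent
            text = reply.decode(errors="replace")
            # 250 and 251 are the success replies for RCPT TO.
            return (code in (250, 251), f"RCPT TO reply: {code} {text}")
    except (OSError, smtplib.SMTPException) as exc:
        return (False, f"{type(exc).__name__}: {exc}")
```

For example, relay_accepts("mail.example.com", "sharepoint@example.com", "someone@example.com") run on the new server should return False before the server is on the relay list and True afterwards, if the host behaves as assumed above.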

So, it had not occurred to me that SharePoint works like a gang: you don't go and we don't go. :) Even though other servers could handle alerts, SharePoint did not want to leave one of them behind and decided not to function as it should.

Lessons learned.