Recently, after installing a SharePoint 2016 farm (to learn more about installing a SharePoint 2016 farm in an automated way using PowerShell, click here and here), I was configuring the Search Service Application in a customer’s farm and ran into a crawl error. The error occurred while crawling the content of one of the web applications, which is configured to use a FQDN URL over https, for example https://intranet.contoso.com. This web application uses a Content Database that was migrated from SharePoint 2010 to SharePoint 2016.
The error thrown during the Full Crawl of the Content Source was:
“The crawler could not communicate with the server. Check that the server is available and that the firewall access is configured correctly. If the repository was temporarily unavailable, an incremental crawl will fix this error.”
When I saw this message, my first thought was that the Default Content Access Account lacked permissions. However, the account was properly configured with Full Read permissions on the web application, so this was not the problem.
My next thought was that this could be a firewall problem, with some of the ports required by the Search Service Application to crawl the web application content not being open in the firewall (for a complete list of the necessary ports in a SharePoint 2016 installation, click here or here). However, after the firewall was properly configured, the error remained.
After some searching on the Internet, I tried the following approaches, all without success:
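A quick way to rule out a blocked port from the crawl server is a TCP connectivity test. This is only a minimal sketch: it assumes the example URL from this post, port 443 for https, and a server where the built-in Test-NetConnection cmdlet is available (Windows Server 2012 R2 or later).

```powershell
# Assumption: run on the server that hosts the crawl component.
# The host name and port are the example values used in this post.
$result = Test-NetConnection -ComputerName 'intranet.contoso.com' -Port 443

# TcpTestSucceeded is True when the TCP handshake on that port completed.
$result.TcpTestSucceeded
```

A successful test only shows that the port is reachable. It says nothing about authentication or name resolution inside the crawler itself.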
- Disabling the loopback check on the server
- Disabling https for the web application
- Creating a new empty web application with a FQDN URL and trying to crawl its contents
- Resetting the search index
- Deleting and recreating the content source
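For reference, the first item in the list above (disabling the loopback check) is usually done through a registry value. The snippet below is a sketch of that step only, assuming an elevated PowerShell session on the crawl server. A restart may be needed before the change takes effect.

```powershell
# Assumption: elevated PowerShell session on the server doing the crawling.
$lsaKey = 'HKLM:\SYSTEM\CurrentControlSet\Control\Lsa'

# Create (or overwrite) the DWORD value that disables the loopback check.
New-ItemProperty -Path $lsaKey -Name 'DisableLoopbackCheck' -Value 1 -PropertyType DWord -Force | Out-Null

# Read the value back to confirm it was written.
(Get-ItemProperty -Path $lsaKey -Name 'DisableLoopbackCheck').DisableLoopbackCheck
```

In my case this change made no difference to the crawl error.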
After all the failed attempts, I was able to solve the problem by extending the web application to a non-FQDN URL (for example http://intranet) and configuring this URL as the start address in the Content Source in the Search Service Application. This URL is internal (the host is only configured in the server’s hosts file and is not configured in the DNS) and is being used for crawling purposes only.
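The host entry and the start address change can also be scripted. What follows is a rough sketch rather than the exact commands I ran: the IP address 127.0.0.1 assumes the crawl server also hosts the web application, the content source name 'Local SharePoint sites' is the SharePoint default and may differ in your farm, and the farm is assumed to have a single Search Service Application. Extending the web application itself is not shown.

```powershell
# Assumption: elevated SharePoint Management Shell on the crawl server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Make the internal host name resolvable on this server only.
# 127.0.0.1 is an assumption; use the web front end's IP if it is another machine.
$hostsFile = Join-Path $env:windir 'System32\drivers\etc\hosts'
Add-Content -Path $hostsFile -Value "127.0.0.1`tintranet"

# Assumption: exactly one Search Service Application exists in the farm.
$ssa = Get-SPEnterpriseSearchServiceApplication

# 'Local SharePoint sites' is the default content source name (assumption).
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity 'Local SharePoint sites'

# Note: -StartAddresses replaces the existing list of start addresses.
Set-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $cs -StartAddresses 'http://intranet'

# Re-read the content source and start a full crawl against the new address.
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity 'Local SharePoint sites'
$cs.StartFullCrawl()
```

Because -StartAddresses overwrites the current value, include every address the content source should keep, separated by commas.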
Important detail: For the search results to be correctly presented, the newly added URL was configured in the Default Zone in Alternate Access Mappings (AAM) in Central Administration.
The goal here is to allow users to access the SharePoint Portal through the FQDN URL (for example, https://intranet.contoso.com) and for the search results to include this same URL.
If I had configured the non-FQDN URL in a non-Default zone in AAM, the search results would always include the non-FQDN URL instead of the FQDN URL that users use to access the SharePoint Portal. To learn all about using the Default Zone to crawl content and AAM, please read the following two fantastic articles by Brian Pendergrass:
- Beware crawling the non-Default zone for a SharePoint 2013 Web Application
- Alternate Access Mappings (AAMs) *Explained
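To check which URL currently sits in which zone, the Alternate Access Mappings can be listed from PowerShell. This is a small sketch that assumes the example URL from this post. It only reads the mappings and changes nothing.

```powershell
# Assumption: SharePoint Management Shell on a farm server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# List every alternate access mapping of the web application, with its zone.
Get-SPAlternateURL -WebApplication 'https://intranet.contoso.com'

# Show only the Default zone, which is the one the crawler should use.
Get-SPAlternateURL -WebApplication 'https://intranet.contoso.com' -Zone Default
```

With the setup described above, the Default zone entry should be the non-FQDN crawl URL, and the FQDN URL should appear in another zone.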
In one of my development environments, I was able to configure a FQDN URL as the start address of a Content Source in the Search Service Application without this problem. My guess is that this behavior is caused by some infrastructure configuration in the customer’s environment that I have not yet been able to identify.
If I find a solution that allows a FQDN URL to be configured as the start address in the Content Source in the Search Service Application, I will update this post.
Hope this helps!
PS: New search capabilities have been introduced in SharePoint 2019. To learn more about them and all the upcoming features in SharePoint 2019, click here.