Tuesday, September 18, 2012

Building a Search Structure


By EPC Group's Sr. SharePoint Architect Timothy Calunod  


If there is one thing I have learned about SharePoint over the years, it is that features and configurations come and go so constantly that it is difficult to predict exactly what has changed since a previous version until you attempt it. And while some things are exactly the same, some things are just different enough to make you think through an issue or two a bit further. In our case, recreating the targeted Search solution created in the 2007 version was not as turnkey as it would have seemed. However, because many of the changes to the platform are very beneficial, these differences can be accepted, even if they are not necessarily welcomed.

In my previous post, I spoke about a simple single Site Collection with multiple sites providing a focused Search solution targeted to a Document Center, and our goal was to recreate this solution. Several changes make recreating such a solution a bit more challenging, including:
  • The discontinuation of the Collaboration Portal
  • Changes to the Search Center Site template
  • Changes to the Search settings for a Site Collection
  • Available Features activated in a Site Collection
  • Changes to the Search Architecture
However, the solution is not completely lost; it simply requires a few more configuration steps that happened automatically in 2007. We will walk through building the content Sites and the Search structure in this post.

Setup and Configuration

Our solution begins with a simple single Web Application and a single Site Collection. This Site Collection will serve as the root of the Web Application, and will contain four sites to emulate the missing Collaboration Portal definition that would have contained all of these sites. The Site Collection will begin with a Team Site as the Top-Level Site, with the Enterprise Search Center, Document Center and Publishing Site with Workflow as immediate Subsites. There were other sites in the original solution, but these three Subsites will provide us enough variance to create and test the solution. After the Web Application, Site Collection and Subsites are created, we need to activate a few Features, including:
  • Publishing Infrastructure (Site Collection): This is to support the Search Center since it uses Publishing Features
  • Publishing Workflow (Site Collection): This is for the Publishing Site with Workflows for publishing news articles
Without the Publishing Infrastructure Feature activated, the Enterprise Search Center cannot be created, even though its Site Template still appears among the template choices.

TargetedSearch-EntSearchCreate-Error

To support the full functionality of the Document Center, the Publishing Infrastructure and Document ID Features are also necessary, although they are not vital to completing the initial solution. The Publishing Workflow Feature allows the Publishing Site to recreate the News Site publishing.
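The Feature dependencies described above can be summarized in a small lookup. This is only a sketch of the relationships stated in this post: the dictionaries and the `check_template` function are hypothetical names for illustration and are not part of any SharePoint API.

```python
# Hypothetical model of the Feature dependencies described in this post.
# None of these names come from the SharePoint object model.

# Features without which the template cannot be created at all.
REQUIRED = {
    "Enterprise Search Center": {"Publishing Infrastructure"},
}

# Features that support full functionality but are not vital.
RECOMMENDED = {
    "Document Center": {"Publishing Infrastructure", "Document ID"},
    "Publishing Site with Workflow": {"Publishing Workflow"},
}


def check_template(template, active_features):
    """Return (can_create, missing_required, missing_recommended)."""
    active = set(active_features)
    missing_required = REQUIRED.get(template, set()) - active
    missing_recommended = RECOMMENDED.get(template, set()) - active
    return (
        not missing_required,
        sorted(missing_required),
        sorted(missing_recommended),
    )
```

Calling `check_template("Enterprise Search Center", [])` reports that the site cannot be created until Publishing Infrastructure is active, which mirrors the error shown above.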

We could have used the Publishing Portal for our solution, since this would have included a Basic Search Center, a Press Release Publishing Site and a Publishing Top-Level Site, but there are several reasons to prefer the Team Site solution here. First, the Basic Search Center will not meet our requirements as it is not very customizable. Second, the Publishing Portal Top-Level Site is not as Collaboration-friendly as the Collaboration Portal Home Site. Third, the nightandday.master used in the Publishing Portal is pre-branded and would not make a good generic or specific company portal. However, the Basic Search Center is the main issue.

Search Site Changes

There are three Search Center Site Templates available for use: Basic Search Center, Enterprise Search Center, and FAST Search Center. These replace the Search Center and Search Center with Tabs Site Templates that were available in the past. A quick evaluation will help us analyze the usefulness of each one.

Search Center Template: Basic Search Center
  • Configuration: Similar to Team Site, may have Tabs lists for query page and results page if Publishing activated
  • Application: Simple Search, beyond SPF Search but limited to single page
  • Limitation: Limited customization, will not work for targeted solution

Search Center Template: Enterprise Search Center
  • Configuration: Similar to Publishing site, will have Tabs lists for query page and results page
  • Application: Customizable Search, supports tabular control for customized Search pages and Results Pages
  • Limitation: Requires additional configuration to build an adequate targeted solution

Search Center Template: FAST Search Center
  • Configuration: Similar to Enterprise Search Center
  • Application: As Enterprise Search but used with FAST Search Server
  • Limitation: Requires FAST Search Server

The Basic Search Center is targeted for a simple customizable interface since it can only use the single page provided at the root of the Site. Much like a Team Site, the Basic Search Center does not use Publishing and thus does not use a Pages Library to store and manage Search Pages. Even if Publishing is activated on the Basic Search Center, the custom Web Pages are not used, although two lists, Tabs in Search Pages and Tabs in Search Results, will become available. However, these cannot be used to create a Search Center with Tabs solution as it functioned in 2007.

TargetedSearch-BasicSiteTemplate-Content

The FAST Search Center, while as customizable as the Enterprise Search Center, works with FAST Search Server and cannot be used unless FAST is available and configured. Since we are not configuring FAST in our environment, the FAST Search Center cannot be used, and since the Basic Search Center is very limited, we cannot use this either. Thus, we will focus our efforts on the Enterprise Search Center.

The Enterprise Search Center supports a key function to provide our targeted solution: the tabular control. This control allows a designer or administrator to create a web page and link an in-browser tab to the page, forming the tabs as choices in the web page for both a customized Search page as well as a customized Results page that are attached to the tabs. This becomes important later as the ability to create a customized Search and Results page allows us to focus the search to the Document Center as well as display the results just for the Document Center.
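One way to picture the relationship between tabs and pages is as a simple lookup from a tab to its Search page and Results page. The sketch below is purely illustrative: the `SearchTab` class, the tab names and the page names are assumptions made for this example and do not reflect how SharePoint stores its Tabs lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchTab:
    name: str          # label shown on the tab
    search_page: str   # page the tab opens for entering a query
    results_page: str  # page that renders results for that tab


# Illustrative entries only; tab and page names are assumptions.
TABS = [
    SearchTab("All Sites", "default.aspx", "results.aspx"),
    SearchTab("Documents", "documents.aspx", "documentresults.aspx"),
]


def results_page_for(tab_name, tabs=TABS):
    """Return the results page attached to the named tab."""
    for tab in tabs:
        if tab.name == tab_name:
            return tab.results_page
    raise KeyError(tab_name)
```

In this picture, a dedicated "Documents" tab is what lets the Document Center have its own Search page and its own Results page, separate from the general search.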

By default, a Site Collection uses SharePoint Foundation Search and the OSSSearchResults.aspx page for Search results, which means that to use the Search Center, the Site Collection must be configured to point to it. In 2007, this was automatically configured for the local Search Center in the Collaboration Portal, but because the Collaboration Portal is not available in 2010, the Search Center and the Search Results page must be configured manually before the Search Center can be used.
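The fallback behavior described here can be sketched as a small function. This is a hedged illustration rather than SharePoint's actual logic: the `results_url` function, the `/_layouts/` location of OSSSearchResults.aspx, the `Pages/results.aspx` page in the Search Center, and the `http://portal` addresses used with it are all assumptions for the example.

```python
# Assumed location of the default results page within a site.
DEFAULT_RESULTS_PAGE = "/_layouts/OSSSearchResults.aspx"


def results_url(site_url, search_center_url=None,
                results_page="Pages/results.aspx"):
    """Pick the results page a query from site_url would be sent to.

    If no Search Center is configured for the Site Collection, fall
    back to the site's own OSSSearchResults.aspx page.
    """
    if search_center_url:
        return search_center_url.rstrip("/") + "/" + results_page
    return site_url.rstrip("/") + DEFAULT_RESULTS_PAGE
```

For example, `results_url("http://portal")` stays on the local default page, while `results_url("http://portal", "http://portal/search")` sends the query to the Search Center's results page.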

TargetedSearch-SearchSettingsConfiguration

This configuration, however, is identical to the 2007 configuration, and thus a custom, specific Search Center can be created outside the Site Collection if desired. Once we have our Search Center configured, the Search Service Application can be configured.

Search Service Application Overview

Search, amongst other advanced services offered in SharePoint 2007, was once a component of the Shared Service Provider (SSP), which allowed one or more Web Applications to be associated with it to consume all services configured for that SSP. However, a Web Application could be associated with only one SSP, and thus a subset of services, such as Search and User Profiles only, or a grouping of similar services, such as two Search services indexing different content, could not be configured. With the uncoupling of Services from the SSP, each Service can now be associated with multiple Web Applications, and multiple Services of the same type can be associated with the same Web Application.

This allows for scenarios mentioned previously to create a better offering of services. The Service Application architecture allows the administrator to decide which Services, as well as how many of the same type, can be deployed and associated with one or more Web Applications in the Farm. Additionally, some services, such as Search, can be published across Farms and thus be provided to other Farms. Because of this architectural change, the setup and configuration for the Search Service Application (SSA) has also become more complex.
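The difference between the two association models can be sketched with two small classes. This is a conceptual model only: the class names, the service names and the `portal` Web Application are hypothetical and do not correspond to SharePoint objects.

```python
from dataclasses import dataclass, field


@dataclass
class WebApp2007:
    name: str
    ssp: str  # exactly one Shared Service Provider, with all its services


@dataclass
class WebApp2010:
    name: str
    # (service type, service application name) pairs
    service_apps: list = field(default_factory=list)

    def associate(self, service_type, service_name):
        """Associate one more Service Application with this Web Application."""
        self.service_apps.append((service_type, service_name))

    def services_of_type(self, service_type):
        return [n for t, n in self.service_apps if t == service_type]


# A 2010 Web Application consuming two Search services plus User Profiles.
portal = WebApp2010("Portal")
portal.associate("Search", "Search - Documents")
portal.associate("Search", "Search - News")
portal.associate("User Profiles", "Profiles")
```

The 2007 class has room for a single SSP, while the 2010 class can hold any mix of Service Applications, including two of the same type.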

An SSA requires the two key server-based components that have been part of Search since the first iteration of SharePoint: an Indexing Server and a Search Server. In 2010, as the Indexing Server no longer contains the Index but actually propagates it to the Query Server, it is now a Crawl Component, which links the physical server to a Crawl Database as one component of the Search architecture. The Search Server stores the Index in partitions, and also connects to a Property Store, and thus becomes a Query Component. At a minimum, both components are necessary to implement Enterprise Search.

In our scenario, the physical architecture is not as vital as the SSA or the Search Center, and thus we will keep the discussion related to deployment of Crawl and Query Components to a minimum, implementing what is necessary to get the Crawl Component to build the Index Partition and allow the Query Component to run queries and display results.

Following good practice, the SSA has not yet been created, since having it created automatically would have required running the Farm Configuration Wizard, which limits the administrator's control over the Application Pools and Content Databases. Thus, setting up the SSA requires a few steps before we can begin crawling and querying content.

1) Start the Search Services

We will need two Search-based Services started: the SharePoint Server Search Service and the Search Query and Site Settings Service.

TargetedSearch-SearchServices

These two services will provide the crawling functions and the querying and administration functions.

2) Create the Search Service Application
When creating the SSA, aside from the basics of the name and whether FAST will be used, three specific options need to be configured:
TargetedSearch-SSAConfiguration
  • The Search Service Account – This account will be used in the Application Pool ID for network services and, by default, will also become the Default Content Access Account
  • The Search Admin Web Service – This will create the Application Pool used to manage the Administrative Web Service, which manages the crawling of the Search component
  • The Search Query and Site Settings Web Service – This will create the Application Pool used to manage the querying and load balancing of the Search component
In general, the Default Content Access Account should be different from the Search Service Account, since it is the account that accesses content during the crawling process; it can be changed after the SSA is created. Once the configuration for the SSA is in place, the SSA will be built.
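The options above, and the way the Default Content Access Account falls back to the Search Service Account until it is changed, can be sketched as a small settings object. The `SsaSettings` class, its field names and the account names are hypothetical and exist only to illustrate the relationship; they are not SharePoint cmdlets or API types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SsaSettings:
    name: str
    search_service_account: str
    admin_app_pool: str   # Application Pool for the Search Admin Web Service
    query_app_pool: str   # Application Pool for Query and Site Settings
    use_fast: bool = False
    content_access_account: Optional[str] = None  # may be set after creation

    def effective_content_access_account(self):
        """Account used for crawling: the explicit one if set,
        otherwise the Search Service Account (the default)."""
        return self.content_access_account or self.search_service_account


# Hypothetical names, for illustration only.
ssa = SsaSettings(
    name="Search Service Application",
    search_service_account="DOMAIN\\svc_search",
    admin_app_pool="SearchAdminPool",
    query_app_pool="SearchQueryPool",
)
```

Setting `ssa.content_access_account` after creation corresponds to changing the Default Content Access Account once the SSA exists.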

3) The Topology is constructed

As the SSA is configured and created, the initial Topology for the SSA will also be generated, including the two main components for Search.

The Crawl Component will have its Crawl Database created and will associate the initial Server as the Crawl Server. Additional Crawl Databases and Crawl Servers may be created and configured as needed. A temporary location used while building and updating the Index Partition will also be configured.

The Query Component will have the Property Database created and will associate the initial Server as the Query Server. The Index Partition will also be generated at this time.

Additionally, the Administration Component, which hosts the SSA Administration interface, will also be generated and will be displayed when the administrator clicks links to manage the SSA. A database to store content for this Site will also be created and connected to the Administration Component.

TargetedSearch-SearchTopology

The SSA infrastructure is now complete. It will be used to crawl and query the content structure we created earlier. However, we have additional configurations on both the back-end and front-end yet to complete.
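As a recap of this step, the initial Topology can be modeled as a simple structure with a check that the minimum pieces are present. This is a conceptual sketch: the `Topology` class, the `missing_pieces` function, and the server and database names are assumptions for illustration, not SharePoint objects.

```python
from dataclasses import dataclass


@dataclass
class Topology:
    crawl_servers: list
    crawl_databases: list
    query_servers: list
    property_databases: list
    index_partitions: int
    admin_database: str


def missing_pieces(t):
    """List the parts of a minimal Search topology that are absent."""
    checks = {
        "crawl component": bool(t.crawl_servers and t.crawl_databases),
        "query component": bool(t.query_servers and t.property_databases),
        "index partition": t.index_partitions >= 1,
        "administration component": bool(t.admin_database),
    }
    return [name for name, ok in checks.items() if not ok]


# Single-server topology like the one generated with the SSA
# (server and database names are hypothetical).
initial = Topology(
    crawl_servers=["SERVER1"],
    crawl_databases=["Search_CrawlStore"],
    query_servers=["SERVER1"],
    property_databases=["Search_PropertyStore"],
    index_partitions=1,
    admin_database="Search_Admin",
)
```

For the single-server case above, `missing_pieces(initial)` returns an empty list; removing the Query Server would report the query component as missing.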

Observation

Many things have changed for both the collaboration structure and the Search Service and Components. Because of the deprecation of some features and technologies, coupled with newer features and functions, recreating this solution was actually more difficult than it would have seemed. However, there is much more that needs to be completed, including Content Sources and configuration of the Search Center itself. This will be covered in an upcoming post.