Tuesday, September 18, 2012

Examining a Targeted Search Solution

By EPC Group's Sr. SharePoint Architect Timothy Calunod  


A while ago I met a nice fellow at a SPUG meeting who was profoundly interested in improving his organization with technology, and SharePoint was one of the key tools he had implemented. He wanted to build a useful, fairly intuitive environment for managers and other users alike, and had a challenge with Enterprise Search. I should say his challenge was not getting Search up and running, but focusing it in ways that would be useful for targeted, specific content search and display. In the time before the SPUG meeting I worked out with him what he would need to do, and I found it a very eye-opening experience as to how far Search could actually go in SharePoint. In 2001 and 2003 Search was pretty limited, but beginning in 2007 there were more configuration and change points, which made a customized Search application a workable reality.
That was 2007, but here in SharePoint 2010, Search has become so much more. In the interest of plumbing the depths of SharePoint Search in 2010, I wanted to see not only whether it was possible to recreate this application, but also whether SharePoint 2010 had additional capabilities that could make that search experience even better.


The specific example we will use in our scenario came from this IT Pro's need to create a targeted application for a very specific group of content. That content centered on limited release products and services, along with a changing base product catalog that had to be referenced on occasion when building new limited releases and new product catalogs. The concern was that out of the box Search would index everything, and thus a targeted search for this particular body of content would display everything. He wanted specific results returned from only that group of focused content. He already had a Search Center in his Collaboration Portal, so he wanted to modify what already existed into something more focused.
To recreate and possibly improve upon that, we need to construct a similar architecture while weighing the limitations of SharePoint 2007 against what we can reconstruct and improve upon in SharePoint 2010. With the removal of the Collaboration Portal from SharePoint 2010, a Search Center must be added to any Site Collection to resemble what the previous Portal Template would have had. However, there are limitations to using Site Collection-specific Search Centers, including the limitation of Keywords and Best Bets to that Site Collection. Thus any Keyword or Best Bet that would be useful across the Enterprise has to be recreated in each Site Collection. The Keyword/Best Bet configuration is but one limitation of the Site Collection-bound Search Center as a Subsite. Our focus here will be less on those limitations than on how we constructed the solution and, after recreating it, on how we can improve upon it with newer practices and SharePoint 2010.

The lack of a Collaboration Portal does not limit what we can configure. In fact, to recreate this we simply need a Site Collection with a Search Center Subsite. Since our only choice for multiple sites in a Site Collection from a Top-Level Site template is the Publishing Portal, we could begin with that, but since most Administrators would likely use a Team Site as the Top-Level Site instead, we will begin our reconstruction from there. The Collaboration Portal also held News, Site Directory and Document Center Subsites, and in our scenario the Document Center was actually the main storage area for this special focus content. Thus we will also build a Document Center in our recreation. The Site Directory is not as useful in this scenario, but a News Site helps differentiate the focused content from content in other Subsites that should not show in the content search, so we will create a simple Publishing Subsite for the News Subsite.

The Shared Service Provider (SSP) that hosted the Search component in 2007 has now been replaced by the Search Service Application (SSA), which will provide the crawl and query components to run our content searches. A separate SA for Search already has advantages over the SSP, and it also means we can look to other possibilities for improvements that perhaps we could not achieve in 2007. Additionally, the Search Center Site template has changed, with additional Web Parts, and may be even more configurable now. As for the Document Center, the 2007 template was configured with whatever Document Management features were available at that time, while the new version of the template takes advantage of several new features that can make the Document Center an even more useful Library-based Site. Thus even the Document Center should be examined to discover any additional benefits here in 2010.

Enterprise Search – Revised, Revitalized

Enterprise Search has been growing and changing since the first version of SharePoint and, of course, since its roots in Site Server, even before SharePoint became the business standard. Thanks to the acquisition of FAST, Enterprise Search has reached new levels, becoming increasingly robust in larger enterprise environments as well as much more customizable in any environment. Yet the core functionality has remained stable and useful: create and configure a Search Application, target Content Sources and provide a searchable Catalog to present result sets through a customizable interface. Even with all the architectural changes in SharePoint 2010, Enterprise Search still provides that familiar structure so Administrators can focus on how content gets indexed and how result sets get displayed.

As previously mentioned, with the breakup of the SSP and the implementation of Service Applications (SAs), Enterprise Search can be implemented as a single Search Service but also as multiple, differently purposed services. This may provide a benefit in our feature-enhanced scenario. But the SA architecture alone provides a configuration option the SSP could not meet: purposed and assigned search content. By creating a Search Service Application (SSA), an Administrator may still focus on indexing content from many of the same familiar sources as in the past, including File Shares, Web Sites, Exchange Public Folders, the People Database and even Business Data Connectivity Applications. However, the key benefit of the SA architecture is the uncoupling of the Index itself from the single instance of the Index Server. Now referred to as the Crawl Component, the indexing server has an associated Crawl Database and propagates the Index as an Index Partition to a somewhat familiar module, the Query Component. On the Query Component, once referred to as the Search or Query Server, the Index exists as partitions on one or more servers, which can also be mirrored. The Query Component also keeps its own Property Database. On the crawl side, the ability to add more Crawl Components releases Administrators from being shackled to a single instance for crawling.
Enterprise Search has also changed in other ways that benefit the user experience. Because of the Federation Object Model, querying against different search engines is built in: results that do not depend on the native SharePoint Search Engine can be displayed, customized and even merged.

The inclusion of the Connector Framework allows for connections and crawling into new and different content sources, and even becomes the new method for leveraging Business Connectivity Services. And on the front end, new Search Site Templates with more Web Parts, which are also extendable, provide a richer context in which new Search Applications can be made. All of this bears huge potential for new, more innovative and more useful patterns in Enterprise Search.

Document Management – Revisited

Part of our scenario requires the use of the Document Center. Although many of its basic features can easily be reconfigured from a standard Team Site Template, an examination of the Document Center and its general features helps determine what can possibly improve our scenario. Many Document Management features of SharePoint have focused on standard management tools such as Check in/out, Versioning and Content Approval. Workflows, Content Types and Information Management Policies extended those capabilities, and SharePoint 2010 adds further features such as Document ID, Document Sets, and Content Organizer.

The Document Center Site Template takes advantage of what was available in 2007 and adds support for these newer features in 2010. This site template continues to be useful and applicable, and even hosts a new home page with additional Web Parts for better Document Management by users. Thus, while the Document Center does not impact the back end quite as much as Enterprise Search, for our scenario there can be plenty of additional benefits to be examined and possibly applied.


The initial part of this scenario will require nothing more than a single Web Application with one Site Collection, a Team Site as the Top-Level Site, and three Subsites: News, Documents, and Search. We will take our limited availability content and build our Search architecture, leveraging the Search Center and other Search configurations to achieve our goal. Following this, we will examine several other permutations of this scenario and see how far we can improve upon the targeted Search experience, if at all, and what we can use to do so.
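To make the layout above concrete, here is a rough sketch in plain Python. It is not SharePoint code and does not call any SharePoint API. It only models the Site Collection described above (a Team Site with News, Documents and Search Subsites) and shows the idea of limiting results to content under the Document Center by URL. The host name, the Subsite URLs and every class and function name are assumptions made up for illustration, and the actual configuration will be covered in the next post.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Site:
    """One site in the hierarchy; `template` is only a descriptive label."""
    title: str
    url: str
    template: str
    subsites: List["Site"] = field(default_factory=list)


def build_site_collection(root_url: str) -> Site:
    """Model a Team Site Top-Level Site with the three Subsites."""
    root = Site("Home", root_url, "Team Site")
    root.subsites = [
        Site("News", f"{root_url}/news", "Publishing Site"),
        Site("Documents", f"{root_url}/documents", "Document Center"),
        Site("Search", f"{root_url}/search", "Search Center"),
    ]
    return root


def in_scope(item_url: str, scope_root: str) -> bool:
    """True when item_url is scope_root itself or lies beneath it."""
    item = item_url.rstrip("/").lower()
    base = scope_root.rstrip("/").lower()
    # Require a path separator so "/documents-archive" does not match "/documents".
    return item == base or item.startswith(base + "/")


def targeted_results(item_urls: List[str], scope_root: str) -> List[str]:
    """Keep only the items that fall under the chosen scope root."""
    return [url for url in item_urls if in_scope(url, scope_root)]


if __name__ == "__main__":
    # Hypothetical host name, used only for this illustration.
    root = build_site_collection("http://intranet.example.com")
    documents = next(s for s in root.subsites if s.template == "Document Center")
    indexed_items = [
        "http://intranet.example.com/news/launch.aspx",
        "http://intranet.example.com/documents/catalog.docx",
        "http://intranet.example.com/documents-archive/old.docx",
    ]
    for url in targeted_results(indexed_items, documents.url):
        print(url)
```

The point of the sketch is only that everything still gets indexed, while the targeted search shows the subset that lives under one Subsite.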


Interestingly enough, the original solution required some work: I understood the specifics at the time, but they still had to be configured and tested to discover how such a solution would actually operate. However, it is the recreation and improvement of this solution that brings us to the gist of the scenario and the focus of the upcoming posts, which center on fashioning a 2010 solution.

In my next blog post, we will implement the recreated solution and analyze differences and changes as well as look into the specific configurations needed to achieve our goal.