Paul Galvin's (old) SharePoint space [SharePoint

Just another WordPress.com site

Category Archives: SharePoint Search

Announcing the Microsoft Enterprise Search User Group

My blog has moved.  Please update your bookmarks accordingly: http://www.mstechblogs.com/paul

This article has moved: http://www.mstechblogs.com/paul/announcing-the-microsoft-enterprise-search-user-group

</end>

Subscribe to my blog.

Follow me on Twitter at http://www.twitter.com/pagalvin


Governance and SharePoint Search – It’s Never Too Late to Start

I wrote an article (http://searchwinit.techtarget.com/tip/0,289483,sid1_gci1345231_mem1,00.html#) for SearchWinIT.com on governance as it relates to SharePoint Search.  It’s not in my usual "voice" but that’s editing for you 🙂

Here is how it starts:

Although nearly every aspect of SharePoint can benefit from a strong governance plan, MOSS 2007’s enterprise search functionality benefits most of all.

Like all parts of SharePoint, there is good news and bad news about governance. For many organizations, the bad news is that it’s extremely difficult to incorporate a governance plan where none existed.

But here’s the good news: You can quickly configure and improve on enterprise search at almost any time. And when you implement a governance plan for enterprise search, you can see immediate results.

One of the problems with SharePoint and governance is that companies often get knee-deep into SharePoint with no governance plan, and by then there’s no easy way to fix it.  Not so with Search.  Read the article to get my thoughts on that subject.


Services on Server Does Not List Search — Why?

I was chatting today with Agnes Molnar (the only person I know in Hungary) about a strange search configuration problem.  Namely, search was missing from the "services on server" display (via Central Admin -> Operations -> Services on Server).

I had a look at a functional VM on my own machine and together, we determined that search was not installed on that server.  There are probably a few ways to do this, but we did it by confirming that "Office SharePoint Server Search" was missing from the list of services via Start -> Administrative Tools -> Services.

Oddly, the associated .exe *was* on the server ("C:\Program Files\Microsoft Office Servers\12.0\Bin\mssearch.exe").

I did a quick search and found this blog entry: http://msmvps.com/blogs/obts/archive/2006/10/19/189466.aspx

That’s an email chain with this key point:

"I solved this problem. It was my mistake. I choose "Web front end" instead of "Complete" during install."

This was promising, but we weren’t sure whether the installer had actually picked WFE instead of Complete when installing MOSS.

We checked for the first (earliest) version of the PSCDiagnostics* file in the 12 hive log directory and in there, we found that the installer had, in fact, configured this server to be a web front end.  End of story and it had a happy ending.
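The log check above can be sketched in a few lines of Python. This is only a rough sketch: it assumes the oldest file can be picked by modification time, that the log reads as UTF-8 (undecodable bytes are replaced), and that you supply your own keyword to look for. The function names are made up for illustration, and I am not quoting the exact text the installer writes.

```python
from pathlib import Path


def earliest_psc_log(log_dir):
    """Return the oldest PSCDiagnostics* file in log_dir, or None if there is none."""
    logs = sorted(
        Path(log_dir).glob("PSCDiagnostics*"),
        key=lambda p: p.stat().st_mtime,  # assumption: oldest = earliest modified
    )
    return logs[0] if logs else None


def lines_mentioning(log_path, keyword):
    """Return the lines of log_path that contain keyword, ignoring case."""
    # Assumption: UTF-8 is good enough to find plain-ASCII keywords.
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    needle = keyword.lower()
    return [line for line in text.splitlines() if needle in line.lower()]
```

Point `earliest_psc_log` at the 12 hive LOGS folder on the server in question, then pass the result to `lines_mentioning` with whatever term identifies the install type in your log.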

(Somewhere along the line, Bob Fox got involved, but all I remember him contributing to the discussion was a comment about Fable 2).

Update: Agnes blogs about this subject here: http://dotneteers.net/blogs/aghy/archive/2008/11/06/wfe-vs-complete-installation.aspx


Quick Tip: Use “IsDocument:1” to Trim Search Results

Update 11/03/08: Fellow MVP Mike Walsh correctly points out that this is a WSS 3.0 / MOSS feature.  It does not work in WSS 2.0 or earlier.

Update 11/03/08: (Second update in one day!): Be sure to read the excellent comment from "nowise" for more info and another good xref link.

Two questions came up in rapid succession this week on the MSDN forums asking a variation of this:

"When I search a keyword, folders from my document library with that keyword in their path will come out first in my search results. I don’t want that to happen. Files with that keyword are more important to me.  I don’t want to see folders at all."

This is actually quite easy to do out of the box.  Simply add "IsDocument:1" to the search query and SharePoint search (both WSS and MOSS) will restrict itself to showing actual documents.
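If you build search links in code, a small helper can tack the term on for you. The sketch below is illustrative only: the helper names are mine, and both the results-page address and the "k" query-string parameter are assumptions you should check against your own search center.

```python
from urllib.parse import urlencode


def documents_only(query):
    """Append IsDocument:1 to a keyword query unless it is already present."""
    terms = query.split()
    if not any(term.lower() == "isdocument:1" for term in terms):
        terms.append("IsDocument:1")
    return " ".join(terms)


def results_url(base_url, query):
    """Build a results-page link; the 'k' parameter name is an assumption."""
    return base_url + "?" + urlencode({"k": documents_only(query)})
```

For example, `documents_only("budget")` gives `budget IsDocument:1`, which is exactly what a user would otherwise type into the search box by hand.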


Has Your Search Committee Met This Month?

It’s the beginning of the month and now is as good a time as any for your company’s search committee to get together and analyze Best Bets, successful and not so successful searches, etc.

You don’t have a search committee?  Then form one 🙂

WSS and especially MOSS search benefit from some human oversight.  Investing a few hours every month is not only more fun than a barrel of monkeys, it can:

  • Give insight into the information needs of the enterprise.  If people are searching left and right for topic "xyzzy," you know that’s an important topic to the enterprise. 
  • Identify potential training requirements.  If people are searching for topic "xyzzy" but should really be searching for "abcd" then you can use that to educate folks on where and how to find the information.
  • Help your organization refine its information architecture. 
  • Identify opportunities to enhance the thesaurus.
  • Other opportunities will no doubt present themselves.
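To give the committee something concrete to look at, here is a rough sketch of the kind of tally you might run before the meeting. It assumes you have already exported the search usage data into (query text, result count) pairs; that export format and the function name are hypothetical, not something SharePoint hands you in this shape.

```python
from collections import Counter


def summarize_queries(rows, top_n=5):
    """Tally popular queries and queries that found nothing.

    rows is an iterable of (query_text, result_count) pairs.
    Returns (most common queries, most common zero-result queries).
    """
    counts = Counter()
    zero_hits = Counter()
    for query, result_count in rows:
        key = query.strip().lower()  # treat "Xyzzy" and "xyzzy" as one topic
        counts[key] += 1
        if result_count == 0:
            zero_hits[key] += 1
    return counts.most_common(top_n), zero_hits.most_common(top_n)
```

The first list hints at what the enterprise cares about; the second is a starting point for Best Bets, thesaurus entries or training.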

Who should be on the search committee?  You would know your people best, but consider:

  • At least one (and maybe only one) IT person who understands (or can learn) the various ways to tweak search, including best bets, thesaurus, managed properties, etc.
  • Several subject matter experts who can read the search reports, digest them and communicate business-savvy actions to IT so that IT can push the buttons, pull the levers and open/close valves as necessary to act on committee recommendations.
  • One or more information architects who can validate, one way or another, whether the information architecture is search friendly and whether it’s working out well for the enterprise.
  • A rotating seat on the committee.  Bring in one or two people who don’t normally participate in these kinds of efforts.  They may bring unusual and valuable insights to the table. 

Happy analyzing!


Configure Thesaurus in MOSS

I’m working on an architecture review document this week and it suggests, among other things, that the client consider using the thesaurus to help improve the end user search experience.  Having never done this myself, I wanted to do a quick hands-on test so that my suggestion is authentic. 

It was surprisingly difficult to figure out how to do, although it is, in fact, quite easy.  There’s a pretty good bit of information on the thesaurus (check here and here, for example).  However, those docs are either WSS 2.0 / SPS 2003 oriented or they don’t actually spell out what to do after you’ve made your changes in the thesaurus.  They provide a great overview and a fair bit of detail, but it’s not enough to cross the finish line.

These steps worked for me:

  1. Make the changes to the thesaurus.  (See below for an important note)
  2. Go to the server and restart the "Office SharePoint Server Search" service.

A tip of the hat to Mr. J. D. Wade (bio).  He provided the key bit about restarting the search service and rescued me from endless, time consuming and unnecessary iisresets and full index crawls.  This episode proves, once again, that Twitter is the awesome.  (Follow me on twitter here.  I follow any SharePoint person that follows me).

I don’t know if this functionality is available in WSS.  If it is or is not, please leave a comment or email me and I’ll update this post.

Important note: There’s conflicting information on which XML thesaurus file to change.  There’s this notion of "tsneu.xml" as being the "neutral" thesaurus.  I wasted some time working with that one.  In my case, I needed to change the "tsenu.xml" file located under the folder of the app ID itself: \\win2003srv\c$\Program Files\Microsoft Office Servers\12.0\Data\Office Server\Applications\3c4d509a-75c5-481c-8bfd-099a89554e17\Config.  I assume that in a multi-farm situation, you would make this change everywhere a query server runs.
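For step 1, the change itself is a small XML entry. The sketch below builds one in Python so you can see its shape. The element names (`expansion` and `sub`) are my recollection of the Microsoft Search thesaurus format, so treat them as an assumption and compare against the commented-out sample inside your own tsenu.xml before pasting anything in.

```python
import xml.etree.ElementTree as ET


def expansion_entry(terms):
    """Build an <expansion> element whose terms should all match one another.

    Assumption: the thesaurus file expects <expansion> with one <sub> per term.
    """
    expansion = ET.Element("expansion")
    for term in terms:
        ET.SubElement(expansion, "sub").text = term
    return expansion


def as_text(element):
    """Serialize an element to a string you can paste into the thesaurus file."""
    return ET.tostring(element, encoding="unicode")
```

Calling `as_text(expansion_entry(["IE", "Internet Explorer"]))` yields a one-line `<expansion>` element. After saving the file, step 2 (restarting the search service) still applies.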


SharePoint and FAST — the Reese’s Peanut Butter Cups of Enterprise Apps?

I’ve finished up day 2 of FAST training in sunny Needham, MA, and I’m bursting with ideas (which all the good training classes do to me).  One particular aspect of FAST has me thinking and I wanted to write it down while it was still fresh, before normal day-to-day "stuff" pushed it out of my head.

We SharePoint WSS 3.0 / MOSS implementers frequently face a tough problem with any reasonably-sized SharePoint project: How do we get all the untagged data loaded into SharePoint such that it all fits within our perfectly designed information architecture?

Often enough, this isn’t such a hard problem because we scope ourselves out of trouble: "We don’t care about anything more than 3 months old."  "We’ll handle all that old stuff with keyword search and going-forward we’ll do it the RIGHT way…"  Etc. 

But what happens if we can’t scope ourselves out of trouble and we’re looking at tens of thousands or hundreds of thousands (or even millions) of docs, the loading and tagging of which is our devout wish?

FAST might be the answer.

FAST’s search process includes a lot of moving parts but one simplified view is this:

  • A crawler process looks for content.
  • It finds content and hands it off to a broker process that manages a pool of document processors.
  • Broker process hands it off to one of the document processors.
  • The document processor runs the document through a pipeline process, analyzes the bejeezus out of it and hands it off to an index builder type process.

On the starship FAST, we have a lot of control over the document processing pipeline.  We can mix and match about 100 pipeline components and, most interestingly, we can write our own components.  Like I say, FAST is analyzing documents every which way but Sunday and it compiles a lot of useful information about those documents.  Those crazy FAST people are clearly insane and obsessive about document analysis because they have tools and/or strategies to REALLY categorize documents.
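To make the "mix and match components" idea concrete, here is a toy pipeline in Python. It is emphatically not FAST’s API: the stage functions, the dictionary-shaped document and the `broker` helper are all invented for illustration, and the real product does far more than count words.

```python
def run_pipeline(document, stages):
    """Pass a document (a dict) through each stage in order and return the result."""
    for stage in stages:
        document = stage(document)
    return document


def detect_length(doc):
    """Example stage: record how many words the document has."""
    return {**doc, "word_count": len(doc["text"].split())}


def tag_keywords(doc):
    """Example custom stage: keep longer words as crude keywords."""
    keywords = sorted({word.lower() for word in doc["text"].split() if len(word) > 6})
    return {**doc, "keywords": keywords}


def broker(documents, stages):
    """Stand-in for the broker: hand each crawled document to a processor."""
    return [run_pipeline(doc, stages) for doc in documents]
```

Swapping, reordering or adding stages is just a matter of changing the list passed to `broker`, which is the property that makes a custom component attractive.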

So … using FAST in combination with our own custom pipeline component, we can grab all that context information from FAST and feed it back to MOSS.  It might go something like this:

  • Document is fed into FAST from MOSS.
  • Normal crazy-obsessive FAST document parsing and categorization happens.
  • Our own custom pipeline component drops some of that context information off to a database.
  • A process of our own design reads the context information, makes some decisions on how to fit that MOSS document within our IA and marks it up using a web service and the object model.
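The last two bullets might be sketched like this. Everything here is hypothetical: the table layout, the category names and the mapping to content types are stand-ins, SQLite is used only so the example runs anywhere, and the step that actually writes metadata back to MOSS is deliberately left out.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the hand-off database that sits between the pipeline and the tagger."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS doc_context (url TEXT PRIMARY KEY, category TEXT)"
    )
    return conn


def record_context(conn, url, category):
    """What a custom pipeline component might do: save what the engine learned."""
    conn.execute(
        "INSERT OR REPLACE INTO doc_context (url, category) VALUES (?, ?)",
        (url, category),
    )
    conn.commit()


def plan_tags(conn, category_to_content_type):
    """Decide where each document fits in the IA; returns (url, content_type) pairs."""
    rows = conn.execute("SELECT url, category FROM doc_context ORDER BY url").fetchall()
    return [
        (url, category_to_content_type.get(category, "Unclassified"))
        for url, category in rows
    ]
```

Keeping the decision step separate from the write-back step means a person can review the plan before anything in SharePoint gets marked up.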

Of course, no such automated process can be perfect but thanks to the obsessive (and possibly insane-but-in-a-good-way) FAST people, we may have a real fighting shot at a truly effective mass load process that does more than just fill up a SQL database with a bunch of barely-searchable documents.


Faceted Search Fence Sitter No More

I had reason today to play about with the CodePlex faceted search project.

It’s been around for a while, but I hesitated to download and use it for the usual reasons (mainly lack of time), plus outright fear 🙂

If you’re looking to improve your search and explore new options, download it and install it when you have an hour or so of free time.  I followed the installation manual’s instructions and it took me less than 20 minutes to have it installed and working.  It provides value from minute zero.

It does look pretty hard to extend.  The authors provide a detailed walk-through for a complex BDC scenario.  I may be missing it, but I wish they would also provide a simpler scenario involving one of the pre-existing properties or maybe adding one new managed property.  I shall try and write that up myself in the next period of time.

Bottom line: in minutes, you can install it, configure it, use it, add some pretty cool functionality to your vanilla MOSS search and be a hero 🙂


SharePoint Wildcard Search: “Pro” Is Not a Stem of “Programming”

On the MSDN search forum, people often ask a question like this:

"I have a document named ‘Programming Guide’ but when I search for ‘Pro’ search does not find it."

It may not feel like it, but that amounts to a wildcard search.  The MOSS/WSS user interface does not support wildcard search out of the box.

If you dig into the search web parts, you’ll find a checkbox, "Enable search term stemming".  Stemming is a human-language term.  It’s not a computer language substring() type function.

These are some stems:

  • "fish" is a stem to "fishing"
  • "major" is a stem to "majoring"

These are not stems:

  • "maj" is not a stem to "major"
  • "pro" is not a stem to "programmer"
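The difference is easy to see in code. The toy stemmer below just strips a few English suffixes; it is nothing like the linguistic stemmer a real search engine uses, and the function names are mine, but it is enough to show why a stem match and a prefix match give different answers.

```python
SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Very rough stemmer: strip one common English suffix if the word is long enough."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stem_match(term, word):
    """True when both words reduce to the same stem (what stemming gives you)."""
    return toy_stem(term) == toy_stem(word)


def prefix_match(term, word):
    """True when word starts with term (what a wildcard search gives you)."""
    return word.lower().startswith(term.lower())
```

With these, "fish" stem-matches "fishing", while "pro" only prefix-matches "programming". The user asking about "Pro" wants the second behavior, and stemming does not provide it.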

The WSS/MOSS search engine does support wildcard search through the API.  Here is one blog article that describes how to do that: http://www.dotnetmafia.com/blogs/dotnettipoftheday/archive/2008/03/06/how-to-use-the-moss-enterprise-search-fulltextsqlquery-class.aspx
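As a rough idea of what such a query might look like, the sketch below builds the query text in Python. The SQL shape (`SCOPE()` plus a `CONTAINS` predicate with a trailing `*`) is my recollection of the Enterprise Search SQL syntax, so treat it as an assumption and confirm it against the linked article. The helper name and the default columns are purely illustrative.

```python
def wildcard_query(prefix, columns=("Title", "Path")):
    """Build search SQL text that matches words starting with prefix.

    Assumption: the engine accepts CONTAINS('"term*"') for prefix matching.
    """
    # Double single quotes and drop double quotes so the term cannot
    # break out of the quoted CONTAINS expression.
    safe = prefix.replace("'", "''").replace('"', "")
    return "SELECT {} FROM SCOPE() WHERE CONTAINS('\"{}*\"')".format(
        ", ".join(columns), safe
    )
```

The resulting string is what you would hand to the query class described in the linked article; running it requires server-side code and is outside what this sketch covers.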

A 3rd party product, Ontolica, provides wildcard search.  I have not used that product.
