
Entire contents of a file that is crawled by Microsoft SharePoint may not be searchable

Microsoft Knowledge Base Article

The contents of this article are Microsoft copyrighted material.
©2005-©2007 Microsoft Corporation. All rights reserved. Terms of Use | Trademarks

Article ID: 970776 - Last Review: May 23, 2011 - Revision: 2.0


Source: Microsoft Support

RAPID PUBLISHING

RAPID PUBLISHING ARTICLES PROVIDE INFORMATION DIRECTLY FROM WITHIN THE MICROSOFT SUPPORT ORGANIZATION. THE INFORMATION CONTAINED HEREIN IS CREATED IN RESPONSE TO EMERGING OR UNIQUE TOPICS, OR IS INTENDED TO SUPPLEMENT OTHER KNOWLEDGE BASE INFORMATION.

Symptom

The entire contents of a file that is crawled by Microsoft SharePoint may not be searchable.

Cause

The chunk buffer size used by the Search service may be too small.

Resolution

Because the amount of text in a file varies, no single buffer size is guaranteed to work for all files. For example, a 10 MB PPT file may not require a larger chunk buffer, whereas a 4 MB TXT file containing a high percentage of unique terms will. This is because some of the words in a PPT file may appear only in images, and text in images is not indexed. The amount of chunk buffer space needed is therefore directly related to the number of unique words that can be indexed in a file.
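To get a rough feel for which files are likely to need more chunk buffer space, the unique terms in extracted text can be counted. The following is a minimal sketch only: the function name `count_unique_terms` is hypothetical, and the `\w+` pattern is an assumed stand-in for the word breaker that the Search service actually uses, so the counts are indicative rather than exact.

```python
import re


def count_unique_terms(text: str) -> int:
    """Count distinct case-insensitive word tokens in a string.

    Assumption: \\w+ only approximates the Search service's word breaker.
    """
    return len({token.lower() for token in re.findall(r"\w+", text)})


# Two texts with the same number of words but very different vocabularies.
repetitive = "status ok " * 1000
varied = " ".join(f"term{i}" for i in range(1000))

print(count_unique_terms(repetitive))  # 2
print(count_unique_terms(varied))      # 1000
```

A file that scores high on a count like this is a better candidate for the representative test file than one that is merely large.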

Instead, increase the buffer size gradually until the entire contents of a file that is representative of your content source are searchable.

Note: Increasing these values causes the Search service to consume more memory on the indexer machine, so do not change them arbitrarily.

To increase the chunk buffer size, change the following registry values on the Indexer:

Important: This section contains steps that tell you how to modify the registry. Serious problems might occur if you modify the registry incorrectly, so follow these steps carefully. For added protection, back up the registry before you modify it so that you can restore it if a problem occurs. For more information about how to back up and restore the registry, see the following article in the Microsoft Knowledge Base:

How to back up and restore the registry in Windows 
http://kbalertz.com/Feedback.aspx?kbNumber=322756/


NOTE: For SharePoint 2010, 12.0 appears as 14.0 in the registry paths below.

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\CB_ChunkBufferSizeInMegaBytes

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\CB_MinBytesReservedForDoc

CB_ChunkBufferSizeInMegaBytes is expressed in MB while CB_MinBytesReservedForDoc is expressed in bytes.
 

Because a chunk buffer must hold other structures in addition to raw data, CB_MinBytesReservedForDoc should correspond to approximately 2 MB less than the CB_ChunkBufferSizeInMegaBytes value.

For instance, if CB_ChunkBufferSizeInMegaBytes is increased to 10 (decimal), CB_MinBytesReservedForDoc should be 8388608 (decimal), which is 8 MB expressed in bytes.
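The arithmetic behind this example can be sketched as follows. The helper name `min_bytes_reserved_for_doc` is hypothetical, and the sketch assumes that "approximately 2MB less" is taken as exactly 2 MB, with 1 MB equal to 1,048,576 bytes. It applies only to increased sizes, since the shipped defaults do not follow this rule exactly.

```python
MIB = 1024 * 1024   # bytes per megabyte, as used in the article's example
OVERHEAD_MB = 2     # headroom for non-data structures, per the article


def min_bytes_reserved_for_doc(chunk_buffer_mb: int) -> int:
    """Return a CB_MinBytesReservedForDoc value (bytes) for a buffer size (MB)."""
    if chunk_buffer_mb <= OVERHEAD_MB:
        raise ValueError("chunk buffer must be larger than the 2 MB overhead")
    return (chunk_buffer_mb - OVERHEAD_MB) * MIB


# Candidate sizes for a gradual increase; the step values are illustrative.
for size_mb in (10, 12, 16):
    print(size_mb, min_bytes_reserved_for_doc(size_mb))
# 10 8388608
# 12 10485760
# 16 14680064
```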

Restart the Search service for the changes to take effect.
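One way to keep a record of the change is to generate a .reg file rather than editing the values by hand. The sketch below only builds the file text. It assumes that both values are stored as REG_DWORD under the Gathering Manager key (the article does not state the value type), and the function name `build_reg_file` is hypothetical. Review the output and back up the registry before importing anything.

```python
GATHERING_MANAGER_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server"
    r"\{version}\Search\Global\Gathering Manager"
)


def build_reg_file(version: str, chunk_buffer_mb: int, min_bytes_reserved: int) -> str:
    """Build .reg file text for the two chunk buffer values.

    version: "12.0" for Office SharePoint Server 2007, "14.0" for SharePoint 2010.
    Assumption: both values are REG_DWORD, written here as 8-digit hexadecimal.
    """
    key = GATHERING_MANAGER_KEY.format(version=version)
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"CB_ChunkBufferSizeInMegaBytes"=dword:{chunk_buffer_mb:08x}',
        f'"CB_MinBytesReservedForDoc"=dword:{min_bytes_reserved:08x}',
        "",
    ]
    return "\n".join(lines)


# The article's example: a 10 MB buffer with 8388608 bytes reserved.
print(build_reg_file("12.0", 10, 8388608))
```

For the article's example, the two value lines come out as dword:0000000a and dword:00800000. The Search service still has to be restarted after the values are imported.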

More Information



Q: What are chunk buffers?
A: A chunk buffer is a shared block of memory used by Filter Daemons (mssdmn.exe) to provide filtered, word-broken data and raw data to the Indexer (mssearch.exe). The Indexer populates the indexes with the data in chunk buffers.

Q: What are the default values?
A: By default, SharePoint Server ships with the following values:

32-bit defaults:
CB_ChunkBufferSizeInMegaBytes default value: 0x2 (2 MB)
CB_MinBytesReservedForDoc default value: 0xb2000 (729,088 bytes)

64-bit defaults:
CB_ChunkBufferSizeInMegaBytes default value: 0x8 (8 MB)
CB_MinBytesReservedForDoc default value: 0x300000 (3,145,728 bytes)

Q: How many chunk buffers are created by default?
A: Four (4)

Q: Which counters should be monitored to observe the impact of changes?
A: Before modifying these values, use perfmon to note the maximum values of the Process:Private Bytes and Process:Virtual Bytes counters for the MSSearch.exe process after a full crawl. At a minimum, these values will increase by 4 * CB_ChunkBufferSizeInMegaBytes (in MB). The increased buffer size may negatively affect Search service performance and reliability if the resulting values exhaust available physical memory or virtual address space.
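The minimum increase described in this answer is a simple product of the buffer count and the buffer size. A minimal sketch, in which the function name `minimum_memory_increase_mb` is hypothetical and the count of four buffers is the default stated above:

```python
DEFAULT_BUFFER_COUNT = 4  # chunk buffers created by default, per the article


def minimum_memory_increase_mb(
    chunk_buffer_mb: int, buffer_count: int = DEFAULT_BUFFER_COUNT
) -> int:
    """Lower bound (MB) on the growth in Private Bytes and Virtual Bytes."""
    return buffer_count * chunk_buffer_mb


print(minimum_memory_increase_mb(10))  # 40
```

Comparing this figure with the baseline counter values recorded after a full crawl gives an early indication of whether a planned buffer size fits within available memory and address space.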

DISCLAIMER

MICROSOFT AND/OR ITS SUPPLIERS MAKE NO REPRESENTATIONS OR WARRANTIES ABOUT THE SUITABILITY, RELIABILITY OR ACCURACY OF THE INFORMATION CONTAINED IN THE DOCUMENTS AND RELATED GRAPHICS PUBLISHED ON THIS WEBSITE (THE “MATERIALS”) FOR ANY PURPOSE. THE MATERIALS MAY INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS AND MAY BE REVISED AT ANY TIME WITHOUT NOTICE.

TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, MICROSOFT AND/OR ITS SUPPLIERS DISCLAIM AND EXCLUDE ALL REPRESENTATIONS, WARRANTIES, AND CONDITIONS WHETHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO REPRESENTATIONS, WARRANTIES, OR CONDITIONS OF TITLE, NON INFRINGEMENT, SATISFACTORY CONDITION OR QUALITY, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WITH RESPECT TO THE MATERIALS.
Note: This is a "FAST PUBLISH" article created directly from within the Microsoft support organization. The information contained herein is provided as-is in response to emerging issues. As a result of the speed in making it available, the materials may include typographical errors and may be revised at any time without notice. See Terms of Use (http://go.microsoft.com/fwlink/?LinkId=151500) for other considerations.

APPLIES TO
  • Microsoft Office SharePoint Server 2007
  • Microsoft SharePoint Server 2010
Keywords: 
kbrapidpub kbnomt KB970776
       
