Data Shredding Overview
Data Shredding is a new feature of SharePoint 2013 and is enabled by default. Its purpose is to reduce disk IO for updates to documents. It is particularly targeted at modern MS document formats (docx, xlsx etc.) and has mixed results on other document types (particularly compressed data, but also PDFs and older MS formats).
The best case scenario for Data Shredding is medium to large modern MS documents that are frequently updated with versioning turned on. The initial version of a file will take up slightly more space on disk (and thus require slightly greater disk IO), but subsequent updates will only write the modified chunks, which leads to a long-term reduction in both disk IO and total storage. The expected tipping point for a reduction in overall storage size is between 2 and 3 revisions of a document. It is also worth noting that without versioning the overall storage utilisation will always be higher, but the disk IO will still be lower.
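The storage trade-off described above can be sketched as a toy model. This is only an illustration under stated assumptions: the `overhead` and `changed` fractions are hypothetical inputs, not figures measured on SharePoint, and the model assumes that every revision rewrites the same fraction of the file.

```python
from typing import Optional


def unshredded_storage(file_size: int, versions: int) -> float:
    """Without shredding, every stored version keeps a full copy of the file."""
    return float(file_size * versions)


def shredded_storage(file_size: int, versions: int,
                     overhead: float, changed: float) -> float:
    """Toy model of shredded storage.

    overhead: hypothetical extra space per stored byte (e.g. 0.10 for 10%).
    changed:  hypothetical fraction of the file rewritten by each revision.
    """
    first_version = file_size * (1 + overhead)
    later_versions = (versions - 1) * file_size * changed * (1 + overhead)
    return first_version + later_versions


def tipping_point(file_size: int, overhead: float, changed: float,
                  max_versions: int = 50) -> Optional[int]:
    """Return the first version count at which shredding uses less space."""
    for versions in range(1, max_versions + 1):
        shredded = shredded_storage(file_size, versions, overhead, changed)
        if shredded < unshredded_storage(file_size, versions):
            return versions
    return None  # shredding never wins within max_versions
```

With a 10% overhead and half the file changing on each revision, `tipping_point(1_000_000, 0.10, 0.50)` returns 2, so the model breaks even at the second stored version. If every revision rewrites the whole file (`changed=1.0`) the function returns `None`, which matches the point that shredding only saves space when updates touch part of a document. Where the real tipping point falls depends on your own overhead and change rates.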
The RBS interface has not changed between SharePoint 2010 and SharePoint 2013. This means an RBS provider such as Stealth Content Store for SharePoint will still externalise the data provided to it as before. As we saw earlier, though, the data will now be shredded.
Seeing the impact of shredding on data stored within the database is fairly difficult, but when we inspect the external data storage utilised by RBS the impact of shredding becomes more obvious. The following images show a simple text file (Figure 1) that was uploaded to SharePoint 2013 and the resultant externalised blob (Figure 2).
If you look carefully you can see the original data (towards the end of the second line of text), but in this case there is also a large amount of extra data.
In larger files you would expect to see multiple shredded blobs (Figure 3).
Depending on file size and FileWriteChunkSize, you can expect to see blobs containing user data, Cobalt metadata or a combination of the two.
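To get a feel for how file size and FileWriteChunkSize interact, here is a minimal sketch of the arithmetic. It assumes, purely for illustration, that user data is cut into consecutive pieces of at most `chunk_size` bytes. The real shredding is done by SharePoint and also produces Cobalt metadata, so actual blob counts and contents will differ. The function names are hypothetical.

```python
from typing import List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut data into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def expected_blob_count(file_size: int, chunk_size: int) -> int:
    """Number of user-data pieces for a file of file_size bytes (ceiling division)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return -(-file_size // chunk_size)


# A 250-byte payload with a 100-byte chunk size yields pieces of 100, 100 and 50 bytes.
pieces = split_into_chunks(b"x" * 250, 100)
piece_lengths = [len(p) for p in pieces]
```

A file smaller than the chunk size ends up as a single piece in this model, which is consistent with the small text file above producing one blob, while larger files produce several.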
So, is Data Shredding worth it?
Well, the jury's still out on this one. If you have versioning enabled and use SharePoint primarily for the storage of your modern MS Office documents then it's a definite yes. If you use SharePoint as a file repository for commonly read but rarely modified documents then maybe not.
It is a matter of balancing your number of updates, preference for MS document formats and average file size to find out whether it is worthwhile. Such metrics would need to be taken on a live environment and would require many weeks of minor alterations followed by further testing to fine tune perfectly. Chances are though that an investigation of that magnitude is not worth the time it would take.
Microsoft's recommendation is obvious – they enable shredding by default. Unless your data clearly sits outside the type of files Data Shredding was created for there is no reason not to leave it on.
Doesn't RBS conflict with Data Shredding?
Not really. The two technologies were created to solve completely different problems. RBS externalises data to reduce Content Database size (by 80-95%) and to increase SharePoint performance with larger files. Data Shredding reduces disk IO and in some cases yields a small reduction in Content Database size.
The two technologies do come into a small amount of conflict though. Although at Stealth we see read/write performance improvements even with files under 100 KB, the greatest improvements come as the blobs get bigger. So RBS likes larger blobs, but Data Shredding creates smaller blobs. Fortunately there is the option to alter the size of the shredded chunks using FileWriteChunkSize. The sweet spot is between 100 KB and 2 MB and depends again on the number of versions and average file size. More versions favour a smaller chunk size, while larger files push us towards the upper end of the range.
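The tension between the two can be illustrated with a rough sketch. It assumes a simplified model in which a file is split into fixed-size chunks and an edit rewrites every chunk it overlaps. The function names and the sample sizes are hypothetical, and the model ignores Cobalt metadata entirely.

```python
from typing import Tuple

KB = 1024
MB = 1024 * KB


def chunks_touched(edit_offset: int, edit_length: int, chunk_size: int) -> int:
    """Number of fixed-size chunks overlapped by an edit of at least one byte."""
    first = edit_offset // chunk_size
    last = (edit_offset + edit_length - 1) // chunk_size
    return last - first + 1


def tradeoff(file_size: int, edit_offset: int, edit_length: int,
             chunk_size: int) -> Tuple[int, int]:
    """Return (blob_count, bytes_rewritten) for one edit under this toy model.

    bytes_rewritten is an upper bound, since the final chunk of a file
    may be shorter than chunk_size.
    """
    blob_count = -(-file_size // chunk_size)  # ceiling division
    touched = chunks_touched(edit_offset, edit_length, chunk_size)
    bytes_rewritten = min(touched * chunk_size, file_size)
    return blob_count, bytes_rewritten


# A 1 KB edit at the start of a 10 MB file, at both ends of the suggested range.
small_chunks = tradeoff(10 * MB, 0, 1 * KB, 100 * KB)
large_chunks = tradeoff(10 * MB, 0, 1 * KB, 2 * MB)
```

In this model the 100 KB setting gives many small blobs but rewrites little per edit, while the 2 MB setting gives a handful of large blobs (which suits RBS) at the cost of rewriting far more data for the same change. That is why frequent revisions pull the setting down and large files pull it up.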