First Part in a series on how to periodically sync data from a SharePoint site to a Database
My workplace had a business need of periodically exporting large amounts of data from a number of SharePoint lists to an SQL database.
A number of tools exist that allowed me to do this, However, none of them are completely free or offer an all encompassing functionality.
So, I decided to build one by myself. C# was my language of choice for this app; the general flow of which will be as following:
- Store the details on which lists to sync, when to sync, the database credentials, and the SharePoint credentials in a configuration file.
- Get details for each List that needs to be synced, including the Field information (Name, Type etc.).
- Create a Table in the database for each List, containing all of the required fields as columns.
- Get a list of items from SharePoint.
- Add those items to the database Table if the items do not already exist.
- Update those items in the database Table if the items already exist.
- Delete any items from the database that are no longer present in the SharePoint List.
For database operations, we can simply use ADO.NET
. To get data from SharePoint, we have a couple of options.
Method 1: Using the SharePoint Database Structure
SharePoint stores all of the list data in a content database in a table called AllUserData
, which has the following structure:
TABLE AllUserData (
tp_Id int NOT NULL,
tp_ListId uniqueidentifier NOT NULL,
tp_SiteId uniqueidentifier NOT NULL,
tp_RowOrdinal tinyint NOT NULL DEFAULT ((0)),
tp_Version int NOT NULL,
tp_Author int NULL,
tp_Editor int NULL,
tp_Modified datetime NULL,
tp_Created datetime NULL,
... ... ...
bit1 - 1000 bit SPARSE NULL,
datetime1 - 550 datetime SPARSE NULL,
float1 - 550 float SPARSE NULL,
int1 - 750 int SPARSE NULL,
ntext1 - 262 nvarchar(max) SPARSE NULL,
nvarchar1 - 262 nvarchar(255) SPARSE NULL,
sql_variant1 - 262 sql_variant SPARSE NULL,
uniqueidentifier1 -350 uniqueidentifier SPARSE NULL,
geography1 - 2 varbinary(8000) SPARSE NULL,
tp_ColumnSet xml COLUMN_SET FOR ALL_SPARSE_COLUMNS NULL
)
Using tp_ListId
and tp_SiteId
, one can filter out the data for a certain List on a certain Site. The field data is stored in the sparse columns.
The next step is to find the mapping between each Field and the column which contains its data. This can be found in the AllLists
table:
TABLE AllLists (
tp_Id uniqueidentifier NOT NULL,
tp_ListId uniqueidentifier NOT NULL,
tp_SiteId uniqueidentifier NOT NULL,
tp_Title nvarchar(255) NOT NULL,
... ... ...
tp_Fields tCompressedString NULL
)
Field information for each list is present in the tp_fields
column. The issue here is that the data in this column is compressed. So we need to first uncompress the data and then parse the XML to get this information. Microsoft Docs
Method 2: Using the SharePoint CSOM Libraries
Microsoft provides C# libraries that can be used to connect to a SharePoint instance and perform CRUD operations. They call it the SharePoint Client Object Model (CSOM).
Using this library, we can easily get a collection of Lists present on SharePoint, and their fields.
Below is an example of how to get List details. Microsoft Docs
ClientContext context = new ClientContext("https://{site_url}");
Web web = context.Web;
context.Load(web.Lists,
lists => lists.Include(list => list.Title,
list => list.Id));
context.ExecuteQuery();
foreach (List list in web.Lists)
Console.WriteLine($"{list.Id}: {list.Title}");
Method 3: Using the REST API
SharePoint also provides a REST API. We can get information on Lists, their fields, and the item by using ODATA queries.
The following is an example on how a request is formed and the data that is returned. Microsoft Docs
GET https://{site_url}/_api/web/lists/GetByTitle('List Title')
Authorization: "Bearer " + accessToken
Accept: "application/json;odata=verbose"
<content type="application/xml">
<m:properties>
<d:AllowContentTypes m:type="Edm.Boolean">true</d:AllowContentTypes>
<d:BaseTemplate m:type="Edm.Int32">100</d:BaseTemplate>
<d:BaseType m:type="Edm.Int32">0</d:BaseType>
<d:ContentTypesEnabled m:type="Edm.Boolean">false</d:ContentTypesEnabled>
<d:Created m:type="Edm.DateTime">2012-06-26T23:15:58Z</d:Created>
<d:DefaultContentApprovalWorkflowId m:type="Edm.Guid">00000000-0000-0000-0000-000000000000</d:DefaultContentApprovalWorkflowId>
<d:Description>A list created by Project Based Retention used to store Project Policy Items.</d:Description>
<d:Direction>none</d:Direction>
<d:DocumentTemplateUrl m:null="true" />
<d:DraftVersionVisibility m:type="Edm.Int32">0</d:DraftVersionVisibility>
<d:EnableAttachments m:type="Edm.Boolean">true</d:EnableAttachments>
<d:EnableFolderCreation m:type="Edm.Boolean">false</d:EnableFolderCreation>
<d:EnableMinorVersions m:type="Edm.Boolean">false</d:EnableMinorVersions>
<d:EnableModeration m:type="Edm.Boolean">false</d:EnableModeration>
<d:EnableVersioning m:type="Edm.Boolean">false</d:EnableVersioning>
<d:EntityTypeName>ProjectPolicyItemList</d:EntityTypeName>
<d:ForceCheckout m:type="Edm.Boolean">false</d:ForceCheckout>
<d:HasExternalDataSource m:type="Edm.Boolean">false</d:HasExternalDataSource>
<d:Hidden m:type="Edm.Boolean">true</d:Hidden>
<d:Id m:type="Edm.Guid">74de3ff3-029c-42f9-bd2a-1e9463def69d</d:Id>
<d:ImageUrl>/_layouts/15/images/itgen.gif</d:ImageUrl>
<d:IrmEnabled m:type="Edm.Boolean">false</d:IrmEnabled>
<d:IrmExpire m:type="Edm.Boolean">false</d:IrmExpire>
<d:IrmReject m:type="Edm.Boolean">false</d:IrmReject>
<d:IsApplicationList m:type="Edm.Boolean">false</d:IsApplicationList>
<d:IsCatalog m:type="Edm.Boolean">false</d:IsCatalog>
<d:IsPrivate m:type="Edm.Boolean">false</d:IsPrivate>
<d:ItemCount m:type="Edm.Int32">0</d:ItemCount>
<d:LastItemDeletedDate m:type="Edm.DateTime">2012-06-26T23:15:58Z</d:LastItemDeletedDate>
<d:LastItemModifiedDate m:type="Edm.DateTime">2012-06-26T23:15:59Z</d:LastItemModifiedDate>
<d:ListItemEntityTypeFullName>SP.Data.ProjectPolicyItemListItem</d:ListItemEntityTypeFullName>
<d:MultipleDataList m:type="Edm.Boolean">false</d:MultipleDataList>
<d:NoCrawl m:type="Edm.Boolean">true</d:NoCrawl>
<d:ParentWebUrl>/</d:ParentWebUrl>
<d:ServerTemplateCanCreateFolders m:type="Edm.Boolean">true</d:ServerTemplateCanCreateFolders>
<d:TemplateFeatureId m:type="Edm.Guid">00bfea71-de22-43b2-a848-c05709900100</d:TemplateFeatureId>
<d:Title>Project Policy Item List</d:Title>
</m:properties>
</content>
The Setup
I decided to go with the CSOM Library. The first method will not work with SharePoint Online, and the third method does not give us strongly typed objects for working with SharePoint lists.
CSOM works with both OnPremise and Online SharePoint implementations and also provides methods and classes.
- Language: C#
- Database Operations: ADO.NET
- SharePoint Data Operations: SharePoint CSOM Library
- IDE: Visual Studio 2019 Community
In the next post, we will go over setting up the SharePoint connection, and creating functions to get List Details and List Items.
Top comments (0)