Problem
As part of the validation, we must dump all the file details inside a SharePoint drive. Some more points below
FolderName, FileGraphId, FileName
FileGraphId is the unique id of the file that Graph API understands. It is not Guid or integer instead it uses a 34-character wide format like '01XXXXA4QGNCJMT7XXXVDKCNXXXV6LCC4E'
Approaches
There are 3 APIs to interact with SharePoint Online. The first Graph has no direct method to list all the files recursively. We have to go by individual Graph API calls to the list folders and make subsequent Graph calls to get the folders and files inside recursively. It works but is very slow and high chance of getting throttled. There is a search API but that has no guarantee of returning all the files as the indexing will take some time.
If we use the PnP.Core or PnP.Frameworks, there is no way to get the Graph IDs.
Solution
So lastly we have to go via Graph API but get the List items (not drive) then get the driveItem of list item to retrieve the Graph Id. The code is as follows.
var graphClient = new GraphServiceClient(tokenProvider);
//Url for below looks like /v1.0/sites/{siteGraphid}}/lists/{listGuid}/items?$select=contentType&$expand=driveItem HTTP/1.1
var listItems = await graphClient
.Sites[siteGraphId]
.Lists[listGuid] // Need to get the ListId from DriveId
.Items
.GetAsync(req=> {
req.QueryParameters.Expand = new []{"driveItem"};
req.QueryParameters.Select = new []{"contentType"};
});
listItems
.Value
.Where(item=>item.ContentType.Name == "Document")
.ToList()
.ForEach(item=>Console.WriteLine($"FolderName:{item.DriveItem.ParentReference.Name},File Graph Id:{item.DriveItem.Id}, File Name: {item.DriveItem.Name}"));
The drive Id needs to be translated to ListId to get this working. The QueryParameters.Select is used to limit the data to mainly 2 elements: driveItem and contentType.
Full code
The above code gives the idea how the required data can be pulled. Below goes full code that can be run inside Polyglot notebook.
#r "nuget:Microsoft.Graph"
#r "nuget:Azure.Identity"
#r "nuget:CsvHelper"
using Microsoft.DotNet.Interactive;
using Microsoft.Graph;
using Microsoft.Graph.Models;
using Azure.Core;
using Azure.Identity;
using System.Collections.Generic;
using CsvHelper;
using CsvHelper.Configuration;
using System.Globalization;
public class FooMap : ClassMap<(string FolderName,string FileGraphId,string FileName)>
{
public FooMap()
{
Map(m => m.FolderName).Name("FolderName");
Map(m => m.FileGraphId).Name("FileGraphId");
Map(m => m.FileName).Name("FileName");
}
}
TokenCredential tokenprovider = new UsernamePasswordCredential(
"{userName@tenant.onmicrosoft.com}",
"{password}",
"{AAD Tenant Id}",
"{AAD App registration id}");
var siteGraphId="{3 part graph Id of site}";
var listGuid = "{Guid of List}";// Get this using Drive.
var graphClient = new GraphServiceClient(tokenprovider);
//Url for below looks like /v1.0/sites/{siteGraphid}}/lists/{listGuid}/items?$select=contentType&$expand=driveItem HTTP/1.1
ListItemCollectionResponse listItems = await graphClient
.Sites[siteGraphId]
.Lists[listGuid]
.Items
.GetAsync(req=> {
req.QueryParameters.Expand = new []{"driveItem"};
req.QueryParameters.Select = new []{"contentType"};
});
var finalItems = new List<(string FolderName,string FileGraphId,string FileName)>();
var pageIterator = PageIterator<ListItem,ListItemCollectionResponse>
.CreatePageIterator(graphClient,listItems,(item)=>{
if(item.ContentType.Name == "Document")
{
finalItems.Add((FolderName:item.DriveItem.ParentReference.Name,FileGraphId:item.DriveItem.Id,FileName:item.DriveItem.Name));
}
return true;
});
await pageIterator.IterateAsync();
using(var outputFileWriter= new StreamWriter(@"c:\temp\export.csv"))
{
using (var csv=new CsvWriter(outputFileWriter,CultureInfo.InvariantCulture))
{
csv.Context.RegisterClassMap<FooMap>();
csv.WriteRecords(finalItems);
}
}
finalItems.ForEach((item)=>Console.WriteLine($"FolderName:{item.FolderName},File Graph Id:{item.FileGraphId}, File Name: {item.FileName}"))
Each(item=>Console.WriteLine($"FolderName:{item.DriveItem.ParentReference.Name},File Graph Id:{item.DriveItem.Id}, File Name: {item.DriveItem.Name}"));
The authentication used here is the ROPC username password. It needs to be changed based on best practices. It uses a nuget package for writing a list as a CSV file.
No comments:
Post a Comment