Copilot Chat: support multiple document import (#1675)
### Motivation and Context
Copilot Chat currently supports importing only one document at a time.
Supporting multiple documents will improve the user experience.

### Description
1. Add multi-document support to DocumentImportController, along with
some minor refactoring.
2. Add a configurable limit on the number of documents that can be
imported at a time. It is currently set to 10.
3. Enable support in the webapp for both drag-and-drop and the file
explorer. Update the document history item to show multiple files.
4. Update the import document console app to support multi-document import.
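
The configurable limit in item 2 can be illustrated with a minimal sketch. This is not the code from this commit: `DocumentImportOptions`, `FileCountLimit`, and `ImportValidator` are hypothetical names chosen for illustration, and the real configuration class and controller logic may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage: 11 hypothetical file names checked against a limit of 10.
var fileNames = Enumerable.Range(1, 11).Select(i => $"doc{i}.txt").ToList();
var options = new DocumentImportOptions { FileCountLimit = 10 };

if (ImportValidator.TryValidate(fileNames, options, out string error))
{
    Console.WriteLine("Import request accepted.");
}
else
{
    Console.WriteLine($"Import request rejected: {error}");
}

// Hypothetical options type (an assumption, not the commit's class).
public sealed class DocumentImportOptions
{
    // Maximum number of documents accepted in a single import request.
    public int FileCountLimit { get; set; } = 10;
}

// Hypothetical validator (an assumption, not the commit's controller code).
public static class ImportValidator
{
    // Returns true when the request holds at least one file and no more than the limit.
    public static bool TryValidate(
        IReadOnlyCollection<string> fileNames,
        DocumentImportOptions options,
        out string error)
    {
        if (fileNames.Count == 0)
        {
            error = "No files were provided.";
            return false;
        }

        if (fileNames.Count > options.FileCountLimit)
        {
            error = $"Too many files: {fileNames.Count} exceeds the limit of {options.FileCountLimit}.";
            return false;
        }

        error = string.Empty;
        return true;
    }
}
```

Rejecting the whole request up front, before any file is read, keeps an over-limit import from being partially processed.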

![image](https://github.com/microsoft/semantic-kernel/assets/12570346/64e025fb-de71-4bef-9903-08ad570c5e1e)

Future work:
https://github.com/orgs/microsoft/projects/852/views/1?pane=issue&itemId=31798351

### Contribution Checklist
- [ ] The code builds clean without any errors or warnings
- [ ] The PR follows SK Contribution Guidelines
(https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
- [ ] The code follows the .NET coding conventions
(https://learn.microsoft.com/dotnet/csharp/fundamentals/coding-style/coding-conventions)
verified with `dotnet format`
- [ ] All unit tests pass, and I have added new tests where possible
- [ ] I didn't break anyone 😄

---------

Co-authored-by: Aman Sachan <51973971+amsacha@users.noreply.github.com>
TaoChenOSU and amsacha authored Jul 11, 2023
1 parent 5e70eef commit 36f4e7c
Showing 14 changed files with 448 additions and 155 deletions.
50 changes: 30 additions & 20 deletions samples/apps/copilot-chat-app/importdocument/Program.cs
@@ -13,7 +13,7 @@
namespace ImportDocument;

/// <summary>
/// This console app imports a file to the CopilotChat WebAPI document memory store.
/// This console app imports a list of files to the CopilotChat WebAPI document memory store.
/// </summary>
public static class Program
{
@@ -26,30 +26,30 @@ public static void Main(string[] args)
return;
}

var fileOption = new Option<FileInfo>(name: "--file", description: "The file to import to document memory store.")
var filesOption = new Option<IEnumerable<FileInfo>>(name: "--files", description: "The files to import to document memory store.")
{
IsRequired = true
IsRequired = true,
AllowMultipleArgumentsPerToken = true,
};

// TODO: UI to retrieve ChatID from the WebApp will be added in the future with multi-user support.
var chatCollectionOption = new Option<Guid>(
name: "--chat-id",
description: "Save the extracted context to an isolated chat collection.",
getDefaultValue: () => Guid.Empty
);

var rootCommand = new RootCommand(
"This console app imports a file to the CopilotChat WebAPI's document memory store."
"This console app imports files to the CopilotChat WebAPI's document memory store."
)
{
fileOption, chatCollectionOption
filesOption, chatCollectionOption
};

rootCommand.SetHandler(async (file, chatCollectionId) =>
rootCommand.SetHandler(async (files, chatCollectionId) =>
{
await UploadFileAsync(file, config!, chatCollectionId);
await ImportFilesAsync(files, config!, chatCollectionId);
},
fileOption, chatCollectionOption
filesOption, chatCollectionOption
);

rootCommand.Invoke(args);
@@ -97,17 +97,20 @@ private static async Task<bool> AcquireUserAccountAsync(
}

/// <summary>
/// Conditionally uploads a file to the Document Store for parsing.
/// Conditionally imports a list of files to the Document Store.
/// </summary>
/// <param name="file">The file to upload for injection.</param>
/// <param name="files">A list of files to import.</param>
/// <param name="config">Configuration.</param>
/// <param name="chatCollectionId">Save the extracted context to an isolated chat collection.</param>
private static async Task UploadFileAsync(FileInfo file, Config config, Guid chatCollectionId)
private static async Task ImportFilesAsync(IEnumerable<FileInfo> files, Config config, Guid chatCollectionId)
{
if (!file.Exists)
foreach (var file in files)
{
Console.WriteLine($"File {file.FullName} does not exist.");
return;
if (!file.Exists)
{
Console.WriteLine($"File {file.FullName} does not exist.");
return;
}
}

IAccount? userAccount = null;
@@ -120,11 +123,12 @@ private static async Task UploadFileAsync(FileInfo file, Config config, Guid cha
}
Console.WriteLine($"Successfully acquired User ID. Continuing...");

using var fileContent = new StreamContent(file.OpenRead());
using var formContent = new MultipartFormDataContent
using var formContent = new MultipartFormDataContent();
List<StreamContent> filesContent = files.Select(file => new StreamContent(file.OpenRead())).ToList();
for (int i = 0; i < filesContent.Count; i++)
{
{ fileContent, "formFile", file.Name }
};
formContent.Add(filesContent[i], "formFiles", files.ElementAt(i).Name);
}

var userId = userAccount!.HomeAccountId.Identifier;
var userName = userAccount.Username;
@@ -153,6 +157,12 @@ private static async Task UploadFileAsync(FileInfo file, Config config, Guid cha
// Calling UploadAsync here to make sure disposable objects are still in scope.
await UploadAsync(formContent, accessToken!, config);
}

// Dispose of all the file streams.
foreach (var fileContent in filesContent)
{
fileContent.Dispose();
}
}

/// <summary>
@@ -185,7 +195,7 @@ private static async Task UploadAsync(
try
{
using HttpResponseMessage response = await httpClient.PostAsync(
new Uri(new Uri(config.ServiceUri), "importDocument"),
new Uri(new Uri(config.ServiceUri), "importDocuments"),
multipartFormDataContent
);
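
The `ImportFilesAsync` change above adds every file to a single `MultipartFormDataContent` under the shared field name `formFiles`. The following standalone sketch shows that pattern in isolation. `FormBuilder` is a hypothetical helper, not part of the commit, and instead of the commit's explicit dispose loop it relies on `MultipartFormDataContent` disposing its parts (and their streams) when the form itself is disposed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;

// Usage: create two temporary files, then build one form containing both.
var paths = new[] { Path.GetTempFileName(), Path.GetTempFileName() };
foreach (var path in paths)
{
    File.WriteAllText(path, "sample content");
}

using (var form = FormBuilder.Build(paths.Select(p => new FileInfo(p))))
{
    // MultipartFormDataContent enumerates its parts, so LINQ can count them.
    Console.WriteLine($"Parts in form: {form.Count()}");
}

// The streams were closed when the form was disposed, so the files can be deleted.
foreach (var path in paths)
{
    File.Delete(path);
}

// Hypothetical helper (an assumption for illustration only).
public static class FormBuilder
{
    // Adds each file as a separate part under the same field name, "formFiles".
    public static MultipartFormDataContent Build(IEnumerable<FileInfo> files)
    {
        var form = new MultipartFormDataContent();
        try
        {
            // Enumerate once so the stream and the file name always come from the same item.
            foreach (var file in files)
            {
                form.Add(new StreamContent(file.OpenRead()), "formFiles", file.Name);
            }

            return form;
        }
        catch
        {
            // Disposing the form also disposes any parts already added.
            form.Dispose();
            throw;
        }
    }
}
```

Enumerating the files once avoids pairing `files.ElementAt(i)` with a separately materialized list, and tying stream lifetime to the form means the streams are released even if a later file fails to open.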

9 changes: 7 additions & 2 deletions samples/apps/copilot-chat-app/importdocument/README.md
@@ -32,18 +32,23 @@ Importing documents enables Copilot Chat to have up-to-date knowledge of specifi
4. **Run** the following command to import a document to the app under the global document collection where
all users will have access to:

`dotnet run -- --file .\sample-docs\ms10k.txt`
`dotnet run --files .\sample-docs\ms10k.txt`

Or **Run** the following command to import a document to the app under a chat isolated document collection where
only the chat session will have access to:

`dotnet run -- --file .\sample-docs\ms10k.txt --chat-id [chatId]`
`dotnet run --files .\sample-docs\ms10k.txt --chat-id [chatId]`

> Note that this will open a browser window for you to sign in to retrieve your user id to make sure you have access to the chat session.
> Currently only supports txt and pdf files. A sample file is provided under ./sample-docs.
Importing may take some time to generate embeddings for each piece/chunk of a document.

To import multiple files, specify multiple files. For example:

`dotnet run --files .\sample-docs\ms10k.txt .\sample-docs\Microsoft-Responsible-AI-Standard-v2-General-Requirements.pdf`

5. Chat with the bot.

Examples: