Requirements
Functional Requirements
1. User Authentication and Authorization
- Implement OAuth2 for secure authentication using email and password.
- Single Sign-On (SSO) will be implemented once a service principal has been provisioned on Azure; email and password authentication is used initially to work around this blocker.
2. File Management
- Users can upload, view, and delete documents. Initially only .pdf and .docx formats are supported, with the potential to add other file types such as images.
- Support documents of 100-120 pages in a single upload; processing may need to be split into multiple batches.
- Ability to batch requests for larger documents, though this may result in slower and more expensive requests.
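The batching requirement above can be sketched as a simple page-range splitter. This is a minimal illustration only: the function name `page_batches` and the batch size of 50 pages are assumptions, not values from this specification, and the real limit would depend on the extraction service and its pricing.

```python
from typing import List, Tuple


def page_batches(total_pages: int, batch_size: int) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into inclusive (start, end) page ranges."""
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    batches: List[Tuple[int, int]] = []
    start = 1
    while start <= total_pages:
        # The last batch may be shorter than batch_size.
        end = min(start + batch_size - 1, total_pages)
        batches.append((start, end))
        start = end + 1
    return batches


if __name__ == "__main__":
    # Hypothetical batch size of 50 pages for a 120-page document.
    for first, last in page_batches(120, 50):
        print(f"process pages {first}-{last}")
```

Each range would then be sent as a separate request, which is where the extra latency and cost noted above come from.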
3. AI-Powered Chat Interface for Document Interaction
- Users can interact with the AI using a chat interface to ask questions and receive responses based on the uploaded documents.
- Two chat modes:
  - General questions, with responses based on pre-trained models.
  - Specific queries, where responses use data from the user's uploaded and vectorized documents.
- Users can view their chat history.
4. Prompt Engineering Guidance System
- Provide examples of prompts.
- Enable users to create and reuse their custom prompt templates.
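Reusable prompt templates could be modeled, for example, with the standard library's `string.Template`. The class `PromptTemplateStore` and the `$placeholder` syntax are assumptions made for this sketch; a real implementation would persist templates per user rather than keep them in memory.

```python
from string import Template
from typing import Dict


class PromptTemplateStore:
    """In-memory store of named prompt templates (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._templates: Dict[str, Template] = {}

    def save(self, name: str, text: str) -> None:
        """Create or overwrite a template; placeholders use the $name syntax."""
        self._templates[name] = Template(text)

    def render(self, name: str, **values: str) -> str:
        """Fill in a saved template.

        Raises KeyError if the template is unknown or a placeholder is missing.
        """
        return self._templates[name].substitute(**values)


if __name__ == "__main__":
    store = PromptTemplateStore()
    store.save("summarize", "Summarize $section of the document in $count bullet points.")
    print(store.render("summarize", section="chapter 2", count="3"))
```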
5. Data Isolation
- Isolate data per user, so that only the owning user can access their documents and chat data.
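One common way to enforce this kind of isolation is to scope every read by the owner's identifier at the data-access layer. The sketch below assumes hypothetical names (`Document`, `DocumentRepository`, `owner_id`) and an in-memory store; it is not a description of the actual storage design.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Document:
    doc_id: str
    owner_id: str
    name: str


class DocumentRepository:
    """Every read takes the requesting user's id and filters on ownership."""

    def __init__(self) -> None:
        self._docs: Dict[str, Document] = {}

    def add(self, doc: Document) -> None:
        self._docs[doc.doc_id] = doc

    def list_for_user(self, user_id: str) -> List[Document]:
        return [d for d in self._docs.values() if d.owner_id == user_id]

    def get(self, user_id: str, doc_id: str) -> Document:
        doc = self._docs.get(doc_id)
        # Raise the same error for "missing" and "owned by someone else",
        # so a caller cannot probe for the existence of other users' documents.
        if doc is None or doc.owner_id != user_id:
            raise KeyError(doc_id)
        return doc
```

Because callers never query the store without a user id, an isolation bug would require bypassing the repository rather than simply forgetting a filter.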
6. Admin Panel
- Basic functionalities to add, remove, and edit user profiles.
- Admins can manage user access (invite/revoke) and edit all user details.
- Visibility over all user interactions on the platform.
7. Integration with Azure Document Intelligence
- Utilize Azure Document Intelligence for document extraction and analysis.
8. Integration with Azure OpenAI and Alternative LLMs
- Initially integrate with Azure OpenAI.
- Design the system so that other pre-trained LLMs, such as Mistral, can be swapped in easily.
- Consider the cost and compliance impacts of hosting open-source models.
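Swappability of this kind is often achieved by having the application depend on a small interface rather than on a specific provider. The following is a sketch under that assumption; `ChatModel`, `EchoModel`, and `ChatService` are hypothetical names, and `EchoModel` is a stand-in that makes no network calls. A real adapter would wrap the Azure OpenAI or Mistral client behind the same `complete` method.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Minimal interface the rest of the system depends on."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in backend for illustration; it only echoes the prompt."""

    def __init__(self, label: str) -> None:
        self.label = label

    def complete(self, prompt: str) -> str:
        return f"[{self.label}] {prompt}"


class ChatService:
    """Application code sees only ChatModel, so backends can be swapped."""

    def __init__(self, model: ChatModel) -> None:
        self._model = model

    def ask(self, question: str) -> str:
        return self._model.complete(question)


if __name__ == "__main__":
    # Swapping providers is a one-line change at construction time.
    print(ChatService(EchoModel("azure-openai")).ask("What is in section 3?"))
    print(ChatService(EchoModel("mistral")).ask("What is in section 3?"))
```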
9. Integration with External Email Server
- Support for user registration, verification, and password reset functionalities.
10. Document and Model Updates on Deletion
- Ensure that when a document is deleted, the associated vectorized data and model knowledge are updated to reflect the removal.
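As an illustration, if document knowledge lives only in the vector store (an assumption; this specification does not say whether any model is fine-tuned on user documents), removing a document amounts to deleting every chunk indexed under its identifier. `VectorIndex` below is a hypothetical in-memory stand-in for whichever vector database is chosen.

```python
from typing import Dict, List, Tuple


class VectorIndex:
    """Hypothetical in-memory index keyed by document id."""

    def __init__(self) -> None:
        # doc_id -> list of (chunk text, embedding vector)
        self._chunks: Dict[str, List[Tuple[str, List[float]]]] = {}

    def add_chunk(self, doc_id: str, text: str, embedding: List[float]) -> None:
        self._chunks.setdefault(doc_id, []).append((text, embedding))

    def delete_document(self, doc_id: str) -> int:
        """Remove all chunks of a document; return how many were removed."""
        return len(self._chunks.pop(doc_id, []))

    def chunk_count(self) -> int:
        return sum(len(chunks) for chunks in self._chunks.values())
```

Storing the document id alongside every chunk at ingestion time is what makes this cascade possible; without it, deleted content could still surface in later answers.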
11. Encryption of Data
- All stored and transmitted data must be encrypted to ensure security and privacy.
12. Dashboard for Token and Cost Management
- Provide a dashboard to monitor and display API token consumption.
- Include features to estimate data ingestion and interaction costs prior to executing operations.
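A pre-execution cost estimate can be as simple as multiplying expected token counts by per-token prices. In the sketch below, the function name `estimate_cost` and the prices used in the example are placeholders, not actual Azure OpenAI rates; real prices vary by model and region and should come from configuration.

```python
def estimate_cost(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_price_per_1k: float,
    completion_price_per_1k: float,
) -> float:
    """Estimate the cost of one request from token counts and per-1,000-token prices."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    prompt_cost = prompt_tokens / 1000 * prompt_price_per_1k
    completion_cost = completion_tokens / 1000 * completion_price_per_1k
    return prompt_cost + completion_cost


if __name__ == "__main__":
    # Placeholder prices, for illustration only.
    cost = estimate_cost(2000, 500, prompt_price_per_1k=0.01, completion_price_per_1k=0.03)
    print(f"estimated cost: {cost:.4f}")
```

The same function can back both the pre-execution estimate and the dashboard totals, provided actual token counts are recorded after each request.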