15 Commits

Author SHA1 Message Date
Josako
2ca006d82c Added excluded element classes to HTML parsing to allow for more complex document parsing
Added chunking to conversion of HTML to markdown in case of large files
2024-08-22 16:41:13 +02:00
Josako
ab38dd7540 - Improvements for working with the cloud, MinIO, and Graylog, and some initial bug fixes 2024-08-13 09:04:19 +02:00
Josako
64cf8df3a9 - Improvements to enable deployment in the cloud, mainly changing file access to MinIO
- Improvements to RAG logging, and some debugging in that area
2024-08-01 17:35:54 +02:00
Josako
908a2eaf7e - Improve annotation algorithm for YouTube (and others)
- Patch Pytube
- Improve OS deletion of files and writing of files
- Start working on Claude
- Improve template management
2024-07-16 14:21:49 +02:00
Josako
ea0127b4b8 Improve algorithms for HTML and PDF processing 2024-07-08 15:20:45 +02:00
Josako
8e1dac0233 YouTube added - further checking required 2024-07-04 08:11:31 +02:00
Josako
be311c440b Improving chat functionality significantly throughout the application. 2024-06-12 11:07:18 +02:00
Josako
27b6de8734 Removing DocumentLanguage, as both System Context and User Context are to be defined on DocumentVersion level.
Fine-tuning of embedding workers.
2024-06-06 15:26:49 +02:00
Josako
61e1372dc8 Improvements to Document Interface and correcting embedding workers 2024-06-04 14:59:38 +02:00
Josako
6c2e99f467 Implement processing of HTML and improve both HTML & PDF processing given new tenant information. 2024-05-13 17:18:38 +02:00
Josako
011bdce38d Prepare for HTML document validation (added wanted tags to tenant) 2024-05-12 21:58:42 +02:00
Josako
a4bf837d67 Start working on chunking and embedding task. Continue with embeddings. 2024-05-08 22:40:55 +02:00
Josako
cd5afa0408 Refactoring finished :-)
eveai_workers now working (with errors ;-) )
Remote debugging now available
2024-05-07 22:51:48 +02:00
Josako
131c609e68 Refactoring part 2
Necessary changes to ensure correct working of eveai_app
2024-05-06 23:07:45 +02:00
Josako
8e5ad5f312 Refactoring part 1
Some changes for workers, but stopped due to refactoring
2024-05-06 21:30:07 +02:00