Week 7 Summary

This week we worked on integrating prompt testing via promptfoo as a GitHub Action. This means that every time the prompts are edited, a test script will run to check that their quality has not diminished. It will also let us measure improvements to the prompts more systematically.

For this to work as intended, we need to refactor our codebase so that all prompts are stored in a single directory. We started on this refactoring this week.
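
As a rough illustration of why a single prompt directory helps, the sketch below loads every prompt from one folder so that both the chatbots and the test script can read from the same place. The function name `load_prompts`, the `.txt` suffix, and the idea of one file per prompt are assumptions made for this example, not a description of our actual layout.

```python
from pathlib import Path


def load_prompts(prompt_dir: Path, suffix: str = ".txt") -> dict[str, str]:
    """Return {prompt name: prompt text} for every prompt file in prompt_dir.

    Hypothetical helper: assumes one prompt per file, named after the prompt.
    """
    prompts: dict[str, str] = {}
    # Sort so the result is deterministic across operating systems.
    for path in sorted(prompt_dir.glob(f"*{suffix}")):
        # The file name without its suffix serves as the prompt's name.
        prompts[path.stem] = path.read_text(encoding="utf-8").strip()
    return prompts
```

With a layout like this, a CI job only has to watch one directory for changes in order to decide when to rerun the prompt tests.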

We also worked on some minor UX improvements. For example, the links provided by our LangChain bot are now more nicely formatted, and we added further options to some of the chatbots' settings menus.

Overall, we are satisfied with our progress, even though some tasks remain unfinished. Next week we will start working on our final presentation and keep fixing various issues. We may also explore using a local model such as Llama 2 instead of an API-based model such as GPT-3/4, since a local model may be preferable in a healthcare context.