1.1. The problem of putting code to production

This post is a part of a series of texts aiming to discover and understand patterns and practices that would enable building a production-ready AI data infrastructure. The main focus is on how to evolve data modeling and retrieval in order to enable Large Language Model (LLM) apps and Agents to serve millions of users concurrently.

For a broad overview of the problem and our understanding of the current state of the LLM landscape, check out our previous post

infographic (2).png

In this text, we continue our inquiry into what would constitute:

  1. Proper data engineering methods for LLMs
  2. A production-ready generative AI data platform that unlocks AI assistants/Agent Networks

To explore these points, we here at prometh.ai have partnered with dlthub in order to productionize a common use case — complex PDF processing — progressing level by level.

In the previous text, we wrote a simple script that relies on the Weaviate Vector database to turn unstructured data into structured data and help us make sense of it.

In this post, some of the shortcomings from the previous level will be addressed, including::

  1. Containerization
  2. Data model
  3. Data contract
  4. Vector Database retrieval strategies
  5. LLM context and task generation
  6. Dynamic Agent behavior and Agent tooling