Building Scalable and Resilient LLMs for AI Applications: Machine Learning Architectures and Frameworks
- McCoyAle
- Apr 30
- 8 min read
In our previous article, "Getting started with ML.NET for Machine Learning: Building LLMs for AI Applications", we walked through a series of steps to analyze a dataset using the ML.NET CLI, generating ML.NET models and source code in preparation for consuming those models in your AI applications via .NET. Similar to creating a new recipe, building a new house, or designing a new dress, there's a design and craftsmanship aspect to determining the best solution, one that focuses on the requirements, architecture, and frameworks used to create a masterpiece. With machine learning, this way of thinking still holds true.
Within the context of machine learning, architecture refers to the structure and organization of the components and processes within a machine learning system. It acts as an outline for how the respective data is handled, how models are trained, and how predictions are made and monitored throughout the lifecycle. It focuses on the design of the system and the deployment of its models. During this stage, you'll focus on how the data pipelines, algorithms, and computational resources interact with one another to achieve a single outcome or a series of integrated outcomes.
Once there is an understanding of how those interactions should occur, a decision must be made about the set of tools, their respective libraries, and the pre-built structures that will be used to simplify the development process. This collection of tooling is referred to as the framework, and it allows development teams to focus on building and training models instead of complex implementation details and the best practices surrounding them.
Let's gain a better understanding of what this means at a lower level.
Key Components and Processes
In earlier articles, such as "Decommissioning AI Systems: Best Practices and Guidelines for Off-boarding Large Language Models and Infrastructure", we've had the opportunity to explore the AI lifecycle and the various phases within it, which act as guiding principles for integrating each stage of an AI system's lifecycle. You can review that article to gain insight into those phases. Let's revisit a few of the individual components that are critical to managing the end-to-end lifecycle of an AI system:
Data Sources: The origin of the data that is used to train and evaluate machine learning models.
Data Pipelines: The processes and tools that transition data from its origin to the models that need it.
Data Quality Management: Tools and processes that ensure data accuracy, completeness, and other established data compliance characteristics.
Training Processes: The methods and algorithms used to build and refine machine learning models.
ML Applications: The tools and systems that use trained models to generate insights and predictions. This is interchangeably used with AI applications.
Compute and Storage Infrastructure: The hardware and storage resources that host the ML system components.
Orchestration Tooling: Tools that manage the various components of the ML architecture and unify them into a coherent pipeline.
Each component plays a key role as data moves through each lifecycle phase within the system, ensuring the end user is able to accomplish the respective workflow.
Machine Learning Architecture
Within a microservice architecture, we are used to isolating specific components into layers that represent presentation (UI), application, domain, and infrastructure concerns, implemented through patterns such as MVC. With LLMs, there is not much of a difference outside of the "how" and the "what". A high-level example of the layers within an LLM system, from data source to monitoring, would consist of the components outlined above: data sources and pipelines, data quality management, training processes, the applications that consume the models, the compute and storage infrastructure, and the orchestration tooling that ties them together.

The goal of understanding the best architecture for your LLM is to ensure it is built in a way that accurately achieves the persona's desired outcome and is scalable, reliable, and efficient enough for future growth and enhancement. As an example, a model designed to enhance image recognition tasks might have an architecture that includes a data pipeline for ingesting images, a storage system for large image datasets, a training pipeline for training deep learning models, and a deployment pipeline for making predictions in real time.
Security, governance, observability, monitoring, and alerting are the considerations within the architecture that ensure a secure and resilient build, deployment, management, and sunset of the LLM when deprecation is desired. We can consider ML architecture as the blueprint for the entire ML system, while a framework provides the tools and structures to build and deploy models within that architecture.
Machine Learning Architectures in AI Systems
The best ML architecture is the one that solves the problem at hand. It's important to understand and utilize current architectures in order to better identify gaps, limitations, or opportunities for new architectures that solve complex and innovative new ideas. A key aspect of determining the appropriate architecture is understanding the best approach, and its associated algorithms, for your use case. A few of the common approaches include:
Supervised Learning: Models that learn from labeled data to make a prediction about something (a minimal sketch follows after this list).
In the healthcare field, these models might use patient data to predict a health disparity or flag a patient as being at risk.
As it relates to image recognition, these models use classification to group images that share common characteristics into categories.
Unsupervised Learning: These models discover patterns in unlabeled data.
Using criteria from a customer's habits and patterns to make recommendations; these are typically referred to as recommendation systems.
Ideal for anomaly detection, or identifying outliers. Think about the alert sent to you when there is abnormal behavior within your financial accounts.
Semi-Supervised Learning: Models learn from a mix of labeled and unlabeled data.
This is ideal for use cases where a limited amount of labeled input is used to predict an output, where generalizations need to be made, or where labeling is costly.
Reinforcement Learning: Models learn through trial and error, receiving rewards or penalties for their actions.
This is where "agents" that learn from their environments, and make a decision. IMO, this is at the lowest level, similar to what humans do. Use trial and error approaches to make decisions, learn from the outcome based on some reward signal, and use this to make better decisions moving forward.
I like to relate this to past psychology training, as it relates to Pavlov's theory surrounding "classical conditioning", which connects a neutral stimulus to a positive one.
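To ground the supervised approach described above, here is a minimal, illustrative sketch using scikit-learn's bundled iris dataset. It assumes scikit-learn is installed and is only a rough example, not tied to any specific use case from this article.

```python
# Supervised learning in miniature: labeled data in, a fitted classifier out.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Labeled data: each row of X has a known class in y.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The model fits the relationship between features and labels.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Predictions on unseen data are evaluated against the held-out labels.
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

An unsupervised approach would drop the labels entirely and instead look for structure in X alone, as the K-means example later in this article shows.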

When there is a firm understanding of the use case and the type of data that is used to construct the output to some prompt or request, it will help guide the decisions surrounding the best architectural and design patterns to use.
Examples of Model Architectures
Neural Networks: These are inspired by the structure of the human brain and are used for a wide variety of tasks.
Feedforward Neural Networks (FNNs): Data flows in one direction, from input to output.
Convolutional Neural Networks (CNNs): Specialized for processing image and video data.
Recurrent Neural Networks (RNNs): Designed to process sequential data, like text or time series.
Transformers: A powerful architecture for natural language processing and other tasks.
Decision Trees: Models that use a tree-like structure to make predictions.
Within the tree, internal nodes represent feature tests, branches represent decision rules, and leaf nodes contain the final predictions.
Support Vector Machines (SVMs): Supervised algorithms that find the boundary that best separates classes of data points.
K-means Clustering: An unsupervised learning algorithm for grouping unlabeled datasets into different clusters, implemented through the concepts of cluster centers (centroids) and cluster membership.
In these models, the number of clusters to form is selected first, and random datapoints are chosen as the initial cluster centers. The distance between each datapoint and each centroid is then calculated and used to assign the datapoint to its nearest centroid, and the centroid of each cluster is recomputed as the mean of all datapoints assigned to it, repeating until the assignments settle.
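To make that loop concrete, here is a bare-bones NumPy sketch of the steps just described; in practice you would typically reach for a library implementation such as scikit-learn's KMeans rather than rolling your own.

```python
# Minimal k-means: random initial centroids, assign points to the nearest
# centroid, recompute centroids as cluster means, repeat.
import numpy as np

def kmeans(X, k, iterations=100, seed=0):
    rng = np.random.default_rng(seed)
    # Select k random datapoints as the initial cluster centers.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iterations):
        # Distance from every point to every centroid, then nearest assignment.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points
        # (keep the old centroid if a cluster happens to be empty).
        centroids = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(k)
        ])
    return labels, centroids

X = np.random.default_rng(1).normal(size=(200, 2))
labels, centroids = kmeans(X, k=3)
print(centroids)
```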
Auto-encoders: Neural networks that learn to compress and reconstruct data. This architecture consists of three main parts:
Encoders - Responsible for transforming input data into what's referred to as the "latent space" or "encoding".
Bottleneck Layer - A hidden layer that represents the compressed coding of the input data.
Decoder - Responsible for taking the encoded representation from the bottleneck layer and expanding it back to the original input's dimensionality. In simplified terms, it reconstructs the input data as accurately as possible.
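As a rough illustration of that encoder / bottleneck / decoder structure, here is a minimal PyTorch sketch; the input dimensionality and layer sizes are arbitrary placeholders.

```python
# A tiny auto-encoder: compress the input into a bottleneck, then reconstruct it.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, bottleneck_dim=32):
        super().__init__()
        # Encoder: compresses the input into the latent "bottleneck" representation.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, bottleneck_dim),
        )
        # Decoder: expands the bottleneck back to the input's dimensionality.
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        latent = self.encoder(x)
        return self.decoder(latent)

model = AutoEncoder()
x = torch.randn(16, 784)             # a batch of stand-in inputs
loss = nn.MSELoss()(model(x), x)     # reconstruction error drives training
print(loss.item())
```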
Generative Adversarial Networks (GANs): Two neural networks compete to generate realistic data, used for unsupervised learning.
They typically consist of a generator and a discriminator, in the simplest case implemented as basic multi-layer perceptrons.
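A minimal sketch of those two competing multi-layer perceptrons might look like the following (definitions only; a full GAN alternates training updates between the two networks):

```python
# Generator vs. discriminator: the two adversaries of a GAN, as simple MLPs.
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784

# Generator: maps random noise to a synthetic data sample.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, data_dim), nn.Tanh(),
)

# Discriminator: scores how "real" a sample looks (1 = real, 0 = fake).
discriminator = nn.Sequential(
    nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

noise = torch.randn(8, latent_dim)
fake = generator(noise)
print(discriminator(fake).shape)  # one realism score per generated sample
```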
Retrieval-Augmented Generation (RAG): Used in Natural Language Processing (NLP); it pulls in external data at query time for real-time, grounded insights (a simplified sketch follows after this list).
Cache-Augmented Generation (CAG): Preloads knowledge, in contrast to RAG's on-the-fly retrieval, for faster and more efficient responses when integrating generative AI.
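To illustrate the RAG flow mentioned above, here is a simplified sketch. The embed() and generate() functions are hypothetical stand-ins for whatever embedding model and LLM endpoint your stack actually provides.

```python
# Retrieval-augmented generation in miniature: retrieve the most relevant
# documents for a query, then ground the prompt with them before generating.
import numpy as np

documents = [
    "Our return policy allows refunds within 30 days.",
    "Support hours are 9am to 5pm, Monday through Friday.",
]

def embed(text: str) -> np.ndarray:
    # Placeholder: in practice, call your embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=16)

def generate(prompt: str) -> str:
    # Placeholder: in practice, call your LLM here.
    return f"(model response to: {prompt[:60]}...)"

def rag_answer(query: str, top_k: int = 1) -> str:
    doc_vectors = np.stack([embed(d) for d in documents])
    q = embed(query)
    # Cosine similarity between the query and every document.
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = "\n".join(documents[i] for i in scores.argsort()[::-1][:top_k])
    return generate(f"Answer using this context:\n{context}\n\nQuestion: {query}")

print(rag_answer("How long do I have to return an item?"))
```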
Machine Learning Frameworks
This part of machine learning, at least for me, still needs to be fleshed out. It is common, for example, to see a neural network, modeled on the functioning of the human brain, described as a framework. However, it feels more naturally like a design pattern, with frameworks that fall within it.
As previously mentioned, frameworks typically include the tools and libraries that simplify the implementation of ML algorithms within your Artificial Intelligence (AI) system. You can expect them to function like a "toolkit", including pre-built functions for data preprocessing, model building, training, evaluation, and optimization. Their main focus is to accelerate the development of ML models by abstracting away the complexity of implementing algorithms. Your development and engineering teams' best friends.
Types of Machine Learning Frameworks
When it comes to assessing and using a specific framework, it's important to align the underlying dataset(s), architecture, and design patterns with a toolkit that can effectively and efficiently process and transform data into actionable insights and productive outputs (responses). Therefore, it's not ideal to simply suggest a framework, but rather to understand the business or use case requirements and suggest the appropriate solution. A few of the most commonly used machine learning frameworks within AI systems are:
PyTorch
PyTorch consists of an entire ecosystem, which includes solutions for the modeling, training, and optimization of your AI systems. It provides the necessary tools, packages, and algorithms to train models with various machine learning approaches, within different contextual settings. Meaning, if you are looking for federated or continuous learning, there's an option. If your packages and algorithms focus more on enhanced privacy techniques, then there are training options for that too.
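To give a rough sense of the workflow that ecosystem builds on, here is a minimal PyTorch training step; the model, data, and hyperparameters are stand-ins, not a recommendation for any particular use case.

```python
# The core PyTorch loop: define a model, a loss, an optimizer, then iterate.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X, y = torch.randn(256, 10), torch.randn(256, 1)   # stand-in training data

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                                # compute gradients
    optimizer.step()                               # update parameters
    print(f"epoch {epoch}: loss={loss.item():.4f}")
```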
TensorFlow
TensorFlow is an end-to-end platform for machine learning, whose ecosystem includes a repository of models ready for fine-tuning and datasets to abstract and analyze data with. If you are looking to probe trained machine learning models code-free, there's a great "What-If" tool available within their catalog.
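For a feel of the typical TensorFlow/Keras workflow, here is a minimal build-compile-fit sketch with stand-in data; it assumes the tensorflow package is installed.

```python
# Build, compile, and fit a small Keras model on synthetic data.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

X = np.random.rand(256, 10).astype("float32")      # stand-in features
y = np.random.randint(0, 2, size=(256, 1))         # stand-in binary labels
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))             # [loss, accuracy]
```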
Scikit-Learn
If your use case is specific to supervised or unsupervised learning, scikit-learn may be a great option to explore. Deep learning and reinforcement learning are out of scope for this product, but it does offer multilayer perceptron models in support of supervised and unsupervised neural network use cases.
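Since scikit-learn's multilayer perceptron support is called out above, here is a minimal sketch using MLPClassifier on a synthetic dataset.

```python
# A small supervised neural network in scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))
```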
Apache Spark
For use case requirements that demand enhanced performance on SQL queries, batching, streaming, or scaling ML algorithm training from resource-limited hardware out to multi-node machines or clusters, Spark is a great solution for large-scale data analytics oriented tasks.
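As a small illustration of Spark's ML tooling, here is a minimal PySpark MLlib sketch that trains a simple model locally; the dataset is a tiny stand-in for what would normally be a large distributed table.

```python
# Train a logistic regression with Spark MLlib on a toy DataFrame.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.master("local[*]").appName("mllib-sketch").getOrCreate()

# A tiny stand-in dataset; in practice this would be distributed across a cluster.
df = spark.createDataFrame(
    [(0.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.5, 0.5, 1.0), (0.1, 0.9, 0.0)],
    ["feature_a", "feature_b", "label"],
)

# Assemble feature columns into the vector column MLlib expects, then fit.
assembler = VectorAssembler(inputCols=["feature_a", "feature_b"], outputCol="features")
model = LogisticRegression(featuresCol="features", labelCol="label").fit(assembler.transform(df))
print(model.coefficients)
spark.stop()
```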
Caffe
Specific to deep learning frameworks, Caffe was developed by Berkeley AI Research and provides the ability to move model training between CPU and GPU hardware, among other capabilities that limit hard-coding.
MATLAB
A simplified solution where classification of data can be done through its Classification Learner and Regression Learner apps, which train models to classify data and train regression models to predict data, respectively.
The implementation of the design and architectural decisions made prior to building your AI or ML solution is critical to the user experience. It's important to assess the type of data to ensure the right architectural classification or type is selected, in combination with choosing the right architectural pattern and frameworks to use during your build and deploy phases. We can only accomplish this through efficiently capturing requirements and aligning them with the right set of tooling and processes.
Final Thoughts
We are currently at a period where innovation is fast paced and local culture is shifting towards what has been termed "The Age of Insights". Where developing actionable insights is key to long-term management and scalability, it's important to understand how we can build machine learning models that are easy to maintain, in scalable and secure ways.
With a market filled with solutions and other products centered around doing so, it's important to understand what this means for your business requirements and use cases, in order to support the needs of your target audience. With governance structures continuing to be established and enhanced, and security concerns related to the handling of data always present, it's important to ensure the relevant practices are used in alignment with the tooling used to implement them.