Fun Project: VMRecommender

Optimize Your Virtual Infrastructure with AI-Powered Recommendations

Getting started

The design and deployment of robust and efficient network infrastructure is a critical aspect of modern IT operations. Traditionally, this process has been time-consuming and resource-intensive, requiring manual configuration and a deep understanding of networking principles. As businesses increasingly rely on complex and dynamic network architectures, the need for automated and flexible solutions becomes ever more pressing.

Image generated on playground.ai

This project addresses this challenge with a dynamic network diagram generator powered by Retrieval Augmented Generation (RAG). The system uses a language model to automate parts of the network design process, with the aim of reducing manual effort and shortening deployment times. At its core lies a database of virtual machine (VM) types, each with detailed specifications and capabilities.

Given customer specifications as input, the RAG-based system retrieves relevant VM information from the database and generates a network diagram. This covers not only the allocation of VMs but also details such as VLAN configuration and specific VM type selections. The goal is an automated, flexible solution that can tailor network diagrams to diverse customer needs.

Let’s get our hands dirty

Problem statement

The challenge lies in bridging the gap between user needs and available machine configurations. Users often express their requirements in natural language, without any knowledge of the specific machine types or configurations available. This presents a significant obstacle in automating the process of translating user descriptions into concrete technical setups.

Solution

Our objective is to develop a robust Retrieval Augmented Generation (RAG) pipeline capable of effectively addressing this challenge. This pipeline will take unstructured user descriptions as input and leverage a database of pre-defined machine configurations. By intelligently retrieving relevant information from the database and employing generative capabilities, the RAG system will generate a set of configurations that best match the user’s described needs.

Image generated on playground.ai

This project aims to create a system that can accurately interpret user requirements, even when expressed in vague or imprecise language, and translate them into actionable machine configurations. This will significantly streamline the process of provisioning resources and empower users to easily access the computing power they need, regardless of their technical expertise.
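To make the retrieve-then-generate flow concrete, here is a toy sketch of the idea. It is not the project code: word overlap stands in for embedding similarity, the generation step only builds the prompt, and the `CATALOG` entries and function names are hypothetical.

```python
# Toy retrieve-then-generate flow. Word overlap stands in for embedding
# similarity, and "generation" stops at building the prompt for the model.
# CATALOG and its entries are hypothetical examples, not the project's data.

CATALOG = [
    "Ubuntu web server image with Nginx, min 2 cores, 4 GB RAM",
    "Windows Server image with SQL Server, min 4 cores, 16 GB RAM",
    "Debian firewall appliance with iptables, min 1 core, 2 GB RAM",
]


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().replace(",", " ").split())


def retrieve(query, documents, k=2):
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, retrieved):
    """Combine the user question and the retrieved entries into one prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Question: {query}\n\nContext:\n{context}\n\nAnswer:"


query = "I need a web server with Nginx"
top = retrieve(query, CATALOG, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In the real pipeline, the retrieval step is handled by FAISS over sentence embeddings and the prompt is passed to the language model.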

Code Implementation

In this section, I focus on the technologies I used and explain the key components of the implementation. The code is listed below.

For the technology stack, I chose FAISS and TinyLlama.

  • FAISS (Facebook AI Similarity Search) is a good fit for the retrieval stage because it performs similarity searches very efficiently. Its ability to quickly identify the most relevant machine configurations in the database from the user’s description is crucial for the system’s responsiveness and accuracy.

  • TinyLlama is an open-source language model with about 1.1 billion parameters, small enough to run inference on CPUs. I chose it for cost-effectiveness and accessibility: its compact size lets me deploy a functional RAG system without relying on expensive hardware, which makes it a practical solution.
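As a rough illustration of what a similarity search computes, the sketch below does in pure Python what a flat (exhaustive) L2 index does conceptually: measure the squared Euclidean distance from the query to every stored vector and keep the k smallest. The vectors and function names are made up for the example; FAISS itself does this with optimized code and also offers approximate index types.

```python
# Pure-Python sketch of the exhaustive search behind a flat L2 index.
# The vectors below are toy values, not real embeddings.

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(index_vectors, query, k=2):
    """Return (distances, ids) of the k stored vectors closest to the query."""
    distances = [(squared_l2(vec, query), i) for i, vec in enumerate(index_vectors)]
    distances.sort()  # smallest distance first
    top = distances[:k]
    return [d for d, _ in top], [i for _, i in top]


vectors = [
    [0.0, 0.0],
    [1.0, 1.0],
    [5.0, 5.0],
]
scores, ids = flat_l2_search(vectors, query=[0.9, 1.2], k=2)
print(scores, ids)
```

Note that with an L2 metric a smaller score means a closer match, which matters when ranking the retrieved rows.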

To implement the FAISS retrieval, I leveraged the datasets library. The code is as follows:

from datasets import load_dataset, Dataset
import pandas as pd
from transformers import AutoTokenizer, AutoModel
import torch

device = torch.device("cpu")

class FAISS:
    def __init__(self, config, data_dir, **kwargs): 

        self.config = config
        model_id = self.config['model_id']
        self.tokenizer = AutoTokenizer.from_pretrained(model_id)
        self.model = AutoModel.from_pretrained(model_id).to(device)
        self.data_dir = data_dir

    def query(self, input, k:int=5, return_pd:bool=True): 
        csv_dataset = self._preparing_csv_dataset(self.data_dir)
        embedding_dataset = self.get_embedding_dataset(csv_dataset)
        question_embedding = self.get_embeddings([input]).detach().numpy()
        embedding_dataset.add_faiss_index(column="embeddings")
        scores, samples = embedding_dataset.get_nearest_examples(
            "embeddings", question_embedding, k=k
        )
        if return_pd: 
            samples_df = pd.DataFrame.from_dict(samples)
            samples_df["scores"] = scores
            # With the default L2 index, a lower score means a closer match
            samples_df.sort_values("scores", ascending=True, inplace=True)
            return samples_df
        else: 
            return {"samples": samples, 
                    "scores": scores}

    def _preparing_csv_dataset(self, dir:str, 
                            keep_column:list[str]=['Platform', 'Distributor', 'Description', 'Min_CPU', 'Min_RAM_GB', 'Min_Storage_GB', 'Installed_Software', 'Image_Version']
                            )-> pd.DataFrame: 
        if not dir.endswith('.csv'): 
            raise ValueError(f"Expected a path to a .csv file, got: {dir}")
        dataset = load_dataset('csv', data_files=dir)
        columns = dataset.column_names
        # Drop every column of the CSV that is not listed in keep_column
        columns_to_remove = set(columns['train']) - set(keep_column)
        issues_dataset = dataset.remove_columns(list(columns_to_remove))
        issues_dataset.set_format(type="pandas")
        df = issues_dataset['train'][:]
        return df

    def _concatenate_text(self, examples):
        return {
            "text": examples["Platform"] + " " + 
                examples["Distributor"] + " " + 
                examples["Description"] + " " + 
                "Min CPU: " + str(examples["Min_CPU"]) + " cores " + 
                "Min RAM: " + str(examples["Min_RAM_GB"]) + " GB " + 
                "Min Storage: " + str(examples["Min_Storage_GB"]) + " GB " +
                "Installed Software: " + examples["Installed_Software"] + " " + 
                "Image Version: " + examples["Image_Version"]
        }

    def cls_pooling(self, model_output):
        return model_output.last_hidden_state[:, 0]

    def get_embeddings(self, text_list):
        encoded_input = self.tokenizer(
            text_list, padding=True, truncation=True, return_tensors="pt"
        )
        encoded_input = {k: v.to(device) for k, v in encoded_input.items()}
        # Inference only: skip gradient tracking to save memory
        with torch.no_grad():
            model_output = self.model(**encoded_input)
        return self.cls_pooling(model_output)

    def get_embedding_dataset(self, df: pd.DataFrame): 
        comments_dataset = Dataset.from_pandas(df)
        comments_dataset = comments_dataset.map(self._concatenate_text)

        # Store each embedding as a NumPy array so FAISS can index the column
        embedding_dataset = comments_dataset.map(
            lambda x: {"embeddings": self.get_embeddings(x["text"]).detach().cpu().numpy()[0]}
        )
        return embedding_dataset
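To see what actually gets embedded, the sketch below applies the same concatenation as `_concatenate_text` to a single row. It is a standalone copy for illustration, and the row values are hypothetical, not taken from the project's CSV.

```python
# Standalone copy of the concatenation step, applied to one hypothetical row.

def concatenate_text(row):
    """Flatten one VM record into the single string that is embedded."""
    return (
        row["Platform"] + " "
        + row["Distributor"] + " "
        + row["Description"] + " "
        + "Min CPU: " + str(row["Min_CPU"]) + " cores "
        + "Min RAM: " + str(row["Min_RAM_GB"]) + " GB "
        + "Min Storage: " + str(row["Min_Storage_GB"]) + " GB "
        + "Installed Software: " + row["Installed_Software"] + " "
        + "Image Version: " + row["Image_Version"]
    )


# Made-up example record using the column names from the class above
example_row = {
    "Platform": "Linux",
    "Distributor": "Ubuntu",
    "Description": "General-purpose web server",
    "Min_CPU": 2,
    "Min_RAM_GB": 4,
    "Min_Storage_GB": 40,
    "Installed_Software": "Nginx",
    "Image_Version": "22.04",
}

text = concatenate_text(example_row)
print(text)
```

Each row of the database becomes one such sentence, and the user's free-form description is compared against the embeddings of these sentences.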

To implement the TinyLlama agent, which I call TinyAgent, I wrote the following:

from typing import Union
from transformers import pipeline

PROMPT_TEMPLATE = """
Your task is to design and implement a workflow that generates a network diagram featuring
virtual machines (VMs) based on customer specifications. The workflow should use a database
of VM types, detailing their descriptions, minimum specifications, and installed software 
that represents the network diagram, including VLANs and the specified
machines.

Question: {question} 

Context: {context} 

If you do not have any related information, please answer you don't know!

Answer:

"""

class TinyAgent: 
    def __init__(self, config): 
        self.config = config
        # Load the model once; rebuilding the pipeline on every query is slow
        self.pipe = pipeline(task=self.config['task'], 
                             model=self.config['model_id'])
        self.messages = [
            {"role": "system", 
             "content": "You are a chatbot that specializes in recommending appropriate virtual machine setting for client."}
        ]

    def query(self, input, context:Union[str, None]=None, is_text:bool=True):
        if context is None: 
            context = "There is no additional information."
        input = PROMPT_TEMPLATE.format(question=input, 
                                        context=context)

        # The history grows with every call, so earlier turns stay in the prompt
        self.messages.append({"role": "user", 
                              "content": input})

        prompt = self.pipe.tokenizer.apply_chat_template(
            self.messages, tokenize=False, add_generation_prompt=True
        )

        outputs = self.pipe(prompt,
            **self.config['settings']
        )

        if is_text: 
            # By default, generated_text holds the prompt followed by the reply
            return outputs[0]['generated_text']

        return outputs
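To show what the chat template contributes, here is a hand-written approximation of the Zephyr-style markup that, as far as I know, TinyLlama's chat checkpoints use. This is an assumption for illustration only: the real template ships with the tokenizer and is applied by `apply_chat_template`, and the `format_chat` and `extract_reply` helpers are hypothetical names.

```python
# Hand-written approximation of a Zephyr-style chat format. This is an
# assumption for illustration; the real template comes with the tokenizer.
EOS = "</s>"


def format_chat(messages, add_generation_prompt=True):
    """Render role-tagged messages into a single prompt string."""
    parts = [f"<|{m['role']}|>\n{m['content']}{EOS}\n" for m in messages]
    if add_generation_prompt:
        # An open assistant turn tells the model to start answering
        parts.append("<|assistant|>\n")
    return "".join(parts)


def extract_reply(generated_text):
    """Keep only the text that follows the last assistant marker."""
    return generated_text.split("<|assistant|>")[-1].strip()


messages = [
    {"role": "system", "content": "You recommend virtual machine settings."},
    {"role": "user", "content": "I need a small web server."},
]
prompt = format_chat(messages)

# Text-generation pipelines return prompt + completion by default,
# so a fake completion is appended here to mimic that output.
fake_output = prompt + "A 2-core VM with 4 GB RAM should be enough."
print(extract_reply(fake_output))
```

Because the generated text includes the prompt, the reply has to be cut out after the final `<|assistant|>` marker.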

Let’s run the pipeline end to end and check whether it works properly:

from vmrec.faiss import FAISS
from vmrec.agent import TinyAgent
from vmrec.utils import yaml_load


config = yaml_load(dir='./config.yaml')

faiss_storing = FAISS(config=config['faiss_settings'], 
                      data_dir='./___.csv')

llm = TinyAgent(config=config['chat_model'])


if __name__ == "__main__": 
    user_query = "I want a network diagram that illustrates how our on-premises infrastructure connects to our cloud services. It should show how our servers communicate with our cloud-based applications and databases, including any VPNs or dedicated connections. The diagram should also highlight any security measures we have in place for cloud access, like multi-factor authentication or cloud-based firewalls. Finally, I want to understand how our cloud provider's network architecture integrates with our own network."
    context = faiss_storing.query(input=user_query)

    # context is a pd.DataFrame; join the retrieved rows into a single string
    additional_info = ""
    for _, row in context.iterrows():
        additional_info += row["text"] + "\n"

    # Pass the joined text, not the DataFrame, as the prompt context
    result = llm.query(input=user_query, 
                       context=additional_info)    

    # Extract information from the result
    result = result.split("<|assistant|>")[-1]
    print(result)

"""
To generate a network diagram using virtual machines (VMs) with their corresponding network settings, we would:

1. Collect all the virtual machines of interest from our cloud provider. 

2. Create a database for storing VM types, descriptions, minimum specifications, and installed software.

3. Develop a Python script to extract the network information from each VM using the available APIs.

4. Use Python libraries like NetworkX to represent the network diagram using VLANs and network configurations.

5. Create high-quality network diagrams using the network representation we have built.


In this diagram, we have a standard network topography with the virtual machines in different VLANs, and VLAN 25, which represents the internal WAN network, is connected to VLAN 13 for internal traffic. The routers associated with each VLAN and the physical network interface cards of each VM are also included for easier visualization. The firewall in our solution is represented in the diagram by VLAN 5000.

Hope this helps! Let me know if you have any other questions.
"""

To be honest, the result is not exactly what I had imagined. I believe one of the biggest problems is matching the user’s description to the configurations in the dataset. In addition, I want the result to be more structured, like a Python dictionary whose keys name the fields (hostname, gateway, netmask, etc.) and whose values hold the corresponding details stored in the database. To get there, I could define the output format through prompting techniques, for instance by providing several exemplars following the Chain-of-Thought approach.
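One possible way to get structured output is sketched below, using plain few-shot exemplars (not full chain-of-thought traces) that show the model a JSON answer, plus a parser that rejects replies without the expected keys. The exemplar, the key set, and the function names are hypothetical; I have not tested this with TinyLlama.

```python
import json

# Keys mentioned above; the exact schema is an assumption for illustration.
REQUIRED_KEYS = {"hostname", "gateway", "netmask"}

# Hypothetical exemplar; the values are made up.
EXEMPLARS = [
    {
        "request": "A small Linux web server on the internal network.",
        "answer": {"hostname": "web-01", "gateway": "10.0.0.1", "netmask": "255.255.255.0"},
    },
]


def build_few_shot_prompt(question, context):
    """Build a prompt that demonstrates the expected JSON answer format."""
    lines = [
        "Answer with a single JSON object using exactly these keys: "
        + ", ".join(sorted(REQUIRED_KEYS)) + "."
    ]
    for ex in EXEMPLARS:
        lines.append(f"Request: {ex['request']}")
        lines.append(f"Answer: {json.dumps(ex['answer'])}")
    lines.append(f"Context: {context}")
    lines.append(f"Request: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


def parse_answer(raw):
    """Return the parsed dict, or None if the reply lacks valid JSON or keys."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    return data


reply = 'Sure! {"hostname": "db-01", "gateway": "10.0.1.1", "netmask": "255.255.255.0"}'
print(parse_answer(reply))
print(parse_answer("I don't know"))
```

Returning `None` for malformed replies makes it easy to retry the query or fall back to the raw text.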


Thank you for reading this article; I hope it added something to your knowledge bank! Just before you leave:


👉 The full implementation is available in this GitHub repository



Conclusion

This project offers several advantages. Primarily, the use of TinyLlama for CPU-based inference makes the system accessible to a wider audience, removing the barrier of GPU dependency and associated costs. FAISS’s efficiency in similarity search enables rapid retrieval of relevant configurations, leading to faster response times and a more interactive user experience. Moreover, the RAG approach allows for customization and flexibility: the system can be adapted to other domains and use cases by updating the machine configuration database and, if needed, fine-tuning the language model.

However, some disadvantages must be considered. TinyLlama’s smaller size compared to larger language models may limit its ability to fully grasp complex user descriptions, potentially leading to less accurate configuration suggestions. The system’s performance is also reliant on the quality and comprehensiveness of the machine configuration database, which requires ongoing maintenance to ensure accuracy and avoid bias. Finally, the need to keep the database up-to-date with new machine types and specifications adds to the project’s maintenance overhead.