
Python AI Stack 2024: A Roundup of Essential Tools and Libraries

2024 saw an explosion of AI tooling in Python. This article rounds up the essential libraries and tools every AI developer should know, from LLM orchestration to vector databases.


LLM Orchestration Frameworks

LangChain

The most popular framework for building LLM applications:

from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

# Basic usage
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant specialized in {topic}"),
    ("user", "{question}")
])

chain = prompt | llm
result = chain.invoke({"topic": "Python", "question": "Explain decorators"})

# With memory (attach to a chain that supports it, e.g. ConversationChain)
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()

# Agents with tools
from langchain import hub
from langchain.agents import create_react_agent
from langchain_community.tools import DuckDuckGoSearchRun

tools = [DuckDuckGoSearchRun()]
# create_react_agent expects a ReAct-style prompt (with an agent_scratchpad
# placeholder), not the chat prompt defined above
react_prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, tools, react_prompt)

When to use: when you need flexibility, many integrations, or complex chains

LlamaIndex

Optimized for RAG (Retrieval-Augmented Generation):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

# Load documents
documents = SimpleDirectoryReader("data/").load_data()

# Create index
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is the company's refund policy?")
print(response)

# Advanced: Custom LLM
llm = OpenAI(model="gpt-4-turbo", temperature=0)
query_engine = index.as_query_engine(llm=llm)

When to use: document Q&A, knowledge bases, RAG systems
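
Note that the snippet above rebuilds the index, re-embedding every document, on each run. A small sketch of persisting the index to disk and loading it back, using LlamaIndex's default local storage:

# Persist the index so documents aren't re-embedded on every run
index.storage_context.persist(persist_dir="./storage")

# Later: reload it instead of rebuilding
from llama_index.core import StorageContext, load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)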

Haystack

Production-ready NLP framework:

from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

document_store = InMemoryDocumentStore()  # write your Documents into this first

template = "Context: {% for doc in documents %}{{ doc.content }} {% endfor %}\nQuestion: {{ query }}"

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("generator", OpenAIGenerator())
# The retriever outputs documents while the generator expects a prompt
# string, so a PromptBuilder sits between them
pipe.connect("retriever.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "generator.prompt")

query = "What is RAG?"
result = pipe.run({"retriever": {"query": query}, "prompt_builder": {"query": query}})

Vector Databases

ChromaDB

import chromadb
from chromadb.utils import embedding_functions

# Create client
client = chromadb.Client()

# Create a collection with OpenAI embeddings
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="your-key",
    model_name="text-embedding-3-small"
)

collection = client.create_collection(
    name="documents",
    embedding_function=openai_ef
)

# Add documents
collection.add(
    documents=["Document 1 content", "Document 2 content"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
    ids=["doc1", "doc2"]
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=3
)
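
Note that chromadb.Client() is purely in-memory, so data disappears when the process exits. For anything you want to keep, use the persistent client instead; a short sketch:

# Persistent client: data is stored on disk and survives restarts
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection(
    name="documents",
    embedding_function=openai_ef
)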

Pinecone

from pinecone import Pinecone

pc = Pinecone(api_key="your-key")
index = pc.Index("my-index")

# Upsert
index.upsert(vectors=[
    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"text": "..."}}
])

# Query
results = index.query(vector=[0.1, 0.2, ...], top_k=3, include_metadata=True)
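
The snippet above assumes the index already exists; a sketch of creating a serverless index first (the cloud and region values are placeholders, and dimension must match your embedding model's output size, e.g. 1536 for text-embedding-3-small):

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-key")
pc.create_index(
    name="my-index",
    dimension=1536,  # must match the embedding model's output size
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)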

Vector DB Comparison

Database   Type                 Best for                     Pricing
ChromaDB   Embedded             Prototyping, small projects  Free
Pinecone   Managed              Production, scale            $$$
Weaviate   Self-hosted/Managed  Flexibility                  Free/$$
Qdrant     Self-hosted          Performance                  Free
pgvector   Postgres extension   Existing Postgres            Free
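
pgvector is worth highlighting if you already run Postgres, since it needs no extra infrastructure. A minimal sketch using psycopg (the table and the 3-dimensional toy vectors are just for illustration):

import psycopg

with psycopg.connect("dbname=mydb") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute("CREATE TABLE IF NOT EXISTS items (id serial PRIMARY KEY, embedding vector(3))")
    conn.execute("INSERT INTO items (embedding) VALUES ('[0.1, 0.2, 0.3]')")
    # Nearest-neighbor search: <-> is pgvector's L2 distance operator
    row = conn.execute(
        "SELECT id FROM items ORDER BY embedding <-> '[0.1, 0.2, 0.25]' LIMIT 1"
    ).fetchone()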

Model Inference

Transformers (Hugging Face)

from transformers import pipeline

# Text generation
# Note: Llama 3 is a gated model; request access on Hugging Face first
generator = pipeline("text-generation", model="meta-llama/Meta-Llama-3-8B")
result = generator("Hello, I am", max_new_tokens=50)

# Embeddings
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(["Hello world", "How are you"])

# Classification
classifier = pipeline("sentiment-analysis")
result = classifier("I love this product!")

vLLM – Fast Inference Server

# Start server
vllm serve meta-llama/Meta-Llama-3-8B --port 8000

# Client
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

response = client.chat.completions.create(
    model="meta-llama/Llama-3-8B",
    messages=[{"role": "user", "content": "Hello!"}]
)
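
Besides the OpenAI-compatible server, vLLM can also be used directly as a library for offline batch inference; a short sketch:

from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3-8B")
params = SamplingParams(temperature=0.8, max_tokens=50)
outputs = llm.generate(["Hello, I am", "The capital of France is"], params)
for output in outputs:
    print(output.outputs[0].text)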

Ollama – Local LLMs

# Install & run
ollama run llama3

# Python client
import ollama
response = ollama.chat(model='llama3', messages=[
    {'role': 'user', 'content': 'Hello!'}
])
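
For chat-style UIs you usually want tokens as they arrive rather than the full response at once; the Python client supports this with stream=True:

# Stream the response chunk by chunk
for chunk in ollama.chat(
    model='llama3',
    messages=[{'role': 'user', 'content': 'Hello!'}],
    stream=True,
):
    print(chunk['message']['content'], end='', flush=True)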

Observability & Monitoring

LangSmith

import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-key"

# LangChain calls are now traced automatically

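Tracing is not limited to LangChain internals; the langsmith package also exposes a traceable decorator for instrumenting your own functions. A minimal sketch:

from langsmith import traceable
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-turbo")

@traceable  # each call to this function is recorded as a run in LangSmith
def summarize(text: str) -> str:
    return llm.invoke(f"Summarize in one sentence: {text}").content
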
Weights & Biases

import wandb
wandb.init(project="llm-experiment")

# Log metrics
wandb.log({"accuracy": 0.95, "loss": 0.05})

# Log LLM calls
wandb.log({"prompt": prompt, "response": response, "tokens": token_count})

MLflow

import mlflow

with mlflow.start_run():
    mlflow.log_param("model", "gpt-4")
    mlflow.log_metric("latency", 1.5)
    mlflow.log_artifact("model_config.json")

Data Processing

unstructured – Document parsing

from unstructured.partition.pdf import partition_pdf
# Parallel partitioners exist for other formats, e.g.:
# from unstructured.partition.docx import partition_docx

elements = partition_pdf("document.pdf")
for element in elements:
    print(type(element).__name__, element.text[:100])
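
Raw elements are usually merged into retrieval-sized chunks before embedding; unstructured ships chunking helpers for this, sketched here with chunk_by_title:

from unstructured.chunking.title import chunk_by_title

# Group elements into section-based chunks suitable for embedding
chunks = chunk_by_title(elements, max_characters=1000)
for chunk in chunks:
    print(len(chunk.text))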

tiktoken – Token counting

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
tokens = enc.encode("Hello, world!")
print(f"Token count: {len(tokens)}")  # Token count: 4

Development Tools

Instructor – Structured outputs

import instructor
from pydantic import BaseModel
from openai import OpenAI

client = instructor.patch(OpenAI())

class User(BaseModel):
    name: str
    age: int
    email: str

user = client.chat.completions.create(
    model="gpt-4-turbo",
    response_model=User,
    messages=[{"role": "user", "content": "Extract: John is 30, email john@example.com"}]
)
# user.name = "John", user.age = 30, user.email = "john@example.com"

Guidance – Constrained generation

from guidance import models, select

# Load a local model (the model name here is just an example)
lm = models.Transformers("microsoft/Phi-3-mini-4k-instruct")

# Generation is constrained to one of the listed options
lm += "The answer is " + select(['yes', 'no', 'maybe'], name='answer')
print(lm['answer'])

Recommended Stacks for 2024

Beginner Stack

  • OpenAI API + LangChain
  • ChromaDB (vector store)
  • Streamlit (UI)

Production Stack

  • LangChain + LlamaIndex
  • Pinecone or Qdrant
  • LangSmith (monitoring)
  • FastAPI (backend)

Cost-optimized Stack

  • Ollama + Llama 3 (local)
  • ChromaDB (embedded)
  • Gradio (UI)

Fullstack Station Tips

The Python AI stack matured considerably in 2024. My advice:

  • Start with LangChain: there is a learning curve, but it is worth the investment
  • Prototype with ChromaDB: then migrate to a production DB
  • Run local models with Ollama: test for free before paying for an API
  • Monitor from day one: LangSmith or custom logging
  • Use Instructor for structured output: very useful for data extraction

