Berlin DataTalksClub Group cover photo

Part of DataTalks.Club Network - 2 groups

Berlin DataTalksClub Group

4.3•596 ratings

Share

About us

A group for experienced and aspiring data professionals.
Join our Slack: https://datatalks.club/slack.html

Upcoming events

7

Build Your First RAG Application with LLMs
Mon, May 11 · 12:00 PM CEST
·
Online
Online
This is the 1st workshop in our series to update the LLM Zoomcamp content.

This workshop updates Module 1: Introduction to LLMs and RAG.

In this hands-on session, Alexey Grigorev will show how to build a basic Retrieval-Augmented Generation pipeline for answering questions about course FAQ documents.

You’ll index FAQ documents from the Zoomcamp courses, retrieve relevant entries, and use the OpenAI API to generate answers based on the retrieved context.

What you’ll learn:
- What LLMs are and how they are used in question-answering systems
- What Retrieval-Augmented Generation is and why it’s useful
- How a basic RAG architecture works
- How to prepare a Python environment for an LLM application
- How to index FAQ documents from Zoomcamp courses
- How to implement keyword search with MinSearch
- How to build prompts with retrieved context
- How to generate answers with the OpenAI API
- How to refactor the RAG pipeline into modular code
- How to replace MinSearch with Elasticsearch for a more realistic retrieval setup
- How to run Elasticsearch with Docker and search indexed documents
By the end, you’ll have a working RAG pipeline that answers questions using FAQ documents from Data Engineering Zoomcamp, Machine Learning Zoomcamp, and MLOps Zoomcamp.
Like the other workshops, this will be a live demo with practical tips and time for Q&A.

***

All events in these series:
***

## Thinking about Joining LLM Zoomcamp?

This workshop covers the updated content for Module 1 of the LLM Zoomcamp, our free course on building practical LLM applications with RAG, vector search, evaluation, monitoring, and AI agents.
You start with a simple RAG pipeline, then improve it with better retrieval, semantic search, function calling, evaluation, monitoring, and production practices.
The course covers the full lifecycle of an LLM application: from the first working prototype to evaluation, monitoring, and a complete final project.
The new cohort of LLM Zoomcamp starts on June 8, 2026. You can join it by registering here.

##

## About the Speaker

Alexey Grigorev is the Founder of DataTalks.Club and creator of the Zoomcamp series.
Alexey is a software and ML engineer with over 10 years in engineering and 6+ years in machine learning. He has deployed large-scale ML systems at companies like OLX Group and Simplaex, authored several technical books, including Machine Learning Bookcamp, and is a Kaggle Master with a 1st place finish in the NIPS’17 Criteo Challenge.

Join our Slack: https://datatalks.club/slack.html
76 attendees
From Notebook to Production: Building End-to-End AI Systems
Tue, May 12 · 12:30 PM CEST
·
Online
Online
In this episode, we talk with Mariano Semelman about what it takes to build end-to-end AI solutions in production, especially in messy, high-scale e-commerce environments where a good model alone is not enough.
Mariano is a Lead Data Scientist and Machine Learning Engineer with more than 10 years of experience in e-commerce. At OLX, he works on media solutions for sellers, including systems that turn a user’s product video into a structured marketplace ad by automatically selecting images and extracting relevant details.

We’ll discuss:

Why data scientists should learn to operate their own models

How tools like Claude Code and Cursor can help bridge the gap between research and production

What it takes to build video-to-ad systems that are reliable and scalable

How to think about extensibility, monitoring, and maintainability in AI products

When to prioritize customer impact first and pay down tech debt later

About the Speaker:
Mariano Semelman is a Lead Data Scientist and Machine Learning Engineer with over 10 years of experience in e-commerce. He focuses on building end-to-end solutions and explores how agentic tools can make the path from research to production more seamless. At OLX, he works on AI-powered media solutions that simplify the listing process for sellers.

Join our Slack: https://datatalks.club/slack.html
33 attendees
From RAG to AI Agents: Function Calling and Tool Use
Mon, May 18 · 12:00 PM CEST
·
Online
Online
This is the 2nd workshop in our series to update the LLM Zoomcamp content.

This workshop updates Module 2: Introduction to Agents.

In this hands-on session, Alexey Grigorev will show how to turn a basic RAG application into an agentic AI assistant.

You’ll start with a simple RAG pipeline over Zoomcamp FAQ documents, then add agentic behavior: search decisions, multiple tool calls, function calling, and structured interaction with external tools.

What you’ll learn:
- How to build a basic RAG application over course FAQ documents
- What makes a RAG flow “agentic”
- How agents decide when to search and when to answer
- How to make the LLM generate search queries based on the user question
- How to run agentic search over multiple iterations
- How to use previous actions and search history as context
- How OpenAI function calling works
- How to define tools that an LLM can call
- How to let an assistant search the FAQ database using tools
- How to add multiple tools, including a tool for adding new FAQ entries
- How to structure the assistant logic into reusable components
- How to use PydanticAI to simplify tool definitions and agent implementation
By the end, you’ll have an agentic course assistant that can search the FAQ database, decide when it needs more information, call tools, and answer student questions with context.

Like the other workshops, this will be a live demo with practical tips and time for Q&A.

***

All events in these series:
***

## Thinking about Joining LLM Zoomcamp?

This workshop covers the updated content for Module 2 of the LLM Zoomcamp, our free course on building practical LLM applications with RAG, vector search, evaluation, monitoring, and AI agents.
You start with a simple RAG pipeline, then improve it with better retrieval, semantic search, function calling, evaluation, monitoring, and production practices.
The course covers the full lifecycle of an LLM application: from the first working prototype to evaluation, monitoring, and a complete final project.
The new cohort of LLM Zoomcamp starts on June 8, 2026. You can join it by registering here.

##

## About the Speaker

Alexey Grigorev is the Founder of DataTalks.Club and creator of the Zoomcamp series.

Alexey is a software and ML engineer with over 10 years in engineering and 6+ years in machine learning. He has deployed large-scale ML systems at companies like OLX Group and Simplaex, authored several technical books, including Machine Learning Bookcamp, and is a Kaggle Master with a 1st place finish in the NIPS’17 Criteo Challenge.

Join our Slack: https://datatalks.club/slack.html
62 attendees
Vector Databases: Embeddings, Semantic Search, and Hybrid Retrieval
Thu, May 21 · 12:00 PM CEST
·
Online
Online
This is the 3rd workshop in our series to update the LLM Zoomcamp content.

This workshop updates Module 3: Vector Search.

In this hands-on session, Alexey Grigorev will show how to add semantic search to a RAG application using embeddings and a vector database.

You’ll learn how to turn text into embeddings, index them, search for semantically similar documents, and use the results as context for an LLM.

What you’ll learn:
- What vector search is and how it differs from keyword search
- What embeddings are and how they represent text
- How to embed FAQ documents for semantic retrieval
- How to index text data in a vector database
- How to run semantic search over indexed documents
- How to use vector search inside a RAG pipeline
- How to compare vector search with keyword search
- How hybrid search combines semantic and keyword retrieval
- When vector search works well and where it can fail
- How retrieval quality affects the final LLM answer
By the end, you’ll have a RAG pipeline that uses vector search to retrieve semantically relevant documents and generate answers based on them.

Like the other workshops, this will be a live demo with practical tips and time for Q&A.

***

All events in these series:
***

## Thinking about Joining LLM Zoomcamp?

This workshop covers the updated content for Module 3 of the LLM Zoomcamp, our free course on building practical LLM applications with RAG, vector search, evaluation, monitoring, and AI agents.
You start with a simple RAG pipeline, then improve it with better retrieval, semantic search, function calling, evaluation, monitoring, and production practices.
The course covers the full lifecycle of an LLM application: from the first working prototype to evaluation, monitoring, and a complete final project.
The new cohort of LLM Zoomcamp starts on June 8, 2026. You can join it by registering here.

##

## About the Speaker

Alexey Grigorev is the Founder of DataTalks.Club and creator of the Zoomcamp series.

Alexey is a software and ML engineer with over 10 years in engineering and 6+ years in machine learning. He has deployed large-scale ML systems at companies like OLX Group and Simplaex, authored several technical books, including Machine Learning Bookcamp, and is a Kaggle Master with a 1st place finish in the NIPS’17 Criteo Challenge.

Join our Slack: https://datatalks.club/slack.html
51 attendees

Past events

381

Organizers

DataTalks.Club Events

Super Organizer

Members

7,929

Related topics

Machine Learning

Data Engineering