GenAI Cracow #28 - Local LLMs, Multimodality

Event:
GenAI Cracow #28 - Local LLMs, Multimodality
Event type:
Meetup
Category:
IT
Topics:
Date:
29.06.2026 (Monday)
Time:
18:00
Language:
English
Admission:
Free
City:
Venue:
Active Campaign
Address:
CFP:
until 26.06.2026 (Friday)
Registration:
Agenda:
  • Opening Ceremony
  • Native Multimodality: Beyond Language-Centric Multimodal Models by Jakub
  • TBD - CFP is active
  • Q&A
  • Networking with pizza and beer
Description:

Abstract

Native Multimodality: Beyond Language-Centric Multimodal Models

Traditional multimodal pipelines compromise performance by forcing audio and visual data into a compressed text-token space, permanently losing spatial structure, temporal flow, and fine-grained detail. To resolve this bottleneck, cutting-edge systems—including Kimi K2.5, SenseTime’s NEO/NEO-unify architecture, and the Gemini 1.5+ series—have converged on native multimodality, retaining raw structural context or eliminating the encoder-projector pipeline entirely. While native architectures drastically reduce data requirements and make cross-modal reasoning less brittle, they also introduce complex training dynamics and shift failure modes rather than eliminating them. Drawing on six months of empirical research, this talk evaluates where native multimodality fundamentally alters performance, outlines its persistent failure modes, and analyzes the emerging scaling behaviors defining the next generation of AI.


Speakers

Jakub Strawa

AI Researcher and Research Engineer specializing in LLM training, post-training, and multimodal models, bridging the gap between cutting-edge research and scalable engineering. Currently at Stonly, I focus on developing, training, and rigorously evaluating AI agents. My background includes building enterprise-grade applications for Fortune 500 companies, conducting R&D at Roche and Raiffeisen Bank, and working on multimodal reasoning at TCL, where I collaborated directly with top-tier researchers in China and the Qwen team.
