GenAI Cracow #28 - Local LLMs, Multimodality

Event:
GenAI Cracow #28 - Local LLMs, Multimodality
Event type:
Meetup
Category:
IT
Topic:
Date:
29.06.2026 (Monday)
Time:
18:00
Language:
English
Price:
Free
City:
Place:
Active Campaign
Address:
CFP:
until 26.06.2026 (Friday)
Registration:
Agenda:
  • Opening Ceremony
  • Native Multimodality: Beyond Language-Centric Multimodal Models by Jakub Strawa
  • TBD - CFP is active
  • Q&A
  • Networking with pizza and beer
Description:

Abstract

Native Multimodality: Beyond Language-Centric Multimodal Models

Traditional multimodal pipelines compromise performance by forcing audio and visual data into a compressed text-token space, permanently losing spatial structure, temporal flow, and fine-grained detail. To resolve this bottleneck, cutting-edge systems—including Kimi K2.5, SenseTime’s NEO/NEO-unify architecture, and the Gemini 1.5+ series—have converged on native multimodality, retaining raw structural context or eliminating the encoder-projector pipeline entirely. While native architectures drastically reduce data requirements and make cross-modal reasoning less brittle, they also introduce complex training dynamics and shift failure modes rather than eliminating them. Drawing on six months of empirical research, this talk evaluates where native multimodality fundamentally alters performance, outlines its persistent failure modes, and analyzes the emerging scaling behaviors defining the next generation of AI.


Speakers

Jakub Strawa

AI Researcher and Research Engineer specializing in LLM training, post-training, and multimodal models, bridging the gap between cutting-edge research and scalable engineering. Currently at Stonly, I focus on developing, training, and rigorously evaluating AI agents. My background includes building enterprise-grade applications for Fortune 500 companies, conducting R&D at Roche and Raiffeisen Bank, and working on multimodal reasoning at TCL, where I collaborated directly with top-tier researchers in China and the Qwen team.
