Seminar on Current Works in Computer Vision
Prof. Thomas Brox
The goal of computer vision is to imitate the flexibility and robustness of the human visual system, and progress in the field has a large impact on the success of artificial intelligence in general. Research has made significant progress in recent years, particularly due to deep learning. Meanwhile, computer vision has become interwoven with other areas of machine learning. In this seminar we will take a detailed look at various recent papers that span a broad range of hot topics. This includes, in particular, the emergence of properties from self-supervised learning and their analysis.
Each paper is assigned to three students, who investigate the paper and its background in more detail and give a presentation. The presentation is followed by a discussion with all participants about the merits, limitations, and perspectives of the respective paper. You will learn to read and understand contemporary research papers, to give a good oral presentation, to ask questions, and to openly discuss a research problem.
Note that the mode of the seminar changes this semester to accommodate more slots for students. Rather than one student presenting a paper, three students will cover three different aspects, typically (but not necessarily) (1) the paper's historical background, related work, and motivation, (2) the core methodology, including background methodology, and (3) the experimental results and their assessment. To deliver a strong overall presentation, you will also learn to work in a team whose composition is out of your control. The success of the team effort is part of the grading.
Material:
Giving a good presentation
Proper scientific behavior
Powerpoint template for your presentation (optional)
Papers:
| Date | Paper | Presenting students | Slides | Advisor |
| S1 | V-JEPA | |||
| S2 | Self-Flow | |||
| S3 | Block-recurrent transformers | |||
| S4 | Video generation with internal memory | |||
| S5 | Reasoning with mental imagery | |||
| S6 | Multimodal world models | |||
| S7 | In-context algebra | |||

