DLvis: Articles for the final project

Dear all,

Here is a list of 14 articles covering a good part of the course topics. Each article comes with a link to the PDF and one or more links to implementations, in case you want to run something yourselves (this is not required, but it will obviously help your understanding).

The idea is for you to start looking at which ones interest you. On Thursday 28/11 at 18:00, an EVA activity will open for you to register your selection. Each article can be chosen by at most two people, and these slots will fill up as you make your choices. For this reason, it would be good to put together a preference list of three or four articles before logging in, in case the article you wanted is no longer available. You have until 1/12 to choose your article.

As we told you, if you have another article in mind that you would like to present, please email it to us no later than Thursday 21/11 so we can review it. For us to approve it, the article must meet the following requirements:

1. Fall within the course topics.
2. Be of a complexity appropriate to the course (this is what we will check).
3. Have been published in a prestigious journal or conference in the field. For example:

- Conferences: CVPR, ICCV, ECCV, WACV, NeurIPS, ICML, ICLR, etc.
- Journals: IEEE PAMI, IEEE TCI, IEEE TIP, JMLR, IJCV, ACM TOG, Nature Machine Intelligence, etc.

We remind you that the project is individual.

If you have any questions, write to us.

Regards,

Pablo

------------------------------------------------------

NeRF (Neural Radiance Fields) and Gaussian Splatting

1. NeRF

Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., & Ng, R. (2021). NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1), 99-106.

https://dl.acm.org/doi/abs/10.1145/3503250

https://github.com/bmild/nerf

https://github.com/facebookresearch/pytorch3d/tree/main/projects/nerf

2. Gaussian Splatting

https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

Segmentation

3. Segment anything

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., ... & Girshick, R. (2023). Segment anything. ICCV 2023.

https://openaccess.thecvf.com/content/ICCV2023/papers/Kirillov_Segment_Anything_ICCV_2023_paper.pdf

https://github.com/facebookresearch/segment-anything

Transformers

4. DETR (Transformer for object detection and recognition)

Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020, August). End-to-end object detection with transformers. In European conference on computer vision (pp. 213-229). Cham: Springer International Publishing.

https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460205.pdf

https://github.com/facebookresearch/detr

https://huggingface.co/docs/transformers/v4.35.2/en/model_doc/detr

5. DINO (Vision Transformer with self-supervised learning)

Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J., Bojanowski, P., & Joulin, A. (2021). Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 9650-9660).

https://openaccess.thecvf.com/content/ICCV2021/papers/Caron_Emerging_Properties_in_Self-Supervised_Vision_Transformers_ICCV_2021_paper.pdf

https://github.com/facebookresearch/dino

6. LoRA

Hu, E. J., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. In International Conference on Learning Representations.

https://openreview.net/forum?id=nZeVKeeFYf9

https://github.com/microsoft/LoRA

VAEs

7. VQ-VAE

Razavi, A., Van den Oord, A., & Vinyals, O. (2019). Generating diverse high-fidelity images with vq-vae-2. Advances in neural information processing systems, 32.

https://papers.nips.cc/paper_files/paper/2019/file/5f8e2fa1718d1bbcadf1cd9c7a54fb8c-Paper.pdf

https://github.com/rosinality/vq-vae-2-pytorch

https://github.com/google-deepmind/sonnet

8. Masked Autoencoders

He, K., Chen, X., Xie, S., Li, Y., Dollár, P., & Girshick, R. (2022). Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 16000-16009).

https://openaccess.thecvf.com/content/CVPR2022/papers/He_Masked_Autoencoders_Are_Scalable_Vision_Learners_CVPR_2022_paper.pdf

https://github.com/facebookresearch/mae

Normalizing Flows

9. Glow

Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative flow with invertible 1x1 convolutions. Advances in neural information processing systems, 31.

https://proceedings.neurips.cc/paper/2018/file/d139db6a236200b21cc7f752979132d0-Paper.pdf

https://github.com/openai/glow

GANs

10. StyleGAN

Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4401-4410).

https://openaccess.thecvf.com/content_CVPR_2019/papers/Karras_A_Style-Based_Generator_Architecture_for_Generative_Adversarial_Networks_CVPR_2019_paper.pdf

https://github.com/NVlabs/stylegan

Diffusion Models

11. Denoising diffusion probabilistic models

Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in neural information processing systems, 33, 6840-6851.

https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf

https://github.com/hojonathanho/diffusion

Image restoration with plug & play methods

12. Kadkhodaie, Z., & Simoncelli, E. P. Solving Linear Inverse Problems Using the Prior Implicit in a Denoiser. In NeurIPS 2020 Workshop on Deep Learning and Inverse Problems.

https://www.cns.nyu.edu/pub/eero/kadkhodaie20a-arxiv2007.13640-v2.pdf

https://github.com/LabForComputationalVision/universal_inverse_problem

Theory: Overparameterization and associative memory in neural networks

13. A. Radhakrishnan, M. Belkin, C. Uhler, Overparameterized neural networks implement associative memory, Proc. Natl. Acad. Sci. U.S.A. 117 (44) 27162-27170, https://doi.org/10.1073/pnas.2005013117 (2020).

https://www.pnas.org/doi/epdf/10.1073/pnas.2005013117

xLSTM (extended LSTM)

14. Pöppel, K., Beck, M., Spanring, M., Auer, A., Prudnikova, O., Kopp, M. K., ... & Hochreiter, S. xLSTM: Extended Long Short-Term Memory. In First Workshop on Long-Context Foundation Models @ ICML 2024.

https://openreview.net/pdf?id=Dh0Y88UAXR

https://github.com/NX-AI/xlstm

https://github.com/AI-Guru/xlstm-resources?tab=readme-ov-file