DLvis: Articles for presentation

Dear all,

Here is a list of 15 articles that cover a good part of the course topics. Each article comes with a link to its PDF and one or more links to implementations, in case you want to run something yourselves (this is not required, but it obviously adds value).

The idea is for you to start looking at which ones interest you. On Monday the 20th at 18:00, an EVA activity will open for you to register your selection. Each article can be chosen by at most three people, and these slots will fill up as you make your choices. For this reason, it would be good to prepare a preference list of three or four articles before entering the system, in case the article you wanted is no longer available.

As we told you, if you have another article in mind that you would like to present, please send it to us by email on Monday morning so we can review it. For us to approve it, the article must meet the following criteria:

Fall within the topics of the course.
Have been published in a prestigious journal or conference in the field. For example:

Conferences: CVPR, ICCV, ECCV, WACV, NeurIPS, ICML, ICLR, ...
Journals: IEEE PAMI, IJCV, ACM TOG, Nature Machine Intelligence

Be of a complexity appropriate to the course (this is what we will check).

We remind you that the work is individual.

If you have any questions, write to us.

Regards,

Pablo

------------------------------------------------------

NeRF (Neural Radiance Fields)

1. NeRF

Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., & Ng, R. (2021). NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1), 99-106.

https://dl.acm.org/doi/abs/10.1145/3503250

https://github.com/bmild/nerf

https://github.com/facebookresearch/pytorch3d/tree/main/projects/nerf

Segmentation

2. Segment anything

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., ... & Girshick, R. (2023). Segment anything. ICCV 2023.

https://openaccess.thecvf.com/content/ICCV2023/papers/Kirillov_Segment_Anything_ICCV_2023_paper.pdf

https://github.com/facebookresearch/segment-anything

Object detection

3. YOLOv7

Wang, C. Y., Bochkovskiy, A., & Liao, H. Y. M. (2023). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 7464-7475).

https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_YOLOv7_Trainable_Bag-of-Freebies_Sets_New_State-of-the-Art_for_Real-Time_Object_Detectors_CVPR_2023_paper.pdf

https://github.com/WongKinYiu/yolov7

Vision Transformers

4. DETR

Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020, August). End-to-end object detection with transformers. In European conference on computer vision (pp. 213-229). Cham: Springer International Publishing.

https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460205.pdf

https://github.com/facebookresearch/detr

https://huggingface.co/docs/transformers/v4.35.2/en/model_doc/detr

5. ViT

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020, October). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations.

https://openreview.net/pdf?id=YicbFdNTTy

https://huggingface.co/docs/transformers/model_doc/vit

https://github.com/google-research/vision_transformer

VAEs

6. NVAE

Vahdat, A., & Kautz, J. (2020). NVAE: A deep hierarchical variational autoencoder. Advances in neural information processing systems, 33, 19667-19679.

https://proceedings.neurips.cc/paper/2020/file/e3b21256183cf7c2c7a66be163579d37-Paper.pdf

https://github.com/NVlabs/NVAE

7. VQ-VAE-2

Razavi, A., Van den Oord, A., & Vinyals, O. (2019). Generating diverse high-fidelity images with VQ-VAE-2. Advances in neural information processing systems, 32.

https://papers.nips.cc/paper_files/paper/2019/file/5f8e2fa1718d1bbcadf1cd9c7a54fb8c-Paper.pdf

https://github.com/rosinality/vq-vae-2-pytorch

https://github.com/google-deepmind/sonnet

8. Masked Autoencoders

He, K., Chen, X., Xie, S., Li, Y., Dollár, P., & Girshick, R. (2022). Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 16000-16009).

https://openaccess.thecvf.com/content/CVPR2022/papers/He_Masked_Autoencoders_Are_Scalable_Vision_Learners_CVPR_2022_paper.pdf

https://github.com/facebookresearch/mae

Normalizing Flows

9. Glow

Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative flow with invertible 1x1 convolutions. Advances in neural information processing systems, 31.

https://proceedings.neurips.cc/paper/2018/file/d139db6a236200b21cc7f752979132d0-Paper.pdf

https://github.com/openai/glow

10. SRFlow

Lugmayr, A., Danelljan, M., Van Gool, L., & Timofte, R. (2020). SRFlow: Learning the super-resolution space with normalizing flow. In European Conference on Computer Vision (ECCV).

https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123500698.pdf

https://github.com/andreas128/SRFlow

GANs

11. StyleGAN

Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4401-4410).

https://openaccess.thecvf.com/content_CVPR_2019/papers/Karras_A_Style-Based_Generator_Architecture_for_Generative_Adversarial_Networks_CVPR_2019_paper.pdf

https://github.com/NVlabs/stylegan

12. StyleGAN3

Karras, T., Aittala, M., Laine, S., Härkönen, E., Hellsten, J., Lehtinen, J., & Aila, T. (2021). Alias-free generative adversarial networks. Advances in Neural Information Processing Systems, 34, 852-863.

https://proceedings.neurips.cc/paper_files/paper/2021/file/076ccd93ad68be51f23707988e934906-Paper.pdf

https://github.com/NVlabs/stylegan3

Diffusion Models

13. Deep unsupervised learning using nonequilibrium thermodynamics

Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., & Ganguli, S. (2015, June). Deep unsupervised learning using nonequilibrium thermodynamics. In International conference on machine learning (pp. 2256-2265). PMLR.

http://proceedings.mlr.press/v37/sohl-dickstein15.pdf

https://github.com/Sohl-Dickstein/Diffusion-Probabilistic-Models

14. Denoising diffusion probabilistic models

Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in neural information processing systems, 33, 6840-6851.

https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf

https://github.com/hojonathanho/diffusion

15. ControlNet

Zhang, L., Rao, A., & Agrawala, M. (2023). Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3836-3847).

https://openaccess.thecvf.com/content/ICCV2023/papers/Zhang_Adding_Conditional_Control_to_Text-to-Image_Diffusion_Models_ICCV_2023_paper.pdf

https://github.com/lllyasviel/ControlNet

Re: Articles for presentation

by Camilo Joaquin Mariño Cabrera - Monday, 20 November 2023, 17:48

Dear all,

At https://eva.fing.edu.uy/mod/choice/view.php?id=198992 you can choose the paper you want. Remember that there is a limit of 3 people per article.

Regards,

Camilo