Diffusion model ?! Generative AI

Apr 30, 2023

Generative models can produce several kinds of output: sentences, images, or speech.

Below we quickly go through two main approaches in generative AI: the autoregressive (AR) model and the non-autoregressive (NAR) model.

Autoregressive model:

Generates one unit at a time, for example one word or one pixel, with each unit conditioned on the units generated before it.
Slower, because the units are generated in sequence.
Usually better quality.
Mostly used in the word/sentence domain.
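
As a rough illustration of the one-unit-at-a-time loop, here is a minimal Python sketch. The lookup table NEXT_TOKENS and the function generate_autoregressive are hypothetical stand-ins for a trained model, not part of any real library; the only point is that each step needs the previous step's output, so the steps cannot run in parallel.

```python
import random

# Hypothetical toy "model": for each previous token, the tokens that may follow it.
NEXT_TOKENS = {
    "<s>": ["the"],
    "the": ["cat", "dog"],
    "cat": ["sleeps", "runs"],
    "dog": ["sleeps", "runs"],
    "sleeps": ["</s>"],
    "runs": ["</s>"],
}

def generate_autoregressive(max_len=10, seed=0):
    """Generate tokens one at a time until the end marker or max_len is reached."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    steps = 0
    while tokens[-1] != "</s>" and len(tokens) < max_len:
        # The next token is chosen from what follows the most recent token,
        # so this step has to wait for the previous one to finish.
        tokens.append(rng.choice(NEXT_TOKENS[tokens[-1]]))
        steps += 1
    return tokens, steps

tokens, steps = generate_autoregressive()
print(" ".join(tokens), "| sequential steps:", steps)
```

The number of sequential steps grows with the length of the output, which is where the slowness comes from.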

Non-autoregressive model:

Fixes the output size in advance, e.g. 200 words or 1000 pixels, then produces the whole result at once.
Faster, provided the computation runs in parallel.
Mostly used in the image domain.
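
For contrast, a minimal sketch of the non-autoregressive idea, again in Python. The names OUTPUT_SIZE, pixel and generate_non_autoregressive are made up for illustration, and the arithmetic inside pixel is an arbitrary placeholder for a real network. What matters is that every output position depends only on the input and its own index, so all positions can be computed at the same time.

```python
from concurrent.futures import ThreadPoolExecutor

OUTPUT_SIZE = 8  # the output size is fixed before generation starts

def pixel(args):
    """Compute one output value from the input and the position only."""
    latent, i = args
    # No other output value is read here, so positions are independent.
    return (latent * (i + 1)) % 256

def generate_non_autoregressive(latent, output_size=OUTPUT_SIZE):
    """Produce all output positions in one parallel pass."""
    jobs = [(latent, i) for i in range(output_size)]
    with ThreadPoolExecutor() as pool:
        # pool.map keeps the results in the same order as the jobs.
        return list(pool.map(pixel, jobs))

print(generate_non_autoregressive(37))
```

A thread pool is used here only to make the independence visible; in practice the parallelism usually comes from the GPU.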

Can we use both (AR + NAR) at the same time? Yes.

In the word/sentence domain, using AR alone takes ages to finish. The solution is to use AR first to quickly generate an intermediate result, then use NAR to generate the fine result.

In the image domain, we apply NAR repeatedly: we loop it a few times to make the result more accurate. The first runs produce a vague result, and each later run refines it further. This is also the basic concept behind the “diffusion model”.
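
The repeated-NAR loop can be sketched as follows. This is a toy under strong assumptions, not a real diffusion model: TARGET is a hypothetical "clean image" that the refiner simply knows, whereas a real diffusion model uses a trained network to estimate what noise to remove at each step. The sketch only shows the shape of the loop: start from noise, update every pixel at once, repeat.

```python
import random

# Hypothetical "clean image" known to the toy refiner (an assumption for illustration).
TARGET = [0.0, 0.25, 0.5, 0.75, 1.0]

def refine_once(image, strength=0.5):
    """One NAR pass: every pixel moves part of the way toward the target at once."""
    return [x + strength * (t - x) for x, t in zip(image, TARGET)]

def error(image):
    """Total absolute distance between the current image and the target."""
    return sum(abs(x - t) for x, t in zip(image, TARGET))

def generate_by_refinement(steps=6, seed=0):
    """Start from random noise and apply the NAR pass repeatedly."""
    rng = random.Random(seed)
    image = [rng.random() for _ in TARGET]  # vague starting point: pure noise
    errors = [error(image)]
    for _ in range(steps):
        image = refine_once(image)
        errors.append(error(image))
    return image, errors

image, errors = generate_by_refinement()
for run, e in enumerate(errors):
    print(f"run {run}: error {e:.4f}")
```

Each run is a full parallel pass over the image, and the printed error shrinks from run to run, matching the "vague first, more accurate later" behaviour described above.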

Images ref: https://www.youtube.com/watch?v=AihBniegMKg&list=PLJV_el3uVTsOePyfmkfivYZ7Rqr2nMk3W&index=6&t=122s

Written by George S
