Parameterize the image encoder as $f_{iq}$, with query feature $q_{ii}$ and key feature $k_{ii}$.
Parameterize the textual encoder as $f_{cq}(\cdot; \Theta_q, \Phi_{cq})$ and the momentum textual encoder as $f_{ck}(\cdot; \Theta_k, \Phi_{ik})$.
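As a rough illustration of what "momentum" could mean here: the notes only name a momentum textual encoder and do not state its update rule. The sketch below assumes a MoCo-style exponential moving average, in which the key parameters (Θ_k) slowly track the query parameters (Θ_q). The function name `momentum_update`, the coefficient `m`, and the toy values are hypothetical and are not taken from the paper.

```python
def momentum_update(query_params, key_params, m=0.999):
    """Return key parameters moved toward the query parameters by one EMA step.

    Assumed rule (MoCo-style, not confirmed by the notes):
        theta_k <- m * theta_k + (1 - m) * theta_q
    """
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


# Toy example: plain Python floats stand in for encoder weights.
theta_q = [1.0, 2.0]   # query-encoder parameters (updated by backprop)
theta_k = [0.0, 0.0]   # key-encoder parameters (updated only by momentum)
theta_k = momentum_update(theta_q, theta_k, m=0.9)
print(theta_k)
```

With a coefficient close to 1, the key encoder changes slowly, which is the usual motivation for this kind of update in momentum contrastive methods.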
$c^{?}_j$ and $c^{\star}_j$ are different augmented examples.
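Gripes
In the first figure, the letter subscripts are hidden by the black background, and the authors have not released their code. This is not what a CVPR "standard" should look like.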