




Adversarial Defense
姜育剛, 馬興軍, 吳祖煊

Recap of Week 4: Adversarial Example Detection
- Secondary Classification Methods
- Principal Component Analysis (PCA)
- Distribution Detection Methods
- Prediction Inconsistency
- Reconstruction Inconsistency
- Trapping-Based Detection

Adversarial Attack Competition
Link: https://codalab.lisn.upsaclay.fr/competitions/15669?secret_key=77cb8986-d5bd-4009-82f0-7dde2e819ff8
Adversarial Defense vs. Detection

The odd relationship between defense and detection:
- Detection IS defense. But when we say "defense", we most often mean that the model itself is secured — which detection alone cannot achieve.
- In survey papers, detection counts as defense; in technical papers, defense means defense, not detection.

Differences:
- Defense secures the model or the system.
- Detection identifies potential threats and should be followed by a defense strategy, e.g., query rejection (a step that is mostly ignored).
- By "defense", the literature mostly means robust training methods.
Defense Methods
- Early Defense Methods
- Early Adversarial Training Methods
- Later Adversarial Training Methods
- Remaining Challenges and Recent Progress

A Recap of the Timeline
- 2013: Biggio et al. and Szegedy et al. discover adversarial examples
- 2014: Goodfellow et al. propose the fast single-step FGSM attack and adversarial training
- 2015: simple detection methods (PCA) and adversarial training methods
- 2016: the min-max optimization framework for adversarial training is proposed
- 2017: many detection methods and new attacks (BIM, C&W); 10 detection methods broken
- 2018: physical-world attacks; upgraded detection methods; the PGD attack and PGD adversarial training; 9 defense methods broken
- 2019: TRADES and many other adversarial training methods; the first Science paper
- 2020: the AutoAttack attack; Fast adversarial training
- 2021: adversarial training with larger models and more data; extensions to other domains
- 2022: problems remain open — attacks keep multiplying and defense keeps getting harder

Principles of Defense
Block the attack (cut it off at both ends):
- Mask the input gradients
- Regularize the input gradients
- Distill the logits
- Denoise the input

Robustify the model (strengthen the middle):
- Smooth the decision boundary
- Reduce the Lipschitz constant of the model
- Smooth the loss landscape
Adversarial Attack (recap)
Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. ICLR 2014.
Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
- Model training: minimize the classification loss on the training data.
- Adversarial attack: a test-time attack — find a very small perturbation that causes misclassification.
Performance metrics: attack success rate, etc.; another metric is the maximum perturbation needed for a 100% attack success rate.
Defense Methods
- Early Defense Methods
- Early Adversarial Training Methods
- Advanced Adversarial Training Methods
- Remaining Challenges and Recent Progress

Defensive Distillation
Papernot et al. Distillation as a defense to adversarial perturbations against deep neural networks. S&P, 2016.
- Goal: make large logit changes become "small".
- Distillation with temperature T: scale up the logits by a few orders of magnitude;
- retrain the last layer with the scaled logits.
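The temperature-scaled softmax at the heart of defensive distillation can be sketched in a few lines of numpy (the logit values and T here are illustrative). Training uses a high temperature T, so the learned logits end up roughly T times larger; at test time with T = 1 the softmax saturates and input gradients nearly vanish.

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T gives smoother probabilities."""
    z = z / T
    z = z - z.max()                 # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5])  # hypothetical class logits
T = 100.0

soft_targets = softmax(logits, T=T)      # near-uniform targets for retraining
test_probs = softmax(T * logits, T=1.0)  # distilled net at test time: saturated
```

Because `test_probs` is saturated (one class near probability 1), the gradient of the loss with respect to the input is almost zero — the "vanishing gradients" that the next slide shows are trivially recovered by dividing the logits by T.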
Defensive Distillation Is Not Robust
Carlini, Nicholas, and David Wagner. "Defensive distillation is not robust to adversarial examples." arXiv:1607.04311 (2016).
It can be evaded by attacking the distilled network at the same temperature T.
Lessons Learned
Carlini, Nicholas, and David Wagner. "Defensive distillation is not robust to adversarial examples." arXiv:1607.04311 (2016).
- Distillation is not a good solution for adversarial robustness.
- Vanishing input gradients can still be recovered by a reverse operation.
- A defense should be evaluated against the adaptive attack to prove real robustness.

Input Gradients Regularization
Gradients
RegularizationRoss
et
al."Improvingtheadversarialrobustnessandinterpretabilityofdeepneuralnetworksbyregularizingtheirinputgradients."
AAAI,2018.Drucker,Harris,andYannLeCun.“Improvinggeneralizationperformanceusingdoublebackpropagation.”
TNN,1992.
Classification
lossInput
gradients
regularization
Related
to
the
double
backpropagation
proposed
by
DruckerandLeCun(1992):Input
Gradients
RegularizationRoss
et
al."Improvingtheadversarialrobustnessandinterpretabilityofdeepneuralnetworksbyregularizingtheirinputgradients."
AAAI,2018.Issues:
1)
limited
adversarial
robustness,
2)
hurts
learning蒸餾的對抗訓(xùn)練的正則化的Feature
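For a model whose input gradient has a closed form, the regularized objective is easy to sketch — here logistic regression stands in for the network, and the weights and λ are illustrative choices:

```python
import numpy as np

# Input-gradient regularization (double backpropagation) on a toy
# logistic-regression "network": loss + lam * ||dLoss/dInput||^2.
w, b, lam = np.array([2.0, -1.0]), 0.0, 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_loss(x, y):
    p = sigmoid(x @ w + b)
    bce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad_x = (p - y) * w            # closed-form dBCE/dx for this model
    return bce + lam * np.sum(grad_x ** 2)
```

In a deep network the penalty term is itself differentiated through the backward graph — hence "double backpropagation"; frameworks with higher-order autodiff express the same objective directly.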
Feature Squeezing
Xu et al. "Feature squeezing: Detecting adversarial examples in deep neural networks." NDSS, 2018.
Compress the input space. It also hurts performance on large-scale image datasets.

Thermometer Encoding
Buckman et al. "Thermometer encoding: One hot way to resist adversarial examples." ICLR, 2018.
Discretize the input to break small noise: the proposed thermometer encoding.
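Both defenses quantize the input so that sub-quantization adversarial noise is erased. A minimal numpy sketch (the bit depth and number of levels are illustrative choices):

```python
import numpy as np

def reduce_bit_depth(x, bits=4):
    """Feature squeezing: round pixels in [0, 1] to 2**bits quantized levels."""
    scale = 2 ** bits - 1
    return np.round(x * scale) / scale

def thermometer_encode(x, k=10):
    """Thermometer encoding: bit i of a pixel is 1 iff x >= i/k, so the code
    fills up like a mercury column and small perturbations leave it unchanged."""
    thresholds = np.arange(k) / k
    return (x[..., None] >= thresholds).astype(np.float32)

pixel = np.array([0.352])
squeezed = reduce_bit_depth(pixel)   # 4 bits: 0.352 -> 5/15 = 0.333...
code = thermometer_encode(pixel)     # [[1, 1, 1, 1, 0, 0, 0, 0, 0, 0]]
```

Note that a small perturbation (e.g., 0.352 to 0.358) produces the identical encoding — which is exactly why gradients through these operations are useless to an attacker, and why BPDA (below) approximates them instead.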
Input Transformations
Guo et al. "Countering Adversarial Images using Input Transformations." ICLR, 2018.
- Image cropping and rescaling
- Bit-depth reduction
- JPEG compression
- Total variance minimization
- Image quilting

Obfuscated Gradients = Fake Robustness
Athalye et al. "Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples." ICML, 2018.
Athalye et al. Synthesizing robust adversarial examples. ICML, 2018.

Backward Pass Differentiable Approximation (BPDA):
- Find a (e.g., linear) differentiable approximation of the non-differentiable operations — discretization, compression, etc. — and use it on the backward pass.
- Can break defenses based on non-differentiable operations.

Expectation Over Transformation (EOT):
- T: a set of randomized transformations; average gradients over draws from T.
- Can break randomization-based defenses.
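EOT amounts to gradient averaging over draws from the transformation set T. In this sketch the model and transformation are toy stand-ins (a quadratic loss and additive noise):

```python
import numpy as np

rng = np.random.default_rng(0)

def eot_gradient(x, grad_fn, sample_transform, n_samples=64):
    """Expectation Over Transformation: attack a randomized defense through
    its expected behavior by averaging gradients over random transforms."""
    grads = [grad_fn(sample_transform(x)) for _ in range(n_samples)]
    return np.mean(grads, axis=0)

# Toy setup: loss L(x) = ||x||^2 (gradient 2x); the "defense" adds noise.
grad_fn = lambda x: 2.0 * x
sample_transform = lambda x: x + rng.normal(0.0, 0.05, size=x.shape)

x = np.array([1.0, -2.0])
g = eot_gradient(x, grad_fn, sample_transform)  # close to the clean gradient 2x
```

Averaging washes out the randomness, so the attacker recovers a usable descent direction even though any single query is noisy.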
BPDA+EOT breaks 7 defenses published at ICLR 2018.
Athalye et al. "Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples." ICML, 2018.
We got a survivor! (Only PGD adversarial training was not fully broken.)
How to Properly Evaluate a Defense?
Carlini, Nicholas, et al. "On evaluating adversarial robustness." arXiv:1902.06705 (2019).
- Do not blindly apply multiple (similar) attacks.
- Try at least one gradient-free attack and one hard-label attack.
- Perform a transferability attack using a similar substitute model.
- For randomized defenses, properly ensemble over the randomness.
- For non-differentiable components, apply differentiable techniques (BPDA).
- Verify that the attacks have converged under the selected hyperparameters.
- Carefully investigate attack hyperparameters and report those selected.
- Compare against prior work and explain important differences.
- Test broader threat models when proposing general defenses.

Robust Activation Functions
Xiao et al. "Enhancing Adversarial Defense by k-Winners-Take-All." ICLR, 2020.
Block attacks at the internal activations by breaking continuity: the k-Winners-Take-All (k-WTA) activation.

Robust Loss Function
Pang et al. Rethinking softmax cross-entropy loss for adversarial robustness. ICLR, 2020.
Replace the softmax cross-entropy (SCE) loss with the Max-Mahalanobis center (MMC) loss.

Robust Inference
Pang et al. Mixup Inference: Better exploiting mixup to defend adversarial attacks. ICLR, 2020.
Mixup Inference (MI).

New Adaptive Attacks Break These Defenses
Tramer et al. "On adaptive attacks to adversarial example defenses." NeurIPS, 2020.
- T1: Attack the full defense
- T2: Target important defense parts
- T3: Simplify the attack
- T4: Ensure a consistent loss function
- T5: Optimize with different methods
- T6: Use strong adaptive attacks

How to Evaluate a Defense?
Croce and Hein. "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks." ICML, 2020.
Gao et al. Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack. ICML, 2022.
Zimmermann et al. "Increasing Confidence in Adversarial Robustness Evaluations." arXiv:2206.13991 (2022).

Strong attacks:
- AutoAttack (a must-test attack)
- Margin Decomposition (MD) attack (better than AutoAttack on ViTs)
- Minimum-Margin (MM) attack (a new SOTA attack to test?)
Extra robustness tests:
- Attack unit tests (Zimmermann et al., 2022)

Adversarial Training
Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
The idea is simple: just train on adversarial examples!
Adversarial training is a data augmentation method: original data -> adversarial attack -> adversarial examples -> model training.
Adversarial training produces a smooth decision boundary.
(Figure: the normal boundary, the generated adversarial examples, and the boundary after training.)
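The "original data -> attack -> adversarial examples -> training" loop can be sketched with numpy on a toy logistic-regression model (the data, learning rate, and ε are illustrative): each iteration crafts an FGSM batch and then takes a gradient step on that batch instead of the clean one.

```python
import numpy as np

rng = np.random.default_rng(0)

def fgsm(x, y, w, b, eps):
    """One-step FGSM on a logistic-regression model (toy stand-in for a net)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad_x = (p - y)[:, None] * w          # closed-form dBCE/dx
    return x + eps * np.sign(grad_x)

# Toy data: the label is the sign of x0 + x1.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w, b, eps, lr = np.zeros(2), 0.0, 0.1, 0.5
for _ in range(300):
    X_adv = fgsm(X, y, w, b, eps)          # craft the adversarial batch
    p = 1.0 / (1.0 + np.exp(-(X_adv @ w + b)))
    w -= lr * X_adv.T @ (p - y) / len(y)   # train on the adversarial batch
    b -= lr * np.mean(p - y)
```

After training, the model keeps most of its accuracy even when evaluated on freshly crafted FGSM examples — the data-augmentation view of adversarial training in miniature.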
Early Adversarial Training Methods
Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. ICLR 2014.
Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
- 2014: Szegedy et al. already explored adversarial training in the paper that introduced adversarial examples — generating adversarial examples for every layer of the network with the L-BFGS attack and adding them to training. Finding: adversarial examples at deeper layers improve robustness more.
- 2015: Goodfellow et al. proposed training the network on adversarial examples generated by the (single-step) FGSM attack. They did not use intermediate-layer adversarial examples, having found that these bring no improvement.

Min-max Robust Optimization
Nokland et al. Improving back-propagation by adding an adversarial gradient. arXiv:1510.04189, 2015.
Huang et al. Learning with a strong adversary. ICLR 2016.
Shaham et al. Understanding adversarial training: Increasing local stability of neural nets through robust optimization. arXiv:1511.05432, 2015.

The first proposals of min-max optimization: the inner maximization crafts the worst-case perturbation, and the outer minimization trains the model on it.
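The min-max objective these works first proposed can be written as:

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \Big[ \underbrace{\max_{\|\delta\|_{p} \le \epsilon} \mathcal{L}\big(f_{\theta}(x+\delta),\, y\big)}_{\text{inner maximization}} \Big] \qquad \text{(outer minimization over } \theta\text{)}
```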
Virtual Adversarial Training
Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016.
VAT: a method to improve generalization.
Differences to adversarial training:
- L2-regularized perturbation;
- uses both clean and adversarial examples for training;
- uses the KL divergence to generate adversarial examples.

Weaknesses of Early AT Methods
Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016.
These methods are fast — only about 2x the time of standard training — but the single-step attacks they rely on are weak.

PGD Adversarial Training
Recall Athalye et al. (ICML 2018): after BPDA+EOT circumvented the other defenses, we got a survivor — PGD adversarial training.
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR, 2018.
A Saddle Point Problem: the inner maximization crafts the worst-case perturbation; the outer minimization trains on it — a saddle point (constrained bi-level optimization) problem.
In constrained optimization, Projected Gradient Descent (PGD) is the best first-order solver.
Projected Gradient Descent (PGD):
- PGD is an optimizer; in the adversarial ML field it is also the name of an attack.
- Each step takes a gradient ascent step on the loss, then projects (clips) the perturbation back into the allowed ball.
- Initialization: a random start with uniform noise inside the ball.
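The three ingredients above — random start, ascent steps, projection — fit in a few lines of numpy. The model here is a toy fixed-weight logistic regression, and ε, the step size α, and the step count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def pgd_attack(x, y, loss_grad, eps=0.1, alpha=0.02, steps=10):
    """L-inf PGD: random start, repeated ascent steps, projection (clipping)
    back into the eps-ball around the clean input x."""
    x_adv = x + rng.uniform(-eps, eps, size=x.shape)          # random init
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(loss_grad(x_adv, y))  # ascent step
        x_adv = x + np.clip(x_adv - x, -eps, eps)             # projection
    return x_adv

# Toy model: fixed-weight logistic regression (hypothetical stand-in).
w, b = np.array([1.0, -1.0]), 0.0

def loss_grad(x, y):
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return (p - y)[:, None] * w           # closed-form dBCE/dx

x = np.array([[0.5, 0.3]])
y = np.array([1.0])
x_adv = pgd_attack(x, y, loss_grad)       # margin shrinks from 0.2 toward 0.0
```

The projection step is what makes PGD a constrained optimizer: however far the ascent steps wander, the perturbation never leaves the ε-ball.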
Characteristics of PGD adversarial training
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR, 2018.
Ilyas et al. "Adversarial examples are not bugs, they are features." NeurIPS, 2019.
(Figures: decision boundaries and robust features under standard vs. adversarial training, with adversarial examples marked.)

Dynamic Adversarial Training (DART)
Wang et al. "On the Convergence and Robustness of Adversarial Training." ICML, 2019.
(Figures: the effect of PGD step size and step count on robustness; simple attacks suffice early in training.)
How should we measure the convergence of the inner maximization?
Definition (First-Order Stationary Condition, FOSC): a measure of how well the inner maximization has converged — the closer the adversarial example is to a first-order stationary point of the constrained inner problem, the lower the FOSC value.

Dynamic Adversarial Training:
- Use weak attacks early in training and strong attacks later.
- Weak attacks improve generalization; strong attacks improve final robustness.
- Convergence analysis: DART improves robustness.
(Table: robustness on CIFAR-10 with WideResNet.)

TRADES
Zhang et al. "Theoretically principled trade-off between robustness and accuracy." ICML, 2019.
Use a distribution loss (KL divergence) for both the inner and the outer optimization.
TRADES was the winning solution of the NeurIPS 2018 Adversarial Vision Challenge.

Characteristics of TRADES:
- The KL divergence supervises adversarial example generation, giving a significant robustness gain.
- Clean examples also participate in training, which helps convergence and clean accuracy.
- KL-based adversarial example generation includes an adaptive process.
- It trains a smoother decision boundary than PGD adversarial training.
- TRADES improves both the inner maximization and the outer minimization.
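Written out, the TRADES objective pairs a clean cross-entropy term with a KL smoothness term, with β controlling the robustness-accuracy trade-off:

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)}\Big[ \mathrm{CE}\big(f_{\theta}(x),\, y\big) \;+\; \beta \max_{\|\delta\| \le \epsilon} \mathrm{KL}\big(f_{\theta}(x) \,\|\, f_{\theta}(x+\delta)\big) \Big]
```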
(Table: experimental results of TRADES.)

TRADES vs. VAT vs. ALP
Zhang et al. "Theoretically principled trade-off between robustness and accuracy." ICML, 2019.
Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016.
Kannan, Harini, Alexey Kurakin, and Ian Goodfellow. "Adversarial logit pairing." arXiv:1803.06373 (2018).
(Equations: the TRADES, Virtual Adversarial Training, and Adversarial Logit Pairing objectives.)
Similar optimization frameworks with different loss choices — and very different results.

MART: Misclassification Aware adveRsarial Training
Wang et al. "Improving adversarial robustness requires revisiting misclassified examples." ICLR, 2020.
Adversarial examples are only defined on correctly classified examples — what about misclassified examples?
The influence of misclassified and correctly classified examples: misclassified examples have a significant impact on the final robustness!
On misclassified examples, different inner-maximization techniques have a negligible effect, while different outer-minimization techniques have a significant effect.

Misclassification-aware adversarial risk: start from the standard adversarial risk, separate correctly classified from misclassified examples, and define a risk that treats the two groups differently.
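A simplified sketch of the resulting surrogate (paraphrased — the paper's exact first term is a boosted cross-entropy): the KL regularizer is reweighted by 1 − p_y(x), the model's clean-example misclassification probability, so poorly classified examples contribute more, with δ* the inner-maximization perturbation and λ the trade-off parameter:

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)}\Big[ \mathrm{BCE}\big(f_{\theta}(x+\delta^{*}),\, y\big) \;+\; \lambda \,\mathrm{KL}\big(f_{\theta}(x) \,\|\, f_{\theta}(x+\delta^{*})\big)\cdot\big(1 - p_{y}(x)\big) \Big]
```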
Surrogate loss functions (existing methods and MART); a semi-supervised extension of MART.

Robustness of MART:
(Tables: white-box robustness with ResNet-18 and WideResNet-34-10 on CIFAR-10, ε = 8/255.)

Using More Data to Improve Robustness
Alayrac et al.