Last updated on 2014-09-12 01:48:12.
Flavor | Version | Tinstall | Tcheck | Ttotal | Status | Flags |
---|---|---|---|---|---|---|
r-devel-linux-x86_64-debian-clang | 0.2-2 | 4.07 | 57.50 | 61.56 | NOTE | |
r-devel-linux-x86_64-debian-gcc | 0.2-2 | 5.62 | 55.43 | 61.05 | NOTE | |
r-devel-linux-x86_64-fedora-clang | 0.2-2 | | | 149.92 | NOTE | |
r-devel-linux-x86_64-fedora-gcc | 0.2-2 | | | 195.60 | NOTE | |
r-devel-osx-x86_64-clang | 0.2-2 | | | 118.81 | NOTE | |
r-devel-windows-ix86+x86_64 | 0.2-2 | 12.00 | 86.00 | 98.00 | NOTE | |
r-patched-linux-x86_64 | 0.2-2 | 5.90 | 59.36 | 65.26 | OK | |
r-patched-solaris-sparc | 0.2-2 | | | 806.90 | OK | |
r-patched-solaris-x86 | 0.2-2 | | | 193.40 | OK | |
r-release-linux-ix86 | 0.2-2 | 5.20 | 72.12 | 77.32 | OK | |
r-release-linux-x86_64 | 0.2-2 | 4.31 | 59.95 | 64.26 | OK | |
r-release-osx-x86_64-mavericks | 0.2-2 | | | | OK | |
r-release-osx-x86_64-snowleopard | 0.2-2 | | | | OK | |
r-release-windows-ix86+x86_64 | 0.2-2 | 12.00 | 96.00 | 108.00 | OK | |
r-oldrel-windows-ix86+x86_64 | 0.2-2 | 13.00 | 73.00 | 86.00 | ERROR | |
Version: 0.2-2
Check: R code for possible problems
Result: NOTE
corpusToPrisma: no visible global function definition for
‘TermDocumentMatrix’
Flavors: r-devel-linux-x86_64-debian-clang, r-devel-linux-x86_64-debian-gcc, r-devel-linux-x86_64-fedora-clang, r-devel-linux-x86_64-fedora-gcc, r-devel-osx-x86_64-clang, r-devel-windows-ix86+x86_64
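A NOTE of this kind is usually addressed by importing the function explicitly. The fragment below is a minimal sketch, assuming that `TermDocumentMatrix` is the function exported by the tm package (the failing example further down operates on a tm corpus). How PRISMA 0.2-2 actually declares its dependencies is not shown in this check output, so both lines are illustrative only.

```r
# NAMESPACE (sketch, assumption): make the function visible to corpusToPrisma
importFrom(tm, TermDocumentMatrix)

# DESCRIPTION would then also need tm listed as a dependency, e.g.
#   Imports: tm
```

An alternative with the same effect is to call the function with its namespace prefix, `tm::TermDocumentMatrix(...)`, inside `corpusToPrisma`, again with tm listed under `Imports` in DESCRIPTION.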
Version: 0.2-2
Check: examples
Result: ERROR
Running examples in 'PRISMA-Ex.R' failed
The error most likely occurred in:
> ### Name: corpusToPrisma
> ### Title: Convert tm copus to PRISMA
> ### Aliases: corpusToPrisma
>
> ### ** Examples
>
> data(thesis)
> thesis
$content
$content[[1]]
$content
[1] " prisma method"
[2] ""
[3] ""
[4] ""
[5] ""
[6] " protocol inspect state machin"
[7] " analysi prisma"
[8] ""
[9] ""
[10] ""
[11] "given collect record traffic specif network servic"
[12] " goal prisma extract state machin associ"
[13] "templat rule describ inform flow messag"
[14] " messag first preprocess stage raw network"
[15] "traffic convert session contain messag method"
[16] "proceed follow step see figur"
[17] ""
[18] ""
[19] " find common structur first defin"
[20] " similar measur messag done"
[21] " messag special vector space "
[22] " reduc via statist test focus discrimin featur"
[23] " see"
[24] ""
[25] " proceed model session messag sequenc"
[26] " leverag embed previous step"
[27] " appli cluster"
[28] " group individu messag event see"
[29] ""
[30] " extract sequenc event can seen path"
[31] " protocol state machin infer approxim"
[32] " state machin use probabilist concept "
[33] " model transit link probabl see"
[34] ""
[35] ""
[36] " final automat generat "
[37] " messag associ state markov deriv"
[38] " describ inform flow "
[39] " differ state communic see"
[40] ""
[41] ""
[42] ""
[43] "throughout chapter use term atom"
[44] "exchang byte sequenc client server"
[45] " describ certain action client server side"
[46] " direct connect state machin model"
[47] "network servic messag structur consist"
[48] " sequenc token field describ messag"
[49] "flow field consecut templat instanti "
[50] "concret session"
[51] ""
[52] " network"
[53] ""
[54] ""
[55] " learn inner structur behavior specif network"
[56] "servic first collect suffici "
[57] "infer normal can done dedic sensor"
[58] "collect raw packet binari format instanc via"
[59] "tool apart payload packet contain"
[60] "sourc destin address actual reconstruct"
[61] "inform flow client server packet "
[62] " reassembl elimin artifact network transport"
[63] "layer task devis network record use"
[64] "matur librari reassembl tcp udp"
[65] "communic stream"
[66] ""
[67] " stream input session extractor generat"
[68] " reassembl packet payload specif session identifi"
[69] "accord sourc destin packet two packet"
[70] "occur small delay millisecond"
[71] "payload will merg specif session identifi "
[72] "sourc destin communic within"
[73] " millisecond correspond session will"
[74] "flag termin messag arriv "
[75] "specif sourcedestin combin will open new session"
[76] ""
[77] " network record session extractor preprocess raw network"
[78] "trace session contain messag follow will use"
[79] " preprocess subsequ step analysi"
[80] ""
[81] " messag"
[82] ""
[83] ""
[84] " preprocess messag x can model sequenc"
[85] "byte x b b 255"
[86] "infer common structur pool messag need similar"
[87] "measur capabl focus analysi discrimin"
[88] "featur account differ style like binari versus textual"
[89] "protocol introduc two differ embed can"
[90] "compress via statist test enabl focus analysi"
[91] ""
[92] " ngram"
[93] ""
[94] "one common approach domain natur languag process"
[95] " map byte sequenc finitedimen featur space whose"
[96] "dimens associ substr fix"
[97] "length n formal can describ substr w bn"
[98] " defin embed function b"
[99] " follow"
[100] ""
[101] " w"
[102] ""
[103] ""
[104] " simpli record whether specif ngram w occur "
[105] "given string instanc 0"
[106] ""
[107] " 0t"
[108] "n3 exampl can see correspond featur"
[109] "space finit high dimension howev space"
[110] "general spars popul allow effici"
[111] "represent"
[112] ""
[113] " token"
[114] ""
[115] "anoth wellknown concept domain natur languag"
[116] "process token byte sequenc via predefin"
[117] "separ charact s embed b"
[118] " map byte sequenc featur vector"
[119] "record occurr possibl word w accord "
[120] "separ w exampl"
[121] " consid separ s"
[122] "get follow embed ho let go 0"
[123] ""
[124] " 0t"
[125] " similar ngram embed dimens"
[126] " result featur space larg spars popul"
[127] "therefor effici storag model also avail"
[128] ""
[129] " reduct"
[130] ""
[131] " find structur network communic analysi "
[132] "focus featur discrimin messag "
[133] "pool volatil featur like random generat nonc cooki"
[134] "will occur lead unnecessari bloat vector"
[135] "space hold true constant token protocol sinc"
[136] " occurr messag will almost certain"
[137] ""
[138] "consequ use statist testdriven dimens reduct"
[139] " allow us split featur space follow f"
[140] "fconstant fvariabl fvolatil end"
[141] "appli binomi see appendix featur"
[142] "whether distribut frequenc approxim 1"
[143] "correspond constant featur 0 volatil featur"
[144] "respect applic multipl test correct"
[145] " keep featur constant"
[146] " volatil given statist signific level "
[147] "005 simplifi featur space group togeth"
[148] "featur exhibit correl near one"
[149] ""
[150] "given embed dimens reduct techniqu now"
[151] "abl defin datadriven featur space messag allow"
[152] "us introduc geometr concept like metric see"
[153] "appendix open whole field"
[154] "machin learn tool appli network communic "
[155] "follow will assum total number featur f"
[156] "w dimens reduct method keep just f"
[157] " w featur embed function adjust"
[158] "accord ie silent return just f reduc"
[159] "featur"
[160] ""
[161] " event infer"
[162] ""
[163] ""
[164] "messag occur specif event flow communic"
[165] "often exhibit similar structur featur thus extract event"
[166] "inform can exploit structur depend insid vector"
[167] "space can defin metric captur notion similar"
[168] " two messag instanc euclidean metric"
[169] " y w"
[170] "calcul distanc two point base occurr"
[171] " w word contain corpus use metric"
[172] "cluster algorithm can appli extract common structur "
[173] " pool therebi indirect recov under event"
[174] "inform"
[175] ""
[176] " infer structur network protocol trace suggest two"
[177] "possibl cluster techniqu one protocol assembl"
[178] " part one monolith communic token"
[179] "weight accord absolut posit "
[180] "messag obvious experiment free choos"
[181] "appropri cluster techniqu hand found"
[182] " two method work best protocol describ"
[183] "kind"
[184] ""
[185] " cluster via matrix factor"
[186] ""
[187] " map network payload vector space induc geometri"
[188] "reflect characterist captur reduc number featur"
[189] " f instanc payload share sever substr appear"
[190] "close wherea network payload differ content"
[191] "exhibit larger geometr distanc vectori represent"
[192] "network enabl us identifi semant compon"
[193] " see appendix"
[194] "particular appli concept matrix factor"
[195] "identifi base direct vector space given "
[196] "payload p pn first defin matrix"
[197] "contain vector p column"
[198] ""
[199] " f n"
[200] ""
[201] ""
[202] " determin compon seek represent "
[203] "retain inform describ term base"
[204] "direct can achiev term matrix factor"
[205] " two matric b f e c"
[206] " n e f"
[207] ""
[208] ""
[209] " bc"
[210] ""
[211] " b1"
[212] ""
[213] ""
[214] " c1 cn"
[215] ""
[216] ""
[217] ""
[218] " column b1 f b form new basi"
[219] " n payload dimens base direct"
[220] "bi associ featur w show"
[221] " later experi relat base direct "
[222] "featur can exploit construct templat matrix"
[223] "factor column c1 c form"
[224] "new coordin payload lowdimension"
[225] "space coordin can use visual "
[226] " lowdimension space"
[227] ""
[228] "anoth interest interpret matrix factor model"
[229] " domain document cluster accord "
[230] "follow document generat accord under"
[231] "latent topic zj j e topic generat"
[232] "specif word probabl pwizj specif document"
[233] "assign topic probabl pzjd probabl"
[234] " specif word occur document d can express"
[235] "follow"
[236] ""
[237] "pwid"
[238] ""
[239] " matrix factor b c can also interpret "
[240] "probabilist topic row b column c satisfi"
[241] " axiom probabl see appendix"
[242] ""
[243] " general matrix factor method differ constraint"
[244] "impos matric b c chapter studi two"
[245] "standard techniqu wide use field statist "
[246] "analysi princip compon analysi pca"
[247] "nonneg matrix factor nmf"
[248] ""
[249] " compon analysi pca"
[250] ""
[251] " pca seek base direct orthogon captur"
[252] "much varianc insid possibl formal ith"
[253] "direct bi consecut maxim varianc bi"
[254] " constraint base direct mutual"
[255] "orthonorm"
[256] ""
[257] "bi b"
[258] " b bj"
[259] ""
[260] ""
[261] " set matrix factor correspond"
[262] " singular valu decomposit l orthonorm basi"
[263] "vector b equal first l leftsingular vector "
[264] " coordin c correspond first rightsingular vector"
[265] " multipli singular valu"
[266] " pca entri b typic nonzero henc"
[267] "featur contribut basi vector bi"
[268] ""
[269] " matrix factor"
[270] ""
[271] ""
[272] "nonneg matrix factor nmf describ "
[273] "approxim whole embed matrix fn contain"
[274] "n point f reduc featur two strict"
[275] "posit matric b fe c"
[276] ""
[277] " bc bc"
[278] " c b c"
[279] " bij 0 0"
[280] ""
[281] " inner dimens e matrix product b c chosen"
[282] " e f lead even compact represent due"
[283] " posit constraint matrix b can interpret new"
[284] "basi messag matrix c contain"
[285] " coordin newli span space "
[286] " differ part coordin use ultim assign"
[287] " messag cluster find posit maxim"
[288] "weight shown nmf equival latent"
[289] "semant index special kind topic"
[290] " render nmf especi use context document"
[291] "cluster"
[292] ""
[293] " sever possibl solut solv equat"
[294] "see instanc lee99 hoyer04 heiler06 "
[295] "stick practic implement introduc "
[296] "base least squar approach altern"
[297] "solv follow constraint least squar problem given"
[298] "regular constant"
[299] ""
[300] ""
[301] " b c"
[302] " c b"
[303] ""
[304] ""
[305] " correspond solut"
[306] ""
[307] ""
[308] "c b b"
[309] "b c c"
[310] ""
[311] ""
[312] " regular constant can treat metaparamet "
[313] "procedur choos crossvali sinc "
[314] "number featur number sampl matrix can get"
[315] "quit larg instanc ftp introduc later contain"
[316] "rough 18 million sampl 90000 featur direct calcul"
[317] "equat often infeas"
[318] ""
[319] "therefor devis reduc equival problem take"
[320] "account dimens reduct step"
[321] " duplic matrix"
[322] " ie denot "
[323] " f n matrix without duplic"
[324] "column simplif equat note"
[325] ""
[326] ""
[327] "ci b b ai"
[328] ""
[329] "henc can replac "
[330] "equat obtain c duplic"
[331] " result ci accord retriev c "
[332] "simplif equat note"
[333] ""
[334] " c b c b 2"
[335] ""
[336] " w diagon matrix consist"
[337] " number duplic correspond column"
[338] " shown optim"
[339] "problem equat right side"
[340] "equat new object can solv"
[341] ""
[342] "b c w c c w"
[343] ""
[344] " two simplif allow us appli nmf even larg"
[345] "set reduct accuraci appendix"
[346] "describ replicateawar version nmf algorithm detail"
[347] "togeth estim heurist inner dimens e"
[348] " appli initi scheme"
[349] ""
[350] " cluster"
[351] ""
[352] " matrix factor techniqu good choic protocol"
[353] " messag construct part protocol show"
[354] "positiondepend featur sinc cluster step prisma"
[355] "total independ concret algorithm use long "
[356] "procedur assign cluster label messag experiment"
[357] " fix matrix factor method free choos"
[358] "appropri cluster tool choic take posit depend"
[359] "featur account propos weight distanc measur"
[360] ""
[361] "dwx y w 101pw x 101pw y"
[362] ""
[363] " pw x return posit token w string x"
[364] "distanc measur can use calcul distanc matrix d"
[365] " subsequ form input singl linkag hierarch"
[366] "cluster note can also restrict calcul "
[367] "distanc matrix reduc matrix "
[368] " save comput time also keep size d reason"
[369] "rang"
[370] ""
[371] " statemachin"
[372] ""
[373] ""
[374] "network communic driven under state machin"
[375] " certain event trigger certain respons switch"
[376] "proceed state sometim switch probabilist natur"
[377] " instanc servic temporarili unavail can model"
[378] " instanc login procedur 90 attempt"
[379] "success"
[380] ""
[381] "one possibl way state machin network servic "
[382] "probabilist way markov unobserv"
[383] "state correspond intern logic state servic"
[384] " messag sent network correspond emit"
[385] "symbol use baumwelch algorithm enough"
[386] " servic communic possibl estim"
[387] "under hidden markov describ observ howev"
[388] "baumwelch algorithm guarante found"
[389] " highest likelihood global sens"
[390] ""
[391] " markov"
[392] ""
[393] "instead direct tri infer under hidden markov"
[394] " start regular markov will later"
[395] "simplifi minim hidden variant whole learn process"
[396] "therefor determinist inher random like"
[397] "initi matric baumwelch algorithm see"
[398] "appendix detail approach"
[399] "circumv problem find potenti nonoptim "
[400] "determin come price well known fact hidden"
[401] "markov model power regular one"
[402] "summari trade potenti uncertainti decreas "
[403] "complex therefor regular hypothes space"
[404] ""
[405] "given session inform preprocess step label"
[406] "inform messag event cluster direct"
[407] "learn regular markov event sequenc estim initi"
[408] " transit probabl maximum likelihood"
[409] "estim howev simpl markov drop"
[410] "direct event ie event trigger client"
[411] " server limit histori one messag due markov"
[412] "assumpt ie generat next event depend just "
[413] "previous one especi last limit strict"
[414] "network communic sinc loos context "
[415] "messag generat"
[416] ""
[417] " state space"
[418] ""
[419] " circumv limit regular markov "
[420] "use convolut com version event"
[421] "sequenc session follow"
[422] ""
[423] " event will annot reflect whether "
[424] " generat client server side"
[425] " horizon k convolut annot pad"
[426] " event sequenc slide window size k record"
[427] " occur ktupl"
[428] ""
[429] ""
[430] " exampl assum observ event sequenc b c d"
[431] " messag generat altern client"
[432] "server horizon k 2 convolut event"
[433] "sequenc "
[434] " ac ac bs bs cc cc ds"
[435] " new convolut event space e will contain 2e"
[436] " 1k potenti event "
[437] " start state calcul transit"
[438] "probabl new convolut event space e"
[439] " maximum likelihood estim specifi regular markov"
[440] " annot event horizon k"
[441] ""
[442] " markov"
[443] ""
[444] " client server communic horizon least k2"
[445] "necessari keep communic context involv"
[446] "process even higher horizon might necessari lead "
[447] "exponenti growth possibl state will see evalu"
[448] " real network communic convolut state"
[449] "space often spars popul yet result network can"
[450] " larg make introspect human user difficult"
[451] ""
[452] " remedi propos follow minim algorithm boil"
[453] " size markov preserv overal"
[454] "capabl"
[455] ""
[456] " transform markov m determinist finit automaton"
[457] " dfa m"
[458] ""
[459] " keep transit probabl bigger zero"
[460] " associ state"
[461] " transit dfa m accept new event"
[462] " second state exampl transit connect state"
[463] " ac bs state bs cc consum event cc"
[464] ""
[465] " appli dfa minim algorithm introduc"
[466] " get equival dfa m "
[467] " minim number state accept languag"
[468] " side effect algorithm return assign"
[469] " e m origin state "
[470] " convolut event space e compress state"
[471] " dfa m"
[472] ""
[473] ""
[474] " result dfa m can use inspect "
[475] "under state can interpret special hidden"
[476] "markov instead observ convolut event"
[477] "e now observ state m accord "
[478] "assign e m found minim"
[479] "algorithm metast subsum equival state will"
[480] "therefor lead accept event sequenc "
[481] "origin will show evalu "
[482] "simplifi model drastic decreas size therefor"
[483] "good candid analysi state machin human"
[484] "administr"
[485] ""
[486] " templat rule"
[487] ""
[488] ""
[489] " session can seen sequenc event trigger"
[490] "specif state switch state machin learn general"
[491] "inform flow process general messag"
[492] "associ state templat consist fix"
[493] "variabl part often fill content previous"
[494] "messag method like matrix factor allow "
[495] "extract static templat without take dynam inform"
[496] "account session inform can use even extract fill"
[497] "rule exploit learn markov now readi"
[498] "give procedur reliabl extract templat rule "
[499] "network servic hand"
[500] ""
[501] " templat"
[502] ""
[503] " event cluster step focus variabl yet neither constant"
[504] " volatil featur identifi common pattern exchang"
[505] "messag focus make sens identif"
[506] "under event essenti featur back "
[507] "generat valid protocolconform messag"
[508] ""
[509] ""
[510] ""
[511] ""
[512] " state state b state c"
[513] ""
[514] " session 1 ftp 314 user anon 331 user anon ok"
[515] " session 2 ftp 312 user ren 331 user ren ok"
[516] ""
[517] " session n ftp 20 user liz 331 user"
[518] " liz ok"
[519] ""
[520] " templat ftp user 331 user"
[521] " ok"
[522] ""
[523] ""
[524] " templat generat simplifi ftp communic"
[525] ""
[526] ""
[527] ""
[528] " addit aspect extract generic messag templat"
[529] " under state machin analyz servic "
[530] "like exchang messag correl current state"
[531] " servic thus valid assumpt assign"
[532] "messag session accord state previous"
[533] "extract state machin shown artifici exampl"
[534] "figur look recur token "
[535] "state generic templat can construct contain fix passag"
[536] " variabl field accord distribut learn"
[537] "pool"
[538] ""
[539] " detail templat infer procedur structur"
[540] "follow"
[541] ""
[542] " token messag accord previous chosen embed"
[543] " assign messag session state infer"
[544] " markov"
[545] " state markov"
[546] ""
[547] " analyz token messag assign state "
[548] " number token"
[549] " group assign messag number token process"
[550] " group"
[551] " messag group contain token "
[552] " specif posit fix token record result"
[553] " templat otherwis variabl field save"
[554] ""
[555] ""
[556] " end procedur will templat state"
[557] " markov repres generic messag might"
[558] "occur note state might sever differ templat"
[559] "assign accord observ length distribut ie"
[560] "simplifi multipl align procedur extract"
[561] "generic templat focus align messag "
[562] "length"
[563] ""
[564] " rule"
[565] ""
[566] "find rule fill specif field templat"
[567] "accord previous seen messag now amount simpl yet"
[568] "power combin markov extract templat"
[569] "session inform possibl combin templat"
[570] "occurr horizon length k ie t1 t2 tk"
[571] ""
[572] " find messag assign k templat"
[573] " occur session exact order"
[574] " field f templat tk"
[575] ""
[576] " look rule fill f field content f"
[577] " f templat t1 t2 tk f session"
[578] " rule match just record token occur"
[579] " train pool rule"
[580] ""
[581] ""
[582] ""
[583] " check rule describ tabl"
[584] " procedur ensur inform found preced"
[585] "messag can systemat reproduc content follow"
[586] "messag f case will get copi instanc"
[587] " exampl shown figur can observ"
[588] " case field templat associ state c can"
[589] "fill field previous messag rule act"
[590] " fallback solut match found "
[591] "pumpprim rule first messag session"
[592] ""
[593] ""
[594] ""
[595] ""
[596] ""
[597] " rule descript"
[598] ""
[599] " copi exact copi content one field anoth"
[600] " seq copi numer field increment d"
[601] " add copi content field add d front back"
[602] " part copi front back part field split separ"
[603] " s"
[604] " fill field random pick previous seen d"
[605] ""
[606] ""
[607] ""
[608] " check build paramet"
[609] " like d s automat infer "
[610] " train"
[611] ""
[612] ""
[613] ""
[614] " network communic"
[615] ""
[616] " infer prisma now contain three part actual markov"
[617] " infer templat rule set associ "
[618] "templat use part prisma simul "
[619] "communic devis live essenc network sensor len"
[620] "depict algorithm addit infer"
[621] " part modul initi role client server"
[622] " simul note prisma role"
[623] "agnost therefor can use simul side "
[624] "communic allow us even let talk "
[625] "run two instanc len differ role pass"
[626] "messag generat one instanc vice versa"
[627] ""
[628] "appendix give complet exampl prisma"
[629] "base simpl toy problem given network trace robot"
[630] "communic environ behavior learn via prisma"
[631] ""
[632] ""
[633] " live essenc network sensor len"
[634] ""
[635] ""
[636] " templat rule role"
[637] " activ"
[638] " wait messag timeout t"
[639] " m receiv"
[640] " find match templat t accord "
[641] " current state"
[642] " split messag m accord t field"
[643] " switch state state associ t"
[644] ""
[645] " random choos state s accord "
[646] " transit probabl markovmodel"
[647] " accord role"
[648] " find rule accord previous"
[649] " k horizon templat"
[650] " appli rule fill new templat form"
[651] " messag"
[652] " send messag"
[653] " current state s"
[654] ""
[655] ""
[656] ""
[657] ""
[658] ""
[659] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file004.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[2]]
$content
[1] " static content"
[2] ""
[3] ""
[4] " present prisma method turn empir evalu"
[5] " capabl differ secur applic evalu"
[6] " first part process chain name embed"
[7] "matrix factor step restrict evalu static"
[8] "content apart give valuabl insight inner work"
[9] " first step method show extract compon"
[10] " matrix factor can use construct stateless"
[11] "templat result base direct inclus"
[12] "dynam inform form session inform enabl us"
[13] " refin"
[14] "templat accord occurr workflow "
[15] "session evalu static content first studi"
[16] "framework toy allow us establish"
[17] "understand static templat infer communic"
[18] "compar perform differ matrix factor method"
[19] " proceed realworld"
[20] "applic network trace contain malici communic"
[21] " analyz static templat exploit vulner"
[22] "attack sourc final"
[23] "appli method context network anomali detect"
[24] "limit process network reduc space give us"
[25] " huge perform increas attain accuraci full"
[26] " space"
[27] ""
[28] ""
[29] " factor method"
[30] ""
[31] ""
[32] " first experi consid artifici http"
[33] "communic total control protocol syntax"
[34] "semant simul web applic support three differ"
[35] "type request whose network payload depict"
[36] "figur first payload reflect request"
[37] "static content second payload resembl search queri "
[38] "last payload correspond administr request "
[39] "action paramet one follow"
[40] " request equip random"
[41] "part name static web page search string "
[42] "administr paramet simul usual fluctuat web"
[43] "traffic"
[44] ""
[45] ""
[46] ""
[47] ""
[48] " request static content"
[49] " labelpositionbottomlin"
[50] " get static3lpan6c2html http11"
[51] " host wwwfoobarcom"
[52] " accept"
[53] ""
[54] ""
[55] " search queri"
[56] " labelpositionbottomlin"
[57] " get cgisearchphpseh0ykj3r3wd2i http11"
[58] " host wwwfoobarcom"
[59] " accept"
[60] ""
[61] ""
[62] " administr request"
[63] " labelpositionbottomlin"
[64] " get cgiadminphpactionrenamepardbjh7hs0r5 http11"
[65] " host wwwfoobarcom"
[66] " accept"
[67] ""
[68] ""
[69] " payload artifici "
[70] ""
[71] ""
[72] ""
[73] "use web applic generat 1000 network"
[74] "payload uniform distribut three request type"
[75] " appli first part prisma method detail"
[76] " "
[77] "use token basic string delimit select accord"
[78] " specif http"
[79] "d"
[80] "base extract featur appli matrix factor"
[81] "algorithm name princip compon analysi pca nonneg"
[82] "matrix factor nmf determin base direct "
[83] "vector space payload final construct static templat "
[84] "base direct"
[85] ""
[86] " present extract featur tabl"
[87] "featur consist 8 symbol correspond relev string"
[88] " under web applic constant volatil token "
[89] "filter token cooccur communic "
[90] "coupl indic oper note featur"
[91] " contain token relat http syntax therebi differ"
[92] " previous approach reconstruct protocol grammar"
[93] "specif"
[94] ""
[95] ""
[96] ""
[97] ""
[98] ""
[99] " symbol"
[100] ""
[101] " 1 5"
[102] " 2 6"
[103] " 3 s 7"
[104] " 4 adminphp par 8"
[105] ""
[106] ""
[107] " featur artifici string cooccur"
[108] " network payload coupl use oper"
[109] ""
[110] ""
[111] ""
[112] "result applic matrix factor algorithm "
[113] "artifici visual figur"
[114] " algorithm pca nmf base direct matrix b shown"
[115] " xaxi detail differ direct yaxi"
[116] "contribut individu featur symbol"
[117] ""
[118] ""
[119] ""
[120] " 70 10 110cliptruewidth49"
[121] " 70 10 110cliptruewidth49"
[122] " base pca nmf artifici"
[123] " color signifi intens entri rang"
[124] " 1 red 1 blue"
[125] ""
[126] ""
[127] ""
[128] " techniqu perform matrix factor payload"
[129] " matric differ signific pca yield posit"
[130] "negat contribut matrix b indic differ"
[131] "color although certain structur relat featur symbol"
[132] "may deduc matrix clear separ differ"
[133] "element possibl contrast nmf matrix show crisp"
[134] "represent base direct static search request"
[135] "clear reflect individu base direct remain base"
[136] "direct correspond administr request differ"
[137] "combin action type featur symbol "
[138] "correct identifi"
[139] ""
[140] "due superior perform restrict analysi base"
[141] "direct determin use nmf algorithm "
[142] "follow static templat result sole posit entri "
[143] "nmf matrix figur present"
[144] "tabl templat accur captur"
[145] "semant implement simpl web applic 7"
[146] "templat construct cover static access web content"
[147] "search queri differ administr task note two base"
[148] "direct figur ident result "
[149] "total 7 static templat templat even exhibit hierarch"
[150] "structur templat 3 resembl basic administr request"
[151] " follow templat special case particular"
[152] "administr action"
[153] ""
[154] ""
[155] ""
[156] ""
[157] ""
[158] " static templat"
[159] ""
[160] " 1"
[161] " 2 searchphp s"
[162] " 3 action adminphp par"
[163] " 4 action adminphp par move"
[164] " 5 action adminphp par renam"
[165] " 6 action adminphp par delet"
[166] " 7 action adminphp par show"
[167] ""
[168] ""
[169] " templat extract artifici "
[170] " templat construct use token basic string nmf"
[171] " matrix factor"
[172] ""
[173] ""
[174] ""
[175] " honeypot"
[176] ""
[177] ""
[178] "network honeypot proven use instrument"
[179] "identif analysi novel threat often howev"
[180] "amount collect honeypot huge manual"
[181] "inspect network payload becom tedious futil"
[182] "propos prisma method allow analyz larg set"
[183] "unknown traffic extract semant interest network featur"
[184] "automat"
[185] ""
[186] " illustr util framework network collect"
[187] "use webbas honeypot web"
[188] " applic honeypot httpglastopforg honeypot captur"
[189] "attack web applic remot file inclus rfi"
[190] " sql inject attack expos typic pattern vulner"
[191] "applic search engin honeypot deploy "
[192] "period 2 month collect averag 3400 request per day"
[193] " experi random pick 1000 request "
[194] "collect appli framework use token under"
[195] "featur particular extract 40 static templat use"
[196] "base direct identifi nmf embed http payload"
[197] "templat shown tabl"
[198] " note 12"
[199] " templat omit contain redund unspecif"
[200] " inform"
[201] ""
[202] ""
[203] ""
[204] ""
[205] ""
[206] " static templat descript"
[207] ""
[208] " shellz csptxt semant rfi malwar"
[209] " psybnc csptxt"
[210] " botz bottxt"
[211] " scannerz bottxt"
[212] ""
[213] " option http"
[214] " vulner virtuemart"
[215] " option itemid"
[216] " show"
[217] " exportphp phptxt"
[218] " skin standard"
[219] " vulner technot"
[220] " smiley http"
[221] " vulner gnuboard"
[222] " skin http"
[223] " file 1"
[224] " file http"
[225] " admin zefatxt"
[226] " http fx29id1txt"
[227] " appserv mainphp"
[228] " vulner appserv"
[229] " document media"
[230] " vulner php"
[231] " dir 1"
[232] " misc rfi vulner"
[233] " error phptxt bottxt"
[234] " indexphp rawtxt"
[235] " includ http"
[236] ""
[237] " com stealth"
[238] " sourc attack"
[239] " com compon"
[240] " wwwhfsborg site 10225 img"
[241] " eeng zefatxt"
[242] " zerozoncokr photo count"
[243] " musicadelibrerianet footer"
[244] " forum smiley"
[245] ""
[246] " indexphp file miscellan templat"
[247] " http path"
[248] " path rawtxt"
[249] " http zefatxt"
[250] " option itemid"
[251] " lib sourcedir"
[252] " zero http"
[253] " http id"
[254] " http bjorktxt"
[255] " media path"
[256] " bottxt skin"
[257] " http id"
[258] ""
[259] ""
[260] " templat honeypot templat "
[261] " construct use token basic string nmf matrix"
[262] " factor"
[263] ""
[264] ""
[265] ""
[266] " extract static templat can classifi three categori"
[267] " malwar "
[268] " sourc exampl first templat reflect differ option"
[269] "suppli webbas malwar malici function "
[270] "prepar remot shell set irc bouncer"
[271] " scan vulner host"
[272] "clear manifest string templat follow"
[273] "templat character vulner web applic includ"
[274] "correspond file paramet name follow templat"
[275] "correspond domain host name use sourc remot file"
[276] "inclus often origin host also part "
[277] "complet url discov final method also extract"
[278] " miscellan direct map"
[279] "vulner"
[280] ""
[281] "note although templat generat raw"
[282] "http traffic syntact protocolspecif string "
[283] "extract demonstr abil prisma focus semant"
[284] "communic"
[285] ""
[286] " embed"
[287] ""
[288] ""
[289] " show effici embed evalu capabl"
[290] " prisma applic network intrus"
[291] "detect techniqu anomali detect frequent appli"
[292] "extens signaturebas intrus detect system "
[293] "popular system sinc"
[294] "enabl identif unknown novel network threat"
[295] ""
[296] " evalu intrus detect perform consid"
[297] "three larger set network payload first "
[298] "first08 contain http request monitor web server "
[299] "research institut period 60 day second "
[300] "blog09 contain request sever web blog run popular"
[301] "platform span 33 day traffic third "
[302] "ftp03 compris client side request ftp session record"
[303] " 10 day lawrenc berkeley nation laboratori"
[304] " addit benign inject network"
[305] "attack traffic attack execut virtual"
[306] "environ use common tool penetr test "
[307] " care adapt match characterist "
[308] " set see detail"
[309] ""
[310] " experi appli detect method similar work"
[311] "wang et al centroid normal"
[312] "network payload construct use ngram"
[313] " use identifi unusu"
[314] "network content addit consid second "
[315] "ngram refin use reduc vector space formal"
[316] "calcul matrix factor bc construct "
[317] " follow b"
[318] "ci calcul centroid lowerdimension space"
[319] "obtain first 20 base direct nmf two model"
[320] "train 1000 random drawn payload anomali"
[321] "detect perform 200000 random chosen http request"
[322] "20000 ftp session respect"
[323] ""
[324] "result shown receiv oper characterist roc curv"
[325] " figur perform full reduc"
[326] "centroid ident three set demonstr"
[327] " base direct identifi nmf captur semant"
[328] "inform under protocol suffici detect"
[329] "anomali attack figur detail runtim"
[330] "perform attain differ model reduc analysi"
[331] "use lowerdimension space provid signific perform"
[332] "gain regular anomali detect speedup factor 815"
[333] "can observ clear indic util prisma"
[334] "preprocess step anomali detect"
[335] ""
[336] ""
[337] " curv"
[338] " perform"
[339] " curv runtim network anomali detect"
[340] " templat construct use 4gram basic string"
[341] " nmf matrix factor"
[342] ""
[343] ""
[344] ""
[345] ""
[346] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file005.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[3]]
$content
[1] " dynam content"
[2] ""
[3] ""
[4] " show prisma method benefit "
[5] "incorpor dynam inform demonstr prisma capabl"
[6] "learn also simul network communic real network"
[7] "trace end use sever network trace record via"
[8] " plug one part process pipelin"
[9] " check qualiti accord remain"
[10] " syntact semant featur simul"
[11] "session ensur evalu prisma reallif"
[12] "condit"
[13] ""
[14] " use raw test preprocess tool"
[15] " effici featur dimens reduct"
[16] " comparison heldout session assur"
[17] " model mean learn model capabl replay"
[18] " real session observ pool"
[19] " check syntact semant featur simul"
[20] " session guarante model communic"
[21] " perspect"
[22] ""
[23] ""
[24] " introduc set discuss"
[25] " result featur space dimens reduct look"
[26] " general properti learn pris model"
[27] " complet correct "
[28] "model conclud evalu "
[29] " case studi malwar analysi show pris can use"
[30] "applic domain beyond honeypot"
[31] ""
[32] " set dimens reduct"
[33] ""
[34] ""
[35] " evalu prisma framework chosen three"
[36] "repres set two textbas one"
[37] "pure binari"
[38] ""
[39] " sip record real medium size telephoni"
[40] " infrastructur contain rough 7 day communic 20"
[41] " particip differ session initi protocol sip"
[42] " client"
[43] " dns domain name system dns request home network"
[44] " 7 differ client collect one day heavi use"
[45] " ftp file transfer protocol ftp "
[46] " lawrenc berkeley nation laboratori contain"
[47] " client server request 10 day communic"
[48] ""
[49] ""
[50] "natur set vari size sip "
[51] "mediums pool rough 35000 messag dns "
[52] "contain just 6000 messag ftp compris near 18"
[53] "million messag render biggest evalu"
[54] " accommod differ properti set appli"
[55] "differ embed sinc sip ftp consist humanread"
[56] "text can token usual white space charact due"
[57] " binari layout dns token approach"
[58] " feasibl therefor chosen 2gram embed"
[59] "dns set use 90 learn prisma"
[60] " keep remain evalu carri "
[61] ""
[62] ""
[63] " latex tabl generat r 2150 xtabl 156 packag"
[64] " thu jun 14 143150 2012"
[65] ""
[66] ""
[67] ""
[68] ""
[69] " size dimens kept"
[70] " uniqu"
[71] ""
[72] " sip 34958 72937 039 258"
[73] " dns 5539 6625 1315 3564"
[74] " ftp 1760824 87140 217 024"
[75] ""
[76] ""
[77] ""
[78] " set give total number"
[79] " messag number featur"
[80] " dimens reduct step kept"
[81] " uniqu give percentag featur messag"
[82] " kept dimens reduct step"
[83] ""
[84] ""
[85] ""
[86] " result featur dimension reduct uniqu messag"
[87] " shown tabl first thing note "
[88] " dimens reduct step show excel behavior"
[89] "relat number kept featur rang 04 sip 132"
[90] " dns 22 ftp show extrem focus"
[91] " eman dimension reduct direct consequ"
[92] " relat number uniqu messag "
[93] "rang 26 sip 356 dns 02"
[94] "ftp strike differ dns sipftp term"
[95] "reduct can clear explain differ conceptu layout"
[96] " languag high compress binari format dns"
[97] "protocol leav less room optim featur space"
[98] "therefor also number uniqu messag dimens"
[99] "reduct higher compar textbas protocol"
[100] ""
[101] "overal see dimens reduct high effect"
[102] "even binari protocol focus vari part"
[103] " messag uniqu messag reduc featur"
[104] "space valuabl comput time can save render prisma"
[105] "approach capabl model even big collect"
[106] ""
[107] " learn model"
[108] ""
[109] ""
[110] "follow embed dimension reduct step appli"
[111] "event cluster step describ "
[112] " sip dns appli nmf cluster algorithm"
[113] "sinc quick inspect show "
[114] "partwholerelationship under nmf algorithm hold "
[115] "two set relat short ftp messag follow less"
[116] "fix setup render positiondepend cluster approach"
[117] "better suit kind "
[118] ""
[119] ""
[120] ""
[121] ""
[122] ""
[123] " node coverag min dfa"
[124] " coverag"
[125] ""
[126] " sip 148 145 100 98"
[127] " dns 381 08 153 03"
[128] " ftp 1305 08 653 04"
[129] ""
[130] ""
[131] ""
[132] " node prisma model "
[133] " unoptim markov minim dfa relat"
[134] " number potenti number node possibl"
[135] ""
[136] ""
[137] ""
[138] "tabl summar number node extract"
[139] "markov relat number "
[140] "potenti number node attain describ"
[141] " see total number node"
[142] " sip smallest yet relat coverag"
[143] "highest dns ftp absolut number node higher"
[144] " relat coverag potenti node space spars"
[145] "indic inher depend relat coverag"
[146] " estim number cluster applic dfa minim"
[147] "algorithm markov signific reduc number"
[148] "node model convert result network"
[149] "dimens manag human user"
[150] ""
[151] ""
[152] ""
[153] ""
[154] ""
[155] " copi seq"
[156] " add part "
[157] " total"
[158] ""
[159] " sip 1916 77 135 52 1793 3972"
[160] " dns 3142 4 0 0 3527 6673"
[161] " ftp 532 18 253 35 4671 5509"
[162] ""
[163] ""
[164] ""
[165] " differ rule prisma model extract"
[166] " differ set"
[167] ""
[168] ""
[169] ""
[170] " correspond number rule shown"
[171] "tabl note ngram embed "
[172] " rule deactiv sinc alreadi handl "
[173] " rule see rule repres sip"
[174] " exhibit higher number involv rule compar"
[175] " set reflect high redund structur "
[176] "protocol dns ftp inher variabl part server"
[177] "name dns file name ftp result higher"
[178] "number rule compar sip"
[179] ""
[180] " lal evaluationlensagainstlensdata20031009lbnltrain"
[181] " savetransisitontru"
[182] " s lalplay"
[183] " zipslensgen sreplayedmsg"
[184] " 11691"
[185] " 1691495"
[186] " 220 domain ftp server version wu2621 mon dec 30 165835 pst 2001 readi"
[187] " 691495326 user anonym"
[188] " 495326385"
[189] " 331 guest login ok send complet email address password"
[190] " 326385197 pass password"
[191] " 38519720 230 guest login ok access restrict appli"
[192] " 19720650 type"
[193] " 20650423 200 type "
[194] " 6504231660 pasv"
[195] " 4231660179 227 enter passiv mode 1312431109240"
[196] " 16601791627 retr groffperl11814i386rpm"
[197] " 17916271525"
[198] " 150 open binari mode connect groffperl11814i386rpm 56 byte"
[199] " 16271525765"
[200] ""
[201] ""
[202] ""
[203] ""
[204] " ftp server 2001 readi"
[205] ""
[206] " guest login ok send complet email address password"
[207] " password"
[208] " guest login access restrict appli"
[209] ""
[210] ""
[211] ""
[212] " enter mode"
[213] ""
[214] " open mode connect byte"
[215] ""
[216] " ftp session generat execut two prisma model"
[217] " one client one server field"
[218] " mark box exact copi rule fill gray"
[219] ""
[220] ""
[221] ""
[222] "figur give visual impress learn"
[223] " ftp generat session simul"
[224] "side communic prisma learn ftp"
[225] " one execut act client "
[226] "one act server see result log session"
[227] " generat valid ftp start initi login"
[228] "procedur client set communic"
[229] "binari enter passiv mode get file "
[230] "server note name file client request"
[231] "copi correspond repli server show"
[232] "power infer rule obvious byte size 56 "
[233] "proper size request file sinc chosen random"
[234] " rule messag valid ftp repli"
[235] "show abil prisma even generat new messag seen"
[236] " train pool"
[237] ""
[238] " correct"
[239] ""
[240] ""
[241] " previous figur exampl show prisma"
[242] "method produc relat condens model embed"
[243] "space state machin question regard complet"
[244] "correct model treat "
[245] ""
[246] ""
[247] ""
[248] " judg model take 10 "
[249] "heldout simul either client server side"
[250] "evalu whether learn contain path "
[251] "generat session resembl sinc"
[252] "transit probabilist ensur "
[253] "path choos simul synchron actual"
[254] "content session instanc session might contain"
[255] "specif branch state machin occur just 5 "
[256] "time like server overload error repli like allevi"
[257] "probabilist effect repeat simul 100 time"
[258] "introduc determin feed first two messag"
[259] " session state first two"
[260] "messag exchang align"
[261] ""
[262] " "
[263] " shortcut use keep number repetit low "
[264] " achiev effect just increment number"
[265] " repetit unfortun render simul "
[266] " ftp infeas due vast amount "
[267] ""
[268] " result simul report"
[269] "figur use normal dameraulevenshtein"
[270] "distanc similar 1 mean equal count"
[271] "number insert delet substitut necessari"
[272] "transform one string anoth posit session"
[273] "take maximum attain similar repetit take"
[274] "account probabilist effect describ"
[275] ""
[276] " normal done follow denot d"
[277] " number insert delet substitut l"
[278] " total length string similar "
[279] " sl return 1 special case l equal 0"
[280] ""
[281] ""
[282] ""
[283] " "
[284] ""
[285] " "
[286] ""
[287] " "
[288] ""
[289] " maxim similar messag posit"
[290] " session replay record normal edit distanc give"
[291] " 10 equal messag size black bar"
[292] " correspond frequenc equal messag size"
[293] " dark gray bar similar rang 09 10"
[294] " light gray bar similar rang 075 09"
[295] ""
[296] ""
[297] ""
[298] " sip observ number equal messag"
[299] "rang 80 60 similar score almost never"
[300] " 09 show learn model can correct remodel"
[301] "holdout session dns behavior similar show"
[302] "varianc due relat low number session 6"
[303] "messag ftp show even better perform "
[304] "prisma near messag show equal"
[305] " posit six frequenc exact resembl stay alway"
[306] " 70 show even complex protocol can accur"
[307] "simul four step"
[308] ""
[309] ""
[310] ""
[311] "next focus syntact semant"
[312] " generat messag syntact correct util"
[313] "protocol filter network protocol analyz"
[314] " ftp protocol check"
[315] "valid command manual accord rfcs"
[316] " check semant correct"
[317] " appli follow rule"
[318] ""
[319] " sip messag session check whether"
[320] " preserv"
[321] " sinc tripl valu identifi sipsess"
[322] " dns messag session repli check whether"
[323] " queri session queri id"
[324] " ftp ftp request check whether request"
[325] " return repli code valid one accord rfcs"
[326] ""
[327] " session count number syntact semant"
[328] "correct messag report relat frequenc correct messag"
[329] " complet session addit session generat "
[330] "complet evalu denot also"
[331] "simul 100000 session side generat "
[332] "denot "
[333] ""
[334] ""
[335] ""
[336] ""
[337] ""
[338] " syntax"
[339] " semant"
[340] ""
[341] " unidir bidir unidir"
[342] " bidir"
[343] ""
[344] " sip 1000 1000 0988 0945"
[345] " dns 1000 1000 1000 0994"
[346] " ftp 0999 0821 0934 0576"
[347] ""
[348] ""
[349] ""
[350] " session 100 syntact"
[351] " semant correct messag"
[352] " differ simul paradigm uni bidirect"
[353] ""
[354] ""
[355] ""
[356] " result shown tabl syntact"
[357] "correct session almost alway perfect "
[358] "bidirect simul ftp show relat"
[359] "declin just 82 percent session total"
[360] "correct regard semant dns show also near perfect"
[361] "behavior perform sip 98 94"
[362] " session total correct uni bidirect"
[363] "simul respect also good rang "
[364] "semant ftp unidirect case show good behavior"
[365] " perform declin bidirect simul just 57"
[366] " session total correct sinc ftp session tend "
[367] "long investig correct detail"
[368] "tabl split frequenc bin "
[369] "observ bulk session 80 correct"
[370] "messag combin higher length ftp"
[371] "session show even difficult potenti vast"
[372] "communic pattern prisma abl captur "
[373] "syntax semant communic"
[374] ""
[375] ""
[376] ""
[377] ""
[378] ""
[379] " syntax"
[380] " semant"
[381] ""
[382] " msgs correct unidir bidir"
[383] " unidir bidir"
[384] ""
[385] " 100 0999 0821 0934 0576"
[386] " 90 1000 0953 0988 0878"
[387] " 80 1000 0996 1000 0982"
[388] ""
[389] ""
[390] ""
[391] " cumul syntact semant correct session"
[392] " ftp"
[393] ""
[394] ""
[395] ""
[396] " summari evalu show infer pris model"
[397] " compact show high degre complet well"
[398] "syntact semant correct render model"
[399] "readi deploy reallif network infrastructur act"
[400] " honeypot specif design occur traffic "
[401] "network contact honeypot can held high number"
[402] " step gather indepth inform behavior intent"
[403] " potenti intrud inform use"
[404] "estim threat potenti infrastructur given time"
[405] "point also learn attack mischief"
[406] "conduct"
[407] ""
[408] " studi koobfac"
[409] ""
[410] ""
[411] " appli prisma network traffic collect"
[412] "malici softwar pick one specif class"
[413] "malwar instanc use token embed partbas"
[414] "cluster total 147 session 6674 messag"
[415] "detail result depict"
[416] "figur"
[417] ""
[418] ""
[419] ""
[420] ""
[421] " state koobfac traffic upper part"
[422] " correspond scan phase malwar"
[423] " middl part handshak procedur infect machin"
[424] " malwar get new list malwar final"
[425] " download lower part state machin"
[426] ""
[427] ""
[428] ""
[429] " upper part see scan loop malwar"
[430] "tri find infect server long server "
[431] "answer specif format scan continu malwar"
[432] " receiv correct repli state fs handshak procedur"
[433] " malwar server take place follow "
[434] "download cycl state ic malwar start download"
[435] "first file list follow"
[436] "state file download can nice seen"
[437] " rule associ templat state kc"
[438] "contain sever instanc follow path"
[439] ""
[440] "sysgetexetg14ex sysgetexems26ex"
[441] "sysgetexehi15ex sysgetexebe18ex"
[442] "sysgetexetw07ex sysgetexev2captchaex"
[443] "sysgetexev2googlecheckex"
[444] ""
[445] " inspect extract statemachin associ templat"
[446] " rule malwar analyst can gain insight inner work"
[447] " malwar instanc collect network trace alon"
[448] "render prisma valuabl tool beyond realm honeypot"
[449] "applic"
[450] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file006.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[4]]
$content
[1] " relat work"
[2] ""
[3] ""
[4] " first discuss valuabl extens "
[5] "practic deploy prisma model basic algorithm"
[6] "describ produc accur model"
[7] "reflect properti under pool administr"
[8] "might want tune model better honeypot perform"
[9] " compli privaci issu extens discuss"
[10] " thorough discuss relat"
[11] "work prisma framework given "
[12] ""
[13] " extens"
[14] ""
[15] ""
[16] "deploy prisma model honeypot add practic"
[17] "constraint hard algorithm fashion"
[18] "instanc contain inform violat"
[19] " privaci user system sinc clever intrud"
[20] "tri infer kind via communic "
[21] "honeypot use learn even contain"
[22] "password secur relat inform privaci issu even"
[23] "becom secur risk therefor devis extens"
[24] " prisma method allow administr explor learn"
[25] " adjust need part analysi framework"
[26] "bridg gap statist real world"
[27] "probabilist view transform back "
[28] "easili comprehend also function view outsid world"
[29] ""
[30] " model field"
[31] ""
[32] " rule generat process describ"
[33] " introduc rule "
[34] "fallback solut case find rule type"
[35] "associ specif field rule trigger"
[36] "select random valu seen train"
[37] "phase fill specif field certain valid"
[38] " also sound approach afflict potenti privaci"
[39] "problem correspond field contain privaci even secur"
[40] "relat content like usernam password one definit"
[41] "want leak kind inform therefor intellig"
[42] " pragmat solut need field preserv privaci"
[43] " secur issu"
[44] ""
[45] "one possibl solut problem infer perfield"
[46] "languag instanc token observ content"
[47] "contain rule ngram observ"
[48] "string sequenc ngram generat can"
[49] " leverag power markov chain "
[50] "observ pool probabilist way now fill"
[51] "field content just generat string accord"
[52] " learn markov string will resembl observ"
[53] "without direct return actual observ valu pool"
[54] " suffici divers kind will privaci"
[55] " secur preserv"
[56] ""
[57] ""
[58] ""
[59] "anoth issu deploy prisma honeypot real"
[60] "world craft correct complet "
[61] "servic also interest one potenti intrud might"
[62] " traffic use learn contain flaw"
[63] " allow normal valid input pass "
[64] "instanc learn horizon 2 just observ"
[65] "messag sequenc kind abe cde "
[66] " handl messag input ade cbe"
[67] "sinc observ behavior domain expert know"
[68] " sequenc messag also valid obvious solut"
[69] " problem just add miss edg markov"
[70] ""
[71] ""
[72] "unfortun markov now capabl handl"
[73] " unseen situat rule "
[74] " handl case obvious yet extrem"
[75] "power resort concept subrul instead just learn"
[76] " rule horizon k also learn rule horizon"
[77] "k1 k2 0 add len modul see"
[78] "algorithm automat pick rule "
[79] "highest possibl horizon carri simul instanc"
[80] " case learn rule transit de"
[81] " allow us go simul note"
[82] "transit horizon k0 consist sole"
[83] " rule allow generat messag even"
[84] "total unobserv transit"
[85] ""
[86] " explor"
[87] ""
[88] ""
[89] ""
[90] ""
[91] " explor allow play"
[92] " observ behavior accord under"
[93] " markov"
[94] ""
[95] ""
[96] ""
[97] " extract inform like content specif field "
[98] " react certain input administr need power"
[99] " explor tool simul communic show"
[100] " under graph node graph select"
[101] " associ templat rule inspect even edit"
[102] ""
[103] " extrem use reallif deploy program tool"
[104] " sophist useabl gui timeconsum"
[105] "task therefor decid develop proofofconcept"
[106] "explor termin shown figur user"
[107] "can play via command line interfac observ"
[108] "control flow prerend graph markov albeit"
[109] " simpl solut approach alreadi show good usabl"
[110] "possibl thus full fledg explor gui"
[111] "definit valuabl addit prisma framework"
[112] ""
[113] " templat match"
[114] ""
[115] "simul network traffic via len modul involv find"
[116] "right templat see line algorithm"
[117] " given input can solv just token"
[118] "given input accord learn find onetoon"
[119] "correspond base sequenc token possibl"
[120] "templat pool approach might strict realworld"
[121] "deploy henc intellig templat match algorithm"
[122] " valuabl extens len modul"
[123] ""
[124] "one way approach problem exploit edit distanc"
[125] " given token messag possibl templat find"
[126] " onetoon match can order possibl templat accord"
[127] " editdist input pick templat "
[128] "minim one exploit inform found distanc matrix"
[129] " dameraulevenshtein distanc two string match"
[130] "omit token found result optim align "
[131] "input templat"
[132] ""
[133] " approach"
[134] ""
[135] ""
[136] "infer messag format under protocol"
[137] "state machin relev problem today fast chang"
[138] "network world also tackl differ direct"
[139] " pure revers engin perspect open sourc"
[140] "communiti tri fulli understand inner work"
[141] "proprietari protocol order develop open implement"
[142] "sever network servic eg smb icq skype work"
[143] " done manual fashion special relev network"
[144] "protocol analysi secur field led mani research"
[145] "effort automat learn protocol state machin "
[146] "format messag involv valid communic session"
[147] ""
[148] " work constitut first attempt extract"
[149] " field protocol messag draw upon advanc"
[150] "comput techniqu approach propos cluster"
[151] "complet messag construct phylogenet tree order"
[152] " guid process global sequenc align "
[153] "needlemanwunsch algorithm roleplay"
[154] "author build idea tackl problem automat"
[155] "replay valid messag protocol although present"
[156] "limit approach requir side communic"
[157] "follow script use configur system"
[158] "alreadi consid problem simul state depend"
[159] "communic within scope replay"
[160] " system present propos enhanc solut beyond heurist"
[161] "introduc concept theorem prove weakest precondit"
[162] "verif mean handl protocol depend"
[163] ""
[164] " similar approach specif secur applic also"
[165] "focus replay valid messag introduc realm"
[166] "honeypot scriptgen leita2006 low"
[167] "interact honeypot learn simul communic pattern"
[168] "vulner object scriptgen infer"
[169] "accur specif protocol obtain maximum"
[170] "inform exploit attempt servic although close"
[171] "relat approach scriptgen design monitor"
[172] "lowlevel attack implement wherea prisma enabl"
[173] "collect track semant attack top "
[174] "implement similar strain research"
[175] "studi use token cluster individu messag"
[176] " find field messag structur howev work infer"
[177] " state machin protocol thus can use"
[178] "simul network communic"
[179] ""
[180] "differ approach base dynam taint analysi "
[181] "propos infer protocol specif"
[182] " lin2008 wondracek2008cui2008 order overcom lack"
[183] "semant cluster techniqu reli dynam binari"
[184] "analysi network servic handl protocol"
[185] "messag eas find keyword delimit unfortun"
[186] " work defer task learn protocol state"
[187] "machin extens work practic focus secur"
[188] " carri devis dispatch"
[189] "system capabl infiltr botnet whose oper may"
[190] "base custom proprietari protocol abl "
[191] " work messag side communic also"
[192] " host level worth mention work "
[193] " use binari analysi extract properti messag"
[194] "memori buffer alreadi decrypt"
[195] ""
[196] "final build idea order"
[197] "construct state machin protocol dynam"
[198] "behavior applic implement protocol extent"
[199] " work certain resembl nonetheless approach"
[200] "free addit burden binari taint analysi sinc "
[201] "fulli network base gather larg amount input trace"
[202] " system therebi straightforward task"
[203] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file007.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[5]]
$content
[1] " conclus"
[2] ""
[3] ""
[4] " prisma present tool capabl learn"
[5] "simul communic given servic network traffic"
[6] "alon repres intern state machin servic"
[7] " markov extract templat rule via align"
[8] "collect session state machin prisma abl extract"
[9] "inform necessari effici simul "
[10] "pool evalu show viewpoint"
[11] "complet syntact semant correct prisma"
[12] "capabl emul reallif network traffic"
[13] ""
[14] " next goal deploy prisma honeypot dedic network"
[15] "infrastructur evalu show prisma model"
[16] "solid expect valuabl input reallif applic check"
[17] " practic extens robustifi"
[18] "approach addit state fuzz interest"
[19] "applic prisma instanc one can use extract markov"
[20] " find communic path insid state machin"
[21] "occur seldom therefor tend rather untest"
[22] "errorpron believ templat structur rule can"
[23] "give valuabl clue field fuzz content"
[24] "maxim probabl enforc error analysi"
[25] "koobfac network trace prisma show method can readili"
[26] "appli domain malwar analysi still refin"
[27] " instanc find interest path state machin"
[28] " malwar can enhanc usabl prisma scenario"
[29] ""
[30] " term theoret find show cours"
[31] "dimension interfer model"
[32] "approach obvious preprocess featur select"
[33] " abl exploit fact resid much smaller"
[34] "subspac high dimension featur space henc elimin"
[35] "unnecessari redund featur can expos subspac even"
[36] " use matrix factor method like replicateawar"
[37] "nonneg matrix factor condens "
[38] " bare distributionallik minimum allow us subsequ step"
[39] " extract probabilist version abstract state machin"
[40] "layer approach preprocess follow featur reduct follow"
[41] " probabilist model also reflect softwar design"
[42] "prisma layer implement tool chain consist"
[43] "small special modul facilit debug"
[44] " modul also allow later reus part "
[45] "project"
[46] ""
[47] ""
[48] ""
[49] ""
[50] ""
[51] ""
[52] ""
[53] ""
[54] " squar error mse "
[55] " support vector regress model gaussian kernel repeat"
[56] " train 50 train point vari blue line"
[57] " denot mse train red line denot"
[58] " mse independ high"
[59] " complex ie low train error diverg even"
[60] " real error due overfit effect"
[61] ""
[62] ""
[63] ""
[64] ""
[65] " 225"
[66] " 125"
[67] " 025"
[68] " blue line real"
[69] " valu red line support vector regress model"
[70] " gaussian kernel train 50 train point black dot "
[71] " sinc see model high"
[72] " complex ie low left plot exhibit overfit model"
[73] " low complex ie high right plot clear"
[74] " underfit "
[75] ""
[76] ""
[77] ""
[78] "detect unwant behavior network infrastructur often involv"
[79] " rule base system monitor sensor observ"
[80] "aberr prespecifi rule rule often"
[81] "handcraft secur specialist analyz domain"
[82] "build meaning model instanc seen"
[83] "tabl specif malici request "
[84] "malwar content botz bottxt therefor"
[85] " domain expert build accord regular express "
[86] "intrus prevent system constant monitor network traffic"
[87] " search match regular express rule "
[88] "payload"
[89] ""
[90] "obvious solut involv lot man power "
[91] "method introduc previous chapter equip us power"
[92] "tool support effort even better "
[93] " rule automat given pool seen"
[94] "chapter structur featur extract"
[95] " joint similar metric open whole"
[96] "toolbox machin learn transfer suitabl"
[97] "vector space allow us pose rule learn problem"
[98] "regress classif task"
[99] ""
[100] " instanc regress problem want learn predict"
[101] "equat y fx consist point d1"
[102] "x1 y1 yn assum"
[103] " drawn iidfrom p one approach"
[104] " solv problem minim mean squar error mse"
[105] " given ie f denot estim "
[106] "function"
[107] ""
[108] " f d1 yi fxi2"
[109] ""
[110] "basic without prior knowledg under generat"
[111] "process infinit mani solut lead minim"
[112] " mse train one trivial method just"
[113] "memoiz valu train obvious"
[114] "lead poor perform unseen therefor one often impos"
[115] " structur restrict function class even introduc"
[116] " regular function can direct control"
[117] " socal paramet"
[118] " instanc ridg regress impos linear function class"
[119] "combin restrict result coeffici valu"
[120] "shown 7hastie09 assum y fx"
[121] " e 0 var"
[122] " expect predict error err "
[123] "regress fit fx input point x x use"
[124] "squarederror loss"
[125] ""
[126] " errx ey fx2 x x"
[127] ""
[128] " fx fx2bias2 fx"
[129] " fx e fx2var fx"
[130] ""
[131] ""
[132] " total error can broken irreduc part"
[133] "varianc squar bias estim far"
[134] " real solut varianc estim"
[135] "wigg estim meta paramet learn method"
[136] "control complex higher complex less"
[137] "bias higher varianc estim vice"
[138] "versa therefor choos complex high can"
[139] "perfect imit train bias tend zero will"
[140] "suffer higher varianc unseen will"
[141] "overfit contrari choos complex low"
[142] "estim will exhibit high bias low varianc lead "
[143] "underfit optim function class will lie"
[144] " bias varianc balanc lead"
[145] "overal smallest error estim learner"
[146] ""
[147] "figur show train error"
[148] "evolv support vector regress model"
[149] "gaussian kernel repeat train 50 train point vari"
[150] "width gaussian kernel fix regular"
[151] "paramet can observ train error "
[152] "complex model ie lower high underestim "
[153] "error independ furthermor can see "
[154] " optim perform rough around"
[155] " visual impress behavior "
[156] "differ complex class shown"
[157] "figur high complex "
[158] "figur overfit train low complex"
[159] " figur underfit real function optim"
[160] " figur accur describ real function"
[161] "class"
[162] ""
[163] "sinc reli train error determin"
[164] "optim complex therefor find tool estim"
[165] " error adjust complex via"
[166] " meta paramet learn method crossvalid "
[167] "defacto standard appli machin learn tune meta paramet"
[168] " machin learn method supervis learn set see"
[169] " stone74 geisser75 also"
[170] " recent extens review method part "
[171] "held back use get unbias estim"
[172] " true general error crossvalid comput"
[173] "quit demand though full grid search possibl"
[174] "combin paramet candid quick take lot time even"
[175] " one exploit obvious potenti parallel"
[176] ""
[177] "therefor crossvalid seldom execut full practic"
[178] " differ heurist usual employ speed "
[179] "comput exampl instead use full grid local search"
[180] "heurist may use find local minima error see"
[181] "instanc bergstra12 howev"
[182] " general local search method guarante can"
[183] "given qualiti found local minima anoth frequent"
[184] "use heurist perform crossvalid subset "
[185] " train full get accur"
[186] "predict problem find right size "
[187] "subset appar subset small reflect"
[188] "true complex learn problem paramet select"
[189] "crossvalid will lead underfit model hand"
[190] " larg subset will take longer crossvalid finish"
[191] ""
[192] "appli kind heurist requir experienc"
[193] "practition high familiar howev"
[194] "effect play subset approach manag will"
[195] "discuss depth given increas subset "
[196] "minim error will converg often much earlier "
[197] " error thus use subset systemat way open "
[198] "promis way speed select process sinc train"
[199] "model smaller subset much"
[200] "timeeffici process care taken "
[201] "increas avail sudden reveal structur "
[202] " lead chang optim paramet"
[203] "configur still will discuss depth "
[204] "way guard chang point make heurist"
[205] "take subset promis candid autom procedur"
[206] ""
[207] " chapter will propos method speed"
[208] "crossvalid consid subset increas size"
[209] "remov clear underperform paramet configur way"
[210] " lead substanti save total comput time sketch"
[211] " figur order account"
[212] "possibl chang point sequenti test adapt"
[213] " control zone rough speak certain number"
[214] " allow failur paramet configur time"
[215] " framework give statist guarante drop clear"
[216] "underperform configur final add stop criterion"
[217] " watch earli converg process speed "
[218] "comput"
[219] ""
[220] " follow will first discuss effect take subset"
[221] " learner crossvalid "
[222] "present method fast crossvalid via sequenti test cvst"
[223] " discuss theoret properti "
[224] "method final evalu method"
[225] "synthet realworld set"
[226] " give"
[227] "overview relat approach "
[228] "conclud chapter impati practition may skip"
[229] "theoret treatment focus selfcontain"
[230] " describ cvst algorithm "
[231] "evalu "
[232] ""
[233] ""
[234] ""
[235] ""
[236] ""
[237] ""
[238] ""
[239] ""
[240] " time consumpt 5fold crossvalid cv"
[241] " left fast crossvalid via sequenti test cvst"
[242] " right cv calcul configur"
[243] " full cvst algorithm use increas"
[244] " subset drop signific underperform"
[245] " configur step result drastic decreas"
[246] " total calcul time"
[247] ""
[248] ""
[249] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file008.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[6]]
$content
[1] " subset"
[2] ""
[3] ""
[4] " approach base take subset speed"
[5] "crossvalid main question therefor whether can"
[6] "reliabl estim best paramet configur alreadi"
[7] "subset mild condit will prove general"
[8] "converg estim also discuss can expect"
[9] "estim much better practic theoret result"
[10] "develop"
[11] ""
[12] " sake simplic will consid follow slight"
[13] "simplifi formal crossvalid procedur"
[14] "replac empir error expect risk "
[15] "therebi also remov repetit involv typic run"
[16] "kfold crossvalid way deal "
[17] "addit inaccuraci involv estim error "
[18] "finit make argument precis"
[19] " actual compar error enc defin"
[20] " equat anoth empir error"
[21] " emc defin independ sampl instead compar"
[22] " enc ec expect error paramet"
[23] " configur c technic mean "
[24] " consid enc emc enc ec emc"
[25] " ec get essenti result addit"
[26] " limit process m consid limit "
[27] " size also tend infin sake simplic"
[28] " drop detail"
[29] ""
[30] "assum train given inputoutput pair xi"
[31] "yi drawn iidfrom probabl distribut"
[32] "px y usual also assum "
[33] " loss function given "
[34] " overal error expect risk predictor g"
[35] " given rg e y x y px y"
[36] ""
[37] " possibl paramet configur c let gnc"
[38] " predictor learn paramet c c first n"
[39] "train exampl crossvalid basic tri identifi"
[40] "best paramet c given train size n minim"
[41] " expect risk normal done comput best"
[42] "paramet configur term empir risk holdout"
[43] " present discuss assum estim"
[44] " suffici accur expect risk"
[45] ""
[46] " now also consid differ train size "
[47] "effect deal task minim sequenc function"
[48] "e c defin"
[49] ""
[50] ""
[51] " enc rgnc"
[52] ""
[53] ""
[54] " thus interest minimum subset size k"
[55] "ek relat en question link "
[56] "asymptot behavior en minimum en"
[57] "converg will ek en eventu"
[58] ""
[59] "now general expect en converg "
[60] "exampl might paramet encod way"
[61] " need scale sampl size ie might"
[62] " valu c sampl size n correspond fnc "
[63] "function fn complex set condit"
[64] "paramet configur c might work well subset size k"
[65] "becom suboptim choic larger sampl size"
[66] ""
[67] " will therefor assum fix c"
[68] "enc exist use standard techniqu straightforward"
[69] " prove follow result see appendix"
[70] " proof"
[71] ""
[72] ""
[73] ""
[74] " let c finit assum fix c enc"
[75] " ec probabl follow hold"
[76] ""
[77] " converg c n"
[78] ""
[79] " cenc ec 0"
[80] ""
[81] " probabl"
[82] " minimum let cn "
[83] " encn c enc c ec"
[84] " c ec"
[85] ""
[86] " ecn ec 2 c enc ec"
[87] ""
[88] " moreov"
[89] ""
[90] " ecn ec 0 probabl"
[91] ""
[92] " subset "
[93] " 0 exist number n n n k"
[94] " n k n"
[95] ""
[96] " p encn"
[97] ""
[98] ""
[99] ""
[100] ""
[101] " sake simplic assum c finit"
[102] "result can like extend continu paramet space"
[103] "signific technic overhead"
[104] ""
[105] "theorem basic prove asymptot"
[106] "can expect get good estim right choic paramet"
[107] "train size n subset k result assum paramet"
[108] "configur encod way independ size"
[109] " train hing uniform converg "
[110] "possibl paramet choic"
[111] ""
[112] "now well result describ practic find"
[113] "figur show error typic"
[114] "exampl train support vector regress svr subset "
[115] "full train consist 500 point "
[116] " sinc introduc "
[117] "paramet kernel width gaussian kernel use"
[118] " regular paramet valu shown alreadi"
[119] "optim regular paramet sake"
[120] "simplic"
[121] ""
[122] " see minimum converg rather quick first plateau"
[123] " 15 03 approxim toward"
[124] "lower one 25 17 also optim one train"
[125] " size n 500 see uniform converg main"
[126] "drive forc fact error small kernel width still"
[127] " far apart even minimum alreadi converg"
[128] ""
[129] " follow help continu discuss within"
[130] "empir risk minim framework assum learner"
[131] "train pick minim empir risk"
[132] " hypothesi h set one can write differ"
[133] " expect risk learn predictor rgn "
[134] "bay risk r follow see also section121"
[135] " section243"
[136] ""
[137] " rgn r"
[138] ""
[139] " error"
[140] " r"
[141] " error"
[142] ""
[143] " estim error measur far chosen one"
[144] " asymptot optim approxim error"
[145] "measur differ best possibl "
[146] "hypothesi class true function"
[147] ""
[148] ""
[149] ""
[150] ""
[151] ""
[152] " error svr"
[153] " sinc introduc"
[154] " can observ shift optim"
[155] " gaussian kernel finegrain structur "
[156] " problem seen enough figur b approxim"
[157] " error indic black solid line estim error"
[158] " black dash line asymptot"
[159] " approxim error plot blue dash line one can see"
[160] " uniform approxim estim error main"
[161] " drive forc instead decay approxim error"
[162] " smaller kernel width togeth increas estim"
[163] " error small kernel width make sure minimum converg quick"
[164] ""
[165] ""
[166] "use decomposit can interpret figur follow see"
[167] "figur kernel width"
[168] "basic control approxim error "
[169] "18 result hypothesi class coars repres"
[170] "function consider becom smaller reach"
[171] "level bay risk indic dash blue"
[172] "line even larger train size can assum will"
[173] "stay level even smaller kernel size"
[174] ""
[175] " differ blue line upper line show"
[176] "estim error estim error extens studi"
[177] " statist learn theori known link differ"
[178] "notion complex like vcdimens"
[179] "fatshatt dimens norm "
[180] "reproduc kernel hilbert space rkhs"
[181] "typic result show estim error can bound term"
[182] " form"
[183] ""
[184] " rgn rn o nn"
[185] ""
[186] " dh notion complex under hypothesi"
[187] "class figur mean can expect estim"
[188] "error becom larger smaller kernel width"
[189] ""
[190] " basic order paramet configur accord"
[191] " complex make three observ"
[192] ""
[193] " paramet small complex larg kernel"
[194] " width approxim error will high estim"
[195] " error will small"
[196] " paramet high complex approxim error"
[197] " will small even optim estim error will larg"
[198] " also see figur"
[199] " approxim error seem decreas faster increas"
[200] " complex estim error increas"
[201] ""
[202] " combin estim smaller train size tend"
[203] "underestim true complex approxim"
[204] "error quick decreas minimum also converg true"
[205] "one fact estim error larger complex"
[206] "model act guard choos complex model"
[207] ""
[208] "unfortun exist theoret result abl bound"
[209] "error suffici tight make argument"
[210] "exact particular speed converg minimum"
[211] "hing tight lower bound approxim error "
[212] "realist upper bound estim error approxim error"
[213] " studi exampl paper "
[214] " paper prove upper bound "
[215] "rate also worstcas rate like close enough"
[216] " true error hand mechan lead fast"
[217] "converg minimum plausibl look concret"
[218] "exampl therefor will assum follow"
[219] " locat best paramet configur might initi"
[220] "chang becom less stabl quick will use"
[221] "sequenti test introduc zone ensur"
[222] " method robust initi chang"
[223] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file009.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[7]]
$content
[1] " valid via sequenti test cvst"
[2] ""
[3] ""
[4] " consid usual supervis learn set "
[5] "consist point d1 x1 y1 yn"
[6] " assum drawn iidfrom"
[7] "px y learn algorithm"
[8] " depend sever paramet c goal select"
[9] "paramet c learn predictor g best"
[10] "general error respect loss function"
[11] " full kfold"
[12] "crossvalid estim best paramet split "
[13] " k part use k1 part train estim"
[14] "error remain part"
[15] ""
[16] " approach attempt speed process take subsampl"
[17] "size sn 1 s start "
[18] "full paramet candid elimin clear"
[19] "underperform candid step execut main"
[20] "loop algorithm page"
[21] "perform follow main part given subset "
[22] ""
[23] ""
[24] " procedur transform pointwis error "
[25] " remain configur binari top flop scheme"
[26] " line"
[27] " drop signific loser configur along way"
[28] " line use test sequenti analysi"
[29] " framework"
[30] " appli robust distribut free test techniqu allow"
[31] " earli stop procedur seen enough"
[32] " stabl paramet estim line"
[33] ""
[34] ""
[35] " follow will discuss individu step "
[36] "algorithm conceptu overview one iter procedur"
[37] "depict figur refer"
[38] ""
[39] " step cvst shown situat"
[40] " step s 10 point"
[41] " learn configur c1 ck error"
[42] " calcul current d1 dn"
[43] " transform binari perform indic trace"
[44] " configur filter via sequenti analysi ck1"
[45] " ck drop procedur check whether"
[46] " remain configur perform equal well past stop"
[47] " case see appendix "
[48] " complet exampl run"
[49] ""
[50] " transform error"
[51] ""
[52] " robust transform perform configur "
[53] "binari inform whether among topperform"
[54] "configur turn flop reli distributionfre"
[55] "test basic idea calcul pointwis perform "
[56] "given configur point use modelbuild"
[57] "process group togeth best perform configur show"
[58] "similar behavior"
[59] ""
[60] "suppos situat depict figur"
[61] " k remain configur c1 c2 ck"
[62] " order accord mean perform now want find"
[63] " smallest index t configur c1 c2"
[64] " ct show behavior point d1 d2"
[65] " dn main idea robust test rank"
[66] "configur ci 1 t accord individu"
[67] "perform jth point result rank ri"
[68] " j h0 configur act similar sum "
[69] "rank point st rt "
[70] "similar t 1 t case regress task"
[71] "individu perform measur realvalu residu "
[72] "procedur implement sketch"
[73] " classif task observ binari perform "
[74] "procedur can appli instead see"
[75] "appendix summari test find"
[76] "smallest index t appli increas subset "
[77] "configur point t1 show signific"
[78] "effect given signific level result flag"
[79] " configur 1 t top configur "
[80] "remain t1 k configur flop configur note"
[81] " increment procedur multipl test situat"
[82] "sinc interest joint infer hypothes"
[83] " use individu decid whether can observ"
[84] "signific effect due addit new configur"
[85] ""
[86] " actual calcul error function"
[87] " calcperform appli increment build process"
[88] "ie ad step line increas"
[89] "train pool allow onlinealgorithm adapt"
[90] " also increment error calcul "
[91] "remain point avail crossvalid"
[92] " result first step collect trace matrix see"
[93] "figur top right show gradual"
[94] "transform last 10 step procedur highlight"
[95] "result last"
[96] ""
[97] " signific loser"
[98] ""
[99] ""
[100] " transform error scaleindepend top flop"
[101] "scheme can now whether given paramet configur "
[102] "overal loser sequenti test binari random variabl"
[103] "address analysi framework develop"
[104] " origin appli context"
[105] "product qualiti assess compar two product process"
[106] "biolog set stop bioassay soon gather"
[107] "lead signific result"
[108] ""
[109] " main idea follow one observ sequenc"
[110] "iidbinari random variabl z1 z2 want"
[111] " whether variabl distribut accord h0"
[112] " h1 signific level "
[113] "accept h1 h0 can control via"
[114] "metaparamet comput"
[115] "likelihood far observ reject one "
[116] "hypothesi respect likelihood ratio larger "
[117] "factor control metaparamet can shown "
[118] "procedur intuit geometr represent depict"
[119] "figur lower left binari observ"
[120] "record cumul sum time step sum exceed"
[121] "upper red line accept h1 sum lower red"
[122] "line accept h0 sum stay two red line"
[123] " draw anoth sampl"
[124] ""
[125] "wald requir choos sinc"
[126] " main goal use sequenti elimin"
[127] "underperform choos paramet "
[128] " h1 configur win postpon long"
[129] "possibl will allow cvst algorithm keep configur"
[130] " evid perform definit show "
[131] " overal loser configur time want"
[132] "maxim area configur elimin region denot"
[133] " loser fig reject mani loser"
[134] "configur way possibl"
[135] ""
[136] ""
[137] " sa 1"
[138] ""
[139] ""
[140] " defin "
[141] "triangular area accept h0 sa"
[142] " earliest step accept h1 variabl"
[143] " defin total number step"
[144] ""
[145] "use result global optim"
[146] "equat can solv follow"
[147] ""
[148] " 05"
[149] " 10"
[150] ""
[151] ""
[152] " averag sampl number"
[153] " expect number step given will yield"
[154] "decis real 10 note sequenti analysi"
[155] "formal requir iidvari might true"
[156] "configur transform winner configur later"
[157] "therebi chang behavior flop "
[158] "topconfigur therefor tune procedur use"
[159] "sequenti analysi framework just decis whether"
[160] "configur overal loser adjust"
[161] " switch role keep potenti configur long"
[162] "possibl just drop trace statist signific"
[163] "correspond binomi 05 detail "
[164] "open sequenti analysi pleas consult see"
[165] "instanc general overview sequenti"
[166] "test procedur"
[167] ""
[168] ""
[169] ""
[170] " main loop"
[171] ""
[172] ""
[173] " conf"
[174] " gettest"
[175] " tracesc s performancec s 0"
[176] " activec true"
[177] " 1 "
[178] " pointwiseperf"
[179] " calcperformancedata activ"
[180] " performanceact s averageperformancepointwiseperf"
[181] " trace topconfspointwiseperf s 1"
[182] " activ flopconfstest tracesact 1s fals"
[183] " similarperformancetracesact s"
[184] " break"
[185] ""
[186] ""
[187] ""
[188] " selectwinnnerperform activ"
[189] ""
[190] ""
[191] ""
[192] ""
[193] " stop final winner"
[194] ""
[195] ""
[196] "final employ earli stop rule line"
[197] "take last column trace matrix"
[198] "check whether remain configur perform equal well"
[199] " past use cochran q "
[200] " similarperform procedur figur"
[201] "illustr complet run cvst algorithm rough 600"
[202] "configur configur mark red correspond "
[203] "flop configur black one top"
[204] "configur final configur mark gray "
[205] "drop via sequenti cvst algorithm small"
[206] "zoomin lower part pictur show last"
[207] " remain configur step "
[208] "use earli stop criterion can see procedur"
[209] "keep go heterogen behavior remain"
[210] "configur mix redblack configur perform"
[211] "equal well past near black earli stop "
[212] " see signific effect anymor procedur stop"
[213] ""
[214] " upper plot show run "
[215] " cvst algorithm rough 600 configur step"
[216] " configur mark top black flop red drop"
[217] " gray blowup show situat step 5 7 without"
[218] " drop entri earli stop rule take affect step 7"
[219] " remain configur perform equal well step 5 7"
[220] ""
[221] "final procedur selectwinn line"
[222] " win configur pick configur "
[223] "surviv step follow remain configur"
[224] "determin rank step accord averag perform"
[225] " step averag rank last"
[226] " step pick configur lowest"
[227] "mean rank way make use accumul"
[228] " cours procedur restrict view last"
[229] " observ also take account "
[230] "optim paramet might chang increas size sinc"
[231] "focus recent observ biggest model"
[232] "alway pick configur suitabl "
[233] "size hand"
[234] ""
[235] " cvst"
[236] ""
[237] ""
[238] " cvst algorithm number metaparamet "
[239] "experiment determin beforehand give"
[240] "suggest choos paramet paramet"
[241] "control signific level step similar"
[242] "behavior suggest usual level "
[243] "005 furthermor control signific"
[244] "level h0 configur loser h1 configur"
[245] " winner respect suggest asymmetr setup set"
[246] " 01 sinc want drop loser configur relat"
[247] "fast 001 sinc want realli sure "
[248] "accept configur overal winner final "
[249] " 3 6 20 "
[250] " observ choic work well practic"
[251] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file010.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[8]]
$content
[1] " properti cvst algorithm"
[2] ""
[3] ""
[4] " introduc overal concept cvst algorithm"
[5] "now focus theoret properti ensur proper"
[6] "work procedur exploit guarante under"
[7] "sequenti test framework show experiment can"
[8] "control procedur work stabl regim furthermor"
[9] "prove error bound cvst algorithm addit show"
[10] " cvst algorithm can use work best given time budget"
[11] ""
[12] " bound stabl regim"
[13] ""
[14] ""
[15] " discuss perform "
[16] "configur might chang feed learn algorithm"
[17] " therefor reason algorithm exploit learn"
[18] "subset must capabl deal difficulti"
[19] " potenti chang point behavior certain"
[20] "configur investig theoret"
[21] "properti cvst algorithm make particular suitabl"
[22] " learn increas subset "
[23] ""
[24] " first properti open sequenti employ cvst"
[25] "algorithm come handi control overal converg process"
[26] " assur configur drop prematur"
[27] ""
[28] " zone given cvst"
[29] " algorithm signific level "
[30] " top flop configur respect maxim number step"
[31] " s global win configur loos first"
[32] " cp iter long "
[33] ""
[34] " 0 sz "
[35] " 2"
[36] ""
[37] " probabl configur drop cvst"
[38] "algorithm zero"
[39] ""
[40] ""
[41] ""
[42] " detail proof defer appendix"
[43] ""
[44] ""
[45] " consequ lemma "
[46] "experiment can direct control via signific level"
[47] " iter prematur drop"
[48] " occur therefor guid whole process stabl"
[49] "regim configur will see enough show"
[50] "real perform"
[51] ""
[52] "equip properti can now take thorough look "
[53] "worst case perform cvst algorithm suppos global win"
[54] "configur constant mark loser secur"
[55] "zone amount avail point "
[56] "suffici show superior configur given"
[57] " global win configur now see enough mark"
[58] " win configur binar process throughout"
[59] "next step probabl can give exact error bound"
[60] " overal process solv specif recurr"
[61] ""
[62] " worstcas"
[63] " scenario error probabl cvst algorithm global"
[64] " winner configur label constant loser "
[65] " secur zone reach can calcul probabl"
[66] " configur endur sequenti recurr"
[67] " scheme count number remain path end "
[68] " nonlos regiontrim30pt 40pt 30pt 40ptcliptru"
[69] ""
[70] "figur give visual impress "
[71] "worst case analysi exampl 20 step cvst execut"
[72] " win configur generat straight line zero "
[73] "secur zone 7 approach bound error fast"
[74] "crossvalid now consist essenti calcul probabl"
[75] "mass end nonlos region"
[76] " follow lemma show can express number path"
[77] " lead specif point graph twodimension"
[78] "recurr relat"
[79] ""
[80] " relat"
[81] " denot pr c number path lead point"
[82] " rc lie lower decis boundari l0 "
[83] " sequenti given worst case scenario describ "
[84] " number path can calcul follow"
[85] ""
[86] "pr c"
[87] ""
[88] " 1 r0 c sz"
[89] " 1 r c sz 1"
[90] " pr c1 pr1 c1 l0c r c sz 1"
[91] " 0"
[92] ""
[93] ""
[94] ""
[95] ""
[96] ""
[97] " split proof 4 case"
[98] ""
[99] " first case definit configur "
[100] " straight line zero secur zone sz"
[101] " second case describ diagon path start point"
[102] " 1 sz 1 construct path 1 mean diagon 0"
[103] " mean one step right diagon path can just reach"
[104] " singl combin name straight line one"
[105] " third case actual recurr given point"
[106] " lower decis bound l0 number path"
[107] " lead point equal number path lie"
[108] " direct left point plus path lie direct"
[109] " diagon downward point first path"
[110] " point can reach direct step right "
[111] " latter current point can reach diagon step"
[112] " upward sinc option "
[113] " construct equal hold"
[114] " last case describ path either lie"
[115] " lower decis bound therefor end loser region"
[116] " diagon can thus never reach"
[117] ""
[118] ""
[119] ""
[120] " recurr visual figur"
[121] "number grid give number valid nonlos path"
[122] "can reach specif point recurr now abl"
[123] "prove global worstcas error probabl fast"
[124] "crossvalid"
[125] ""
[126] " bound cvst suppos global"
[127] " win configur reach secur zone constant"
[128] " loser trace switch winner configur "
[129] " success probabl error cvst"
[130] " algorithm erron drop configur can determin"
[131] " follow"
[132] ""
[133] " p 1 l0 1r"
[134] " pi"
[135] " r"
[136] ""
[137] ""
[138] ""
[139] " basic idea use number path lead nonlos"
[140] "region calcul probabl configur actual"
[141] "surviv correspond last column exampl"
[142] "figur sinc outcom "
[143] "binar process binomi variabl success"
[144] "probabl first diagon path probabl"
[145] " next path probabl"
[146] " last viabl path"
[147] "reach point l0 1 "
[148] "complet probabl surviv configur sum"
[149] " correspond number path lemma sinc "
[150] "interest complementari event subtract result sum"
[151] " one conclud proof"
[152] ""
[153] ""
[154] "note earli stop rule interfer bound"
[155] " worst case inde process goe maxim"
[156] "number step sinc probabl mass will"
[157] "maxim spread due linear lower decis boundari "
[158] "correspond expon maxim earli stop rule"
[159] "termin process reach maximum number step"
[160] " result error probabl will lower given bound"
[161] ""
[162] " bound fast"
[163] " crossvalid proven theorem differ"
[164] " success probabl maxim step size mark"
[165] " global trend fit loess curv given dot line "
[166] " datatrim10pt 12pt 10pt 12ptcliptru"
[167] ""
[168] " error bound differ success probabl propos"
[169] "sequenti depict"
[170] "figur first can observ relat"
[171] "fast converg overal error increas maxim number"
[172] " step impact error margin shown"
[173] "success probabl iefor instanc 095 error"
[174] "near converg optimum 005 therefor chosen"
[175] "scheme allow us control secur zone also"
[176] " small impact overal error probabl "
[177] "show practic sequenti ratio fast"
[178] "crossvalid procedur use statist test can"
[179] "balanc need conserv retent configur"
[180] "long possibl statist control drop"
[181] "signific loser configur near impact overal"
[182] "error probabl analysi assum experiment"
[183] "chosen right secur zone learn problem hand"
[184] "small size happen secur zone chosen"
[185] " small therefor chang point global win"
[186] "configur might lie outsid secur zone will"
[187] "occur often today size set analyz behavior"
[188] "cvst circumst appendix give"
[189] "complet view properti algorithm"
[190] ""
[191] " valid time budget"
[192] ""
[193] " time"
[194] " consumpt cubic learner step calcul"
[195] " subset calcul time tf"
[196] " full adjust accord s"
[197] " step process assum drop r k remain"
[198] " configur"
[199] ""
[200] " cvst algorithm can use box speed"
[201] "regular crossvalid aforement properti "
[202] "procedur come handi face situat optim"
[203] "paramet configur found given fix comput"
[204] "budget time suffici perform full"
[205] "crossvalid amount process "
[206] "big explor suffici space paramet grid ordinari"
[207] "crossvalid reason time cvst algorithm allow"
[208] "get select inform given"
[209] "specifi constraint"
[210] ""
[211] "basic achiev calcul maxim step paramet"
[212] " lead near coverag avail time budget t"
[213] "depict figur given k paramet"
[214] "configur prespecifi secur zone bound s"
[215] " 0 s 1 ensur configur"
[216] "drop prematur comput demand cvst algorithm"
[217] " approxim sum time need step"
[218] "involv calcul k configur "
[219] "step r k configur 0 r 1"
[220] " will see experiment evalu assumpt"
[221] " given drop rate 1r lead form time consumpt"
[222] " depict figur quit"
[223] "common observ drop rate correspond overal difficulti"
[224] " problem hand"
[225] ""
[226] "given comput time tf need perform calcul"
[227] " full prove appendix"
[228] " optim maximum step paramet cubic learner can calcul"
[229] "follow"
[230] ""
[231] " tf k1rs3 tf r k1rs4rtf"
[232] " k k1rs3tf r k 2t1rs4rtf"
[233] " k"
[234] ""
[235] " calcul maxim number step given time"
[236] "budget t can use result lemma"
[237] "determin maxim given fix yield"
[238] " request secur zone bound"
[239] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file011.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[9]]
$content
[1] ""
[2] ""
[3] ""
[4] " evalu cvst algorithm real investig"
[5] "perform control set regress"
[6] "classif task introduc special tailor set"
[7] "highlight overal behavior stress fast"
[8] "crossvalid procedur evalu choic learn"
[9] "method influenc perform cvst algorithm compar"
[10] "kernel logist regress klr vector"
[11] "machin svm classif problem kernel ridg regress"
[12] "krr versus regress problem use gaussian"
[13] "kernel scholkopf2000 experi"
[14] "use 10 step cvst paramet set"
[15] "describ give us upper bound "
[16] "expect speed gain note get even higher speed gain"
[17] " either lower number step increas "
[18] "practic point view believ set studi"
[19] "high realist"
[20] ""
[21] " set"
[22] ""
[23] " assess qualiti cvst algorithm first examin"
[24] "behavior control set seen motiv"
[25] " specif learn problem might sever layer"
[26] "structur can reveal learner enough "
[27] "avail instanc figur can see"
[28] " first optim plateau occur real"
[29] "optim paramet center around thus real optim"
[30] "choic just becom appar seen 200"
[31] "point"
[32] ""
[33] " construct learn problem regress"
[34] " classif task pose sever problem cvst"
[35] "algorithm stop earli will return suboptim"
[36] "paramet evalu differ intrins dimension"
[37] " various nois level affect perform "
[38] "procedur classif task use sine"
[39] " consist sine uniform sampl rang"
[40] "control intrins dimension d"
[41] ""
[42] "y"
[43] " n2 x 0 2 d"
[44] " n 05 d 50 100"
[45] " label sampl point just sign y"
[46] "regress task devis sinc "
[47] "consist sinc function overlay highfrequ sine"
[48] "y x d x5"
[49] " n2 x"
[50] " n 02 d 3 4"
[51] ""
[52] " set generat 1000 point run 10"
[53] "step cvst compar result normal"
[54] "10fold crossvalid full record "
[55] "error addit 10000 point time consum "
[56] "paramet search explor paramet grid contain 610 equal"
[57] "space paramet configur method"
[58] " 29 3 01 05"
[59] " svmsvr 6 2"
[60] "klrkrr respect process repeat 50 time gather"
[61] "suffici interpret overal process"
[62] ""
[63] ""
[64] " mean squar error left relat speed gain right "
[65] " sine "
[66] ""
[67] " result sine can seen"
[68] "figur upper boxplot show"
[69] "distribut differ mean squar error best"
[70] "paramet determin cvst normal crossvalid low"
[71] "nois set n 025 cvst algorithm find optim"
[72] "paramet normal crossvalid intrins"
[73] "dimension d50 d100 cvst algorithm get stuck"
[74] " suboptim paramet configur yield increas"
[75] "classif error compar normal crossvalid"
[76] "tendenc slight increas high nois set n 05"
[77] "yield broader distribut classif method use seem"
[78] " direct influenc differ svm klr show"
[79] "near similar behavior pictur chang look "
[80] "speed gain svm near alway rang 15 19"
[81] "klr show speedup 20 60 time varianc "
[82] "speed gain general higher compar svm seem "
[83] "direct consequ inner work klr basic main"
[84] "loop perform step matrix invers whole kernel"
[85] "matrix calcul coeffici converg obvious"
[86] "converg criterion lead relat widespread distribut"
[87] " speed gain compar svm perform"
[88] ""
[89] " configur"
[90] " step sine "
[91] ""
[92] "figur show distribut "
[93] "number remain configur step cvst"
[94] "algorithm low nois set upper row can observ"
[95] "tendenc bigger drop rate d100 high nois"
[96] "set lower row observ steadi increas kept"
[97] "configur combin higher spread "
[98] "distribut overal see effect drop rate"
[99] "configur set svm klr show near similar"
[100] "behavior higher speed gain klr seen "
[101] "direct consequ algorithm influenc"
[102] " cvst algorithm"
[103] ""
[104] ""
[105] " mean squar error left plot relat speed gain right"
[106] " plot sinc "
[107] ""
[108] " perform sinc shown"
[109] "figur first strike"
[110] "observ transit cvst algorithm can"
[111] "observ intrins dimension d4 point"
[112] "overal excel perform cvst algorithm verg"
[113] "choos suboptim paramet configur behavior "
[114] "evid high nois set svr near alway show"
[115] "smaller differ krr capabl delay declin"
[116] " high nois set least part speed gain observ"
[117] "near constant differ dimension nois level"
[118] " rang 15 20 svr 60 80 krr"
[119] ""
[120] " configur"
[121] " step sinc "
[122] ""
[123] " direct consequ behavior can observ"
[124] " number remain configur shown"
[125] "figur compar classif"
[126] "experi drop much drastic intrins"
[127] "dimension nois level show small influenc higher"
[128] "dimension nois level yield remain configur"
[129] " overal varianc distribut much smaller "
[130] "classif experi"
[131] ""
[132] ""
[133] " mean squar error svmsvr increas size"
[134] " sine left sinc right set ad"
[135] " cvst algorithm converg correct paramet"
[136] " configur"
[137] ""
[138] " figur examin"
[139] "influenc perform cvst algorithm"
[140] " sine sinc abl"
[141] " estim correct paramet configur nois"
[142] "dimension set feed cvst enough"
[143] " limit experi "
[144] " svmsvr method sinc full crossvalid klrkrr"
[145] " taken much time comput clear cvst capabl"
[146] "extract right paramet configur increas"
[147] "amount 2000 5000 point render method"
[148] "even suitabl big scenario abund cvst"
[149] "will abl estim correct paramet much smaller time"
[150] "frame"
[151] ""
[152] " set"
[153] ""
[154] " demonstr overal perform cvst algorithm"
[155] "control set will investig perform real life"
[156] " well known benchmark set classif pick"
[157] "repres choic set ida benchmark repositori"
[158] "see"
[159] "httpwwwmldataorg furthermor ad first two class"
[160] " entri "
[161] " follow procedur"
[162] " paper sampl 2000 point class "
[163] "learn estim error remain point"
[164] "regress pick use add"
[165] " delv"
[166] "repositori httpwwwcstorontoedu"
[167] ""
[168] " process follow first scale "
[169] "case regress also scale depend variabl split"
[170] " half use one part train "
[171] " estim error process repeat 50 time"
[172] "get suffici statist perform method "
[173] " artifici set compar differ error"
[174] " speed gain fast normal crossvalid "
[175] "paramet grid 610 valu "
[176] " adjust"
[177] "rang 59 0"
[178] " adjust small structur found set"
[179] ""
[180] ""
[181] " mean squar error upper plot relat speed gain lower"
[182] " plot set"
[183] ""
[184] "figur show result "
[185] "classif set left side regress set"
[186] "right side upper panel depict differ mean squar"
[187] "error mse classif task differ never"
[188] "exceed two percent point show although fast"
[189] "crossvalid procedur case seem pick suboptim"
[190] "paramet impact fals decis small "
[191] "hold true regress task sinc depend variabl"
[192] " problem standard differ mse valu"
[193] "compar observ classif task see"
[194] "just small differ mse although problem"
[195] "cvst algorithm pick suboptim paramet even "
[196] "differ error alway relat small learner"
[197] "hard impact behavior just "
[198] " see signific differ "
[199] "correspond method term speed gain see much"
[200] "divers vari pictur overal speed improv klr"
[201] " krr higher svm svr reach 120 time"
[202] "compar normal crossvalid regress task general seem"
[203] " solv faster classif task can clear"
[204] "explain look trace"
[205] "figur classif task number"
[206] " kept configur general much higher "
[207] "regress task furthermor can observ sever type"
[208] "difficulti learn problem instanc "
[209] " seem much difficult "
[210] " also reflect differ"
[211] "speed improv seen previous figur"
[212] ""
[213] ""
[214] " configur step set"
[215] ""
[216] " summari evalu benchmark set show "
[217] "cvst algorithm give huge speed improv compar normal"
[218] "crossvalid see nonoptim choic"
[219] "configur total impact error never except"
[220] "high keep mind chosen paramet"
[221] " cvst algorithm give impress maxim attain"
[222] "speedup conserv set trade comput time"
[223] "lower impact error"
[224] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file012.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[10]]
$content
[1] " relat work"
[2] ""
[3] ""
[4] "sequenti test use extens machin learn"
[5] "context summar work relat"
[6] " cvst algorithm discuss variant"
[7] " sequenti evalu perform cvst"
[8] "algorithm shown socal sequenti"
[9] " lack essenti properti open variant wald use"
[10] " cvst algorithm underlin optim "
[11] "open wald learn increas subset "
[12] ""
[13] " test machin learn"
[14] ""
[15] "use statist test sequenti analysi framework order"
[16] " speed learn topic sever line"
[17] "research howev exist bodi work most focus"
[18] "reduc number evalu focus overal"
[19] "process elimin candid best "
[20] "knowledg new concept can appar combin"
[21] " alreadi avail race techniqu reduc total"
[22] "calcul time"
[23] ""
[24] " introduc socal"
[25] " race base nonparametr hoeffd bound "
[26] "mean error step algorithm new point"
[27] " evalu remain model confid interv"
[28] " error updat accord model whose confid"
[29] "interv error lie outsid least one interv "
[30] "better perform drop devis"
[31] " similar rang algorithm use concept pac learn game"
[32] "theori differ hypothes order expect util"
[33] "accord algorithm seen far "
[34] "hoeffd race emphasi approach lie reduc"
[35] "number evalu"
[36] ""
[37] " concept race extend "
[38] "introduc upper bound learner loss function "
[39] "exampl procedur allow earli stop learn"
[40] "process loss near optim infinit"
[41] " appli race domain evolutionari"
[42] "algorithm extend framework use friedman "
[43] "filter nonpromis configur use"
[44] "similar concept context boost filterboost"
[45] " introduc empir bernstein bound extend"
[46] " filterboost framework race algorithm "
[47] "case bound use estim error within specif"
[48] " region given probabl use"
[49] "concept sequenti test speed boost process"
[50] "control number featur evalu "
[51] "sampl similar fashion approach use"
[52] " increas speed evalu "
[53] "perceptron speed pegaso"
[54] "algorithm use partial leaveoneout evalu"
[55] " perform get estim overal perform"
[56] " use pick probabl best race"
[57] "concept appli wide varieti domain like reinforc"
[58] "learn timet"
[59] " show relev pratic impact topic"
[60] ""
[61] " first sight multiarm bandit problem"
[62] " also seem relat problem"
[63] " anoth way multiarm bandit problem number"
[64] "distribut given task identifi distribut"
[65] " largest mean chosen sequenc sampl "
[66] "individu distribut round agent choos one"
[67] "distribut sampl typic find balanc"
[68] " explor differ distribut reject distribut"
[69] " seem promis focus candid get"
[70] " accur sampl"
[71] ""
[72] " look similar set also wish identifi"
[73] "promis candid reject underperform configur"
[74] "earli process main differ "
[75] "multiarm bandit set assum distribut fix"
[76] "wherea specif deal distribut chang"
[77] " sampl size increas lead introduct "
[78] "safeti zone among thing therefor multiarm bandit"
[79] "set applic across differ sampl size"
[80] ""
[81] " hand multiarm bandit approach possibl"
[82] "extens speed comput within fix train size"
[83] "either test similar hoeffd race alreadi mention"
[84] " elimin comput individu fold "
[85] "crossvalid procedur underperform configur"
[86] ""
[87] " versus close sequenti test"
[88] ""
[89] " alreadi introduc sequenti"
[90] "test pioneer monitor likelihood"
[91] "ratio"
[92] ""
[93] ""
[94] " hi x 1"
[95] ""
[96] ""
[97] "hypothes h1 accept contrari h0"
[98] "accept b neither condit appli"
[99] "procedur accept either two hypothes need"
[100] " b chosen error probabl "
[101] "two decis exceed "
[102] "respect proven open sequenti"
[103] "probabl ratio wald optim sens compar"
[104] " test power requir averag fewest"
[105] "observ decis test scheme wald call"
[106] " sinc procedur potenti go forev"
[107] "long leav abtunnel"
[108] ""
[109] ""
[110] " open design wald procedur led develop "
[111] "differ kind sequenti test number observ"
[112] " fix beforehand"
[113] " instanc"
[114] " clinic studi might imposs ethic prohibit"
[115] "use potenti go forev unfortun none"
[116] " socal close exhibit optim criterion"
[117] "therefor choos one least simul studi show"
[118] " best behavior term averag sampl number statist"
[119] "method base gambler ruin scenario"
[120] " player fix fortun decid play n"
[121] "game fn fa fb probabl player"
[122] "fortun fa stake b will ruin oppon fortun fb"
[123] " exact n game follow recurr hold"
[124] ""
[125] "fn fa fb"
[126] ""
[127] " 0 fa 0 n 0 fb 0"
[128] " 1 n 0 fa 0 fb 0"
[129] " fn1 fa1 fbb"
[130] " 1 fn1 fab fbb"
[131] ""
[132] ""
[133] " step player can either win game probabl"
[134] "win 1 oppon lose stake b "
[135] "player now given n x y game player won y"
[136] " player b won x game will stop either "
[137] "follow condit hold"
[138] ""
[139] "y bx fa y"
[140] " y bx fb y"
[141] ""
[142] ""
[143] " formul cast gambler ruin problem waldlik"
[144] "scheme just observ cumul win player "
[145] "check whether reach lower upper line now choos"
[146] "fa fb fn 05 fa fb"
[147] "construct allow us check whether given"
[148] "configur perform wors 05 ie cross lower"
[149] "line can therefor flag overal loser control"
[150] "error probabl see detail"
[151] " close design spicer pleas consult"
[152] ""
[153] "sinc simul studi show close variant "
[154] "sequenti test exhibit low averag sampl number statist"
[155] "first look runtim perform cvst algorithm"
[156] "equip either open close sequenti "
[157] "influenti paramet term runtim "
[158] "paramet principl larger number step lead robust"
[159] "estim also increas comput time studi"
[160] "effect differ choic paramet simul "
[161] "sake simplic assum binari top flop scheme"
[162] "consist independ bernoulli variabl "
[163] " 09 10 00 01 "
[164] " open close sequenti compar relat"
[165] "speedup cvst algorithm compar full 10fold"
[166] "crossvalid case learner cubic"
[167] ""
[168] " speed gain fast"
[169] " crossvalid compar full crossvalid assum"
[170] " train time cubic number sampl shown simul"
[171] " runtim 10fold crossvalid differ problem class"
[172] " differ loserwinn ratio easi 31 medium 11 hard 13"
[173] " 200 resampl"
[174] ""
[175] "figur show result simul"
[176] "runtim differ set overal speedup much higher"
[177] " close sequenti indic aggress behavior"
[178] "compar conserv open altern test show"
[179] " highest increas rang 10 20 step rapid"
[180] "declin toward higher step number term speed close"
[181] "sequenti definit beat conserv open"
[182] ""
[183] ""
[184] ""
[185] " negat generat close top open bottom"
[186] " sequenti nonstationari configur ie "
[187] " given chang point bernoulli variabl chang"
[188] " indic valu 10"
[189] ""
[190] "figur reveal"
[191] " speed gain come price apart control "
[192] "secur zone number fals drop configur much"
[193] "higher open sequenti definit advantag"
[194] " open term speed fals negat rate close"
[195] " render useless cvst algorithm"
[196] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file013.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[11]]
$content
[1] " conclus"
[2] ""
[3] ""
[4] " present method speed crossvalid procedur"
[5] "start subset full train size identifi"
[6] "clear underperform paramet configur earli focus"
[7] " promis candid larger subset size "
[8] "discuss take subset theoret"
[9] "advantag compar heurist like local search "
[10] "paramet effect error systemat"
[11] " can understood statist one hand show"
[12] " optim configur converg true one sampl size"
[13] "tend infin also discuss concret set "
[14] "differ behavior estim error approxim error lead"
[15] " much faster converg practic insight led "
[16] "introduct safeti zone sequenti test"
[17] "ensur underperform configur remov"
[18] "prematur minima converg yet experi"
[19] "show procedur lead speed 120 time"
[20] "compar full crossvalid without signific increas"
[21] " predict error"
[22] ""
[23] " will interest combin method procedur"
[24] "like hoeff race algorithm multiarm bandit"
[25] "problem furthermor get accur converg bound even"
[26] "finit sampl size set anoth topic futur research"
[27] " moment cvst algorithm precondit "
[28] "optim paramet learn method independ "
[29] "train size standard machin learn"
[30] "algorithm adher precondit method like"
[31] "knearest neighbor exhibit connect optim"
[32] "paramet train size valuabl extens"
[33] " cvst incorpor kind knowledg algorithm"
[34] " lower precondit learner make fast"
[35] "crossvalid procedur avail learn method"
[36] ""
[37] " structur point view can see problem"
[38] "select optim learner given problem can deconstruct"
[39] " decoupl sever layer first defin problem"
[40] "domain complex solv via expect risk "
[41] "learner exploit converg risk subset abl"
[42] " repres perform learner binari fashion whether"
[43] " belong top perform round observ"
[44] "properti increas subset size can give statist bound"
[45] "guarante overal perform via sequenti test"
[46] " even use robust test earli stop whole procedur"
[47] " case chang can expect layer"
[48] "probabilist preprocess model allow stepbystep"
[49] "construct process combin one toolchain solv"
[50] "problem select plan releas toolbox"
[51] " open sourc public avail cran packag see"
[52] " cvst algorithm can use one layer "
[53] "project"
[54] ""
[55] ""
[56] ""
[57] ""
[58] ""
[59] ""
[60] ""
[61] ""
[62] " network infrastructur intern network"
[63] " separ socal demilitar zone dmz dmz"
[64] " contain server access outsid "
[65] " secur firewal web server harden "
[66] " applic firewal act revers proxi outsid"
[67] ""
[68] ""
[69] ""
[70] "respons secur incid requir accur analysi"
[71] " under detect incid "
[72] "network secur two kind model establish"
[73] " secur defin via rule kind"
[74] " traffic will allow wherea secur"
[75] " fix allow traffic therefor case "
[76] " secur intrus detect system issu"
[77] "alarm one rule match incom traffic case"
[78] " secur alarm issu "
[79] "observ traffic deviat allow traffic"
[80] ""
[81] "apart choos type secur network"
[82] "administr also choos suitabl point deploy"
[83] " network infrastructur figur show"
[84] "simpl network infrastructur intern network "
[85] "client organ resid socal demilitar zone"
[86] "dmz servic access"
[87] " outsid organ entri dmz seal "
[88] "firewal act central point entri departur"
[89] "network traffic intern network dmz "
[90] "outsid world dmz act anoth safeti zone protect"
[91] "client organ servic insid dmz "
[92] "restrict access intern network thus one "
[93] "servic get compromis attack will abl spread insid"
[94] " intern network"
[95] ""
[96] " harden specif servic one can also deploy applic"
[97] "firewal like revers proxi shown"
[98] "figur servic act like protect"
[99] "web server ie request web server will first"
[100] "process revers proxi socal web applic"
[101] "firewal waf inspect incom request applic level"
[102] "sinc waf know request web server"
[103] "encod hypertext transfer protocol http detect"
[104] "malici behavior can take place finegrain level"
[105] ""
[106] " intrus detect system mere report suspici"
[107] "network traffic intrus system ip react"
[108] "real time suspici activ either"
[109] "traffic part request promin exampl"
[110] " system inlin deploy just"
[111] "like firewal figur waf"
[112] " act like revers proxi figur"
[113] "solut work rulebas detect engin defin"
[114] "negat secur ie unless rule insid"
[115] "system specif exploit these system abl protect"
[116] " infrastructur servic hand"
[117] ""
[118] "approach base negat secur model prone"
[119] " two major drawback first requir extens applic"
[120] "attack knowledg maintain reliabl rule base furthermor"
[121] " capabl detect known attack leav system wide"
[122] "open socal attack pattern"
[123] " avail suffic rule base simpli uptod"
[124] " order attack slip thus focus posit"
[125] "secur learn normal help circumv"
[126] " problem"
[127] ""
[128] " machin learn problem direct map domain"
[129] " learn use structur featur"
[130] "chapter can learn normal"
[131] "instanc observ distribut distanc centroid"
[132] " observ specif token new request"
[133] " can check token whether obtain distanc "
[134] "previous calcul centroid statist sound rang"
[135] "accord observ distribut sinc differ token web"
[136] "request exhibit differ behavior centroid might fit"
[137] " type token approach like markov model even simpl"
[138] "lookup tabl can often appli success"
[139] ""
[140] " chapter present method attain level"
[141] "respons flexibl waf name abil drop"
[142] " also heal malici request without relianc known"
[143] "pattern approach base anomali detect carri "
[144] " granular http request token contrast previous"
[145] "applic anomali detect web attack detect eg"
[146] ""
[147] " method detect react attack"
[148] ""
[149] " develop prototyp revers proxi call"
[150] "implement idea mangl coupl anomali detect "
[151] "prototyp http request pars tokenvalu pair "
[152] "compar learn profil normal content specif"
[153] "token token deviat typic profil "
[154] "replac appropri benign valu use tokenspecif heurist"
[155] ""
[156] " propos request heal techniqu simpl effect henc"
[157] " can effici implement deploy highspe"
[158] "network main advantag socal request"
[159] " simpl drop request decis made "
[160] "precis context specif token instead full request"
[161] "great improv detect accuraci verifi experiment"
[162] "comparison detect request level make decis"
[163] "faulttoler sinc replac content suitabl"
[164] "altern certain case harm even wrong"
[165] "classifi malici"
[166] ""
[167] " system high customiz especi reason"
[168] "automat mangl may desir eg privaci reason one"
[169] "may want automat replac user name password field"
[170] "valu howev case manual configur "
[171] "mandatori automat setup procedur provid "
[172] "decis engin advantag abil"
[173] "learn contamin enabl deploy"
[174] "minim manual effort"
[175] ""
[176] " summari contribut chapter"
[177] ""
[178] " propos web applic firewal decid base anomalybas model part request anomal need replac benign part observ past"
[179] " addit employ setup procedur assign type extract token statist featur"
[180] " clean attackfre need sinc "
[181] " learn model well test robust contamin"
[182] " addit attack need learn"
[183] " setup procedur"
[184] ""
[185] ""
[186] " remain part chapter organ follow"
[187] " introduc methodolog "
[188] "selfheal web applic firewal present prototyp"
[189] " evalu detect perform runtim use real"
[190] "http traffic relat work discuss"
[191] " conclus given"
[192] ""
[193] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file014.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[12]]
$content
[1] " token doctor"
[2] ""
[3] ""
[4] " automat heal malici web request entail follow"
[5] "two essenti task identif malici content "
[6] "construct replac content unlik regular intrus"
[7] "detect problem decid whether request"
[8] "malici identif problem complex sinc"
[9] "one need determin presenc malici content"
[10] "also locat prerequisit replac"
[11] ""
[12] " identif problem can address make decis "
[13] "refin context specif paramet http request"
[14] "contextdepend detect previous use"
[15] " uri paramet get request "
[16] " full http request content follow"
[17] "rough idea request pars tokenvalu"
[18] "pair request discard point"
[19] " instead combin score tokenbas model make"
[20] "decis request level decis made token"
[21] "level content token deem anomal"
[22] "appropri replac sought"
[23] ""
[24] ""
[25] ""
[26] ""
[27] " system act revers"
[28] " proxi intercept client request examin"
[29] " potenti alter request deliv product"
[30] " system"
[31] ""
[32] ""
[33] ""
[34] " divers web applic traffic content essenti rule"
[35] " possibl onesizefitsal traffic can"
[36] "learn henc attempt provid right anomali detect"
[37] "algorithm appropri heal action particular token"
[38] " result design compris follow three orthogon"
[39] "compon"
[40] ""
[41] ""
[42] " type base analysi real http traffic"
[43] " postul four token type describ characterist"
[44] " distribut token content"
[45] ""
[46] " detector sinc malici content can manifest"
[47] " various featur eg unusu length previous"
[48] " unseen attribut differ anomali detect algorithm can"
[49] " use captur attack employ anomali detector"
[50] " automat coupl particular token setup"
[51] " procedur present "
[52] ""
[53] " action particular transform token"
[54] " content denot heal action propos four"
[55] " heal action similar detector depend particular"
[56] " token type also configur setup procedur"
[57] ""
[58] ""
[59] " oper point view can character "
[60] "revers proxi similar function traffic normal"
[61] " intercept http client request "
[62] "pars accord http protocol specif potenti"
[63] "modifi request content relay traffic actual product"
[64] "system contrast system propos "
[65] " requir target product server run"
[66] "differ secur level securityrel decis"
[67] "encapsul proxi signific simplifi"
[68] "design network environ protect"
[69] ""
[70] " architectur shown figur"
[71] " exampl decis present "
[72] "figur token uri paramet deem"
[73] "normal remain unalt wherea anomal valu"
[74] " flag heal exampl valu"
[75] "automat replac previous seen benign highest"
[76] "similar ie string correspond sql inject attack"
[77] " replac benign string due occurr"
[78] " string attack"
[79] ""
[80] " follow section provid detail descript "
[81] "three orthogon compon follow present"
[82] " setup procedur tie three compon togeth"
[83] ""
[84] " type"
[85] ""
[86] ""
[87] "everi request receiv pars syntact part"
[88] "accord http specif store tokenvalu"
[89] "pair treat uri path paramet get post request"
[90] " header field separ token follow respect"
[91] "valu exampl request figur"
[92] "pars five token correspond paramet header"
[93] "denot insid revers proxi note request paramet"
[94] " differ name will present differ token"
[95] ""
[96] " distribut token content high divers token"
[97] "eg contain small number possibl valu"
[98] "sometim even constant valu token queri"
[99] " rang paramet contain wide varieti valu may"
[100] "even generat automat order systemat handl"
[101] " divers classifi token four type"
[102] "accord typic properti inform transmit http"
[103] "request"
[104] ""
[105] ""
[106] " simplest case valu token"
[107] " take valu exampl header field"
[108] " monitor particular web host"
[109] ""
[110] " second type token carri "
[111] " take small valu depend either http"
[112] " protocol web applic exampl token"
[113] " header"
[114] ""
[115] " input third type token compris"
[116] " machinegener session number identifi"
[117] " cooki"
[118] ""
[119] " input complex token type induc"
[120] " human input freetext field queri string comment"
[121] " name enter exhibit semant structur"
[122] " except generat natur languag"
[123] ""
[124] ""
[125] " characterist featur differ token type taken"
[126] " account choic anomali detect algorithm heal"
[127] "action"
[128] ""
[129] ""
[130] " detector"
[131] ""
[132] ""
[133] "anomali detect method wide studi protect"
[134] "web servic"
[135] ""
[136] "howev previous approach flag anomali full http request"
[137] " henc direct appli trigger finegrain"
[138] "action individu token deploy pertoken anomali"
[139] "detect algorithm propos howev"
[140] "decisionmak remain token level"
[141] ""
[142] " choic anomali detect method depend token"
[143] "type constant enumer token straightforward"
[144] "occurr check refer list detector natur choic"
[145] " given valu seen train deem"
[146] "anomal remain two token type deploy three differ"
[147] "detector describ decis detector "
[148] "appli specif token automat made setup"
[149] "process present "
[150] ""
[151] ""
[152] " centroid anomali detector ncad"
[153] ""
[154] ""
[155] "ngram model wide use secur applic"
[156] " deploy"
[157] "embed techniqu propos provid"
[158] "effici way ngram analysi"
[159] ""
[160] "given possibl ngram byte sequenc w"
[161] " 255 defin embed function w"
[162] " token valu x "
[163] ""
[164] " w"
[165] ""
[166] ""
[167] " return 1 ngram w"
[168] "contain x 0 otherwis result vector"
[169] "normal one elimin lengthdepend vector space"
[170] "induc embed ngram grow exponenti n"
[171] "howev spars linear length sequenc"
[172] "allow one effici construct compar embed vector"
[173] " byte sequenc detail "
[174] ""
[175] " embed function hand euclidean distanc"
[176] "embed vector can defin follow"
[177] ""
[178] "dexz w"
[179] ""
[180] ""
[181] "use distanc detect can perform comput"
[182] "distanc previous learn normal"
[183] ""
[184] ""
[185] ""
[186] " de"
[187] ""
[188] ""
[189] ""
[190] " vector construct train x"
[191] " xn arithmet mean respect embed"
[192] "vector threshold"
[193] " determin independ valid describ"
[194] ""
[195] ""
[196] " chain anomali detector mcad"
[197] " markov chain previous use sever"
[198] "secur applic"
[199] " use"
[200] " 256 possibl byte valu state markov chain 256"
[201] "possibl state transit see appendix"
[202] "detail state transit probabl can learn record"
[203] "transit frequenc byte bi bj train"
[204] " includ extra start state overal size "
[205] "transit tabl 2562 256 prohibit"
[206] "larg learn transit probabl can estim"
[207] " probabl token valu x length n base "
[208] "learn markov chain"
[209] ""
[210] "p px1 x1"
[211] " xi1xi xi"
[212] ""
[213] " xi correspond ith byte token valu x "
[214] " use length normal instanc appli geometr"
[215] "mean want detector take content length"
[216] " account equip tokenspecif markov chain "
[217] "threshold mcad new valu x defin follow"
[218] ""
[219] ""
[220] ""
[221] " p"
[222] ""
[223] ""
[224] ""
[225] ""
[226] " anomali detector lad often length"
[227] " token valu characterist attack exampl"
[228] " major buffer overflow attack exhibit long token"
[229] "valu properti address lad detector "
[230] " solut token insuffici amount"
[231] " render learn task involv train ncad mcad"
[232] "imposs therefor find solut can cope "
[233] "scarc situat modern robust statist provid us"
[234] "power tool special deal noisi even"
[235] "small sampl size especi small sampl size estim "
[236] "mean standard deviat use chebyshev inequ"
[237] "instanc can extrem outlier depend "
[238] " statist base bias estim can loos"
[239] ""
[240] "henc instead use base chebyshev inequ"
[241] "decid employ robust statist describ"
[242] " given predefin signific level"
[243] "estim 1 quantil length distribut"
[244] " train valid l name l1"
[245] " now construct confid interv l1"
[246] " first calcul bootstrap estim "
[247] "standard error l1 name"
[248] "determin paramet c interv"
[249] ""
[250] " l1 c l1 c"
[251] ""
[252] " probabl coverag 1 final choos"
[253] "upper bound confid interv threshold lad"
[254] "detector allow futur variabl result "
[255] "follow decis rule"
[256] ""
[257] ""
[258] ""
[259] " l1 c"
[260] ""
[261] ""
[262] ""
[263] ""
[264] " action"
[265] ""
[266] ""
[267] " finegrain detect token level allow us devis"
[268] "similar finegrain heal action henc automat respons"
[269] "mechan can less intrus accur action taken"
[270] " request level particular equip "
[271] "follow heal action"
[272] ""
[273] ""
[274] " token conserv"
[275] " respons spot anomal token valu remov"
[276] " token request notic still much"
[277] " benign action drop request use action"
[278] " token lad detector"
[279] ""
[280] " encod altern still"
[281] " conserv strategi encod anomal valu use html"
[282] " entiti approach make common web attack base crosssit"
[283] " script sql inject fail control punctuat"
[284] " charact escap action provid almost damag"
[285] " benign request mani web applic can resolv addit"
[286] " encod content"
[287] ""
[288] " frequent valu"
[289] " constant enumer token type natur heal action "
[290] " replac valu frequent normal valu "
[291] " token natur action assign token "
[292] " list detector"
[293] ""
[294] " nearest valu "
[295] " involv heal action replac anomal valu "
[296] " nearestneighbor train replac possibl"
[297] " due embed valu metric space introduc"
[298] " default action"
[299] " mcad ncad note sideeffect action can also"
[300] " correct typo userinput field"
[301] ""
[302] ""
[303] "clear four heal action tight coupl "
[304] "particular type consid token precis assign"
[305] " heal action token type present"
[306] " also allow administr"
[307] " tighten propos default action special token need"
[308] "extra protect like password file cooki"
[309] ""
[310] " tokdoc"
[311] ""
[312] ""
[313] ""
[314] ""
[315] " 12 0 10cliptruewidth75"
[316] " test procedur setup "
[317] " servicespecif split train test"
[318] " procedur decid token detector "
[319] " use exploit structur statist featur"
[320] " automat process total datadriven x denot train specif token "
[321] ""
[322] ""
[323] ""
[324] "sinc main compon base learn method"
[325] " setup depend avail initi corpus"
[326] "normal train valid initi suffici"
[327] "larg pool client request separ accord"
[328] "servic eg virtual host andor differ web servic"
[329] "allow servicespecif learn model pars"
[330] "use generat tokenspecif pool use follow"
[331] "phase amount chosen accord traffic"
[332] "volum widest possibl rang normal behavior"
[333] "cover note servicespecif split process "
[334] " improv prisma framework introduc"
[335] "chapter generat even focus"
[336] "statespecif instanc"
[337] ""
[338] " test framework depict figur determin"
[339] " token automat datadriven detector"
[340] " assign exploit structur statist"
[341] "featur use robust outlierresist statist procedur"
[342] "ensur meaning decis even dirti attacktaint"
[343] " set collect split two equal"
[344] "size part train pool use learn "
[345] " token threshold estim use valid"
[346] " semiautomat assign action outlier"
[347] "adjust threshold token system readi"
[348] " deploy learn alreadi discuss"
[349] " now describ detail"
[350] " part setup process"
[351] ""
[352] " datadriven detector assign depict"
[353] "figur step consist structur "
[354] "statist carri token "
[355] "origin start size procedur assign"
[356] "simpl lad detector train current test"
[357] "token contain 50 less sampl rational "
[358] " detector need reason amount estim"
[359] " model 50 sampl avail procedur"
[360] "check whether current token enumer observ"
[361] "less 10 uniqu valu token procedur test"
[362] "statist evid exploit well known"
[363] "first defin list c d train valid"
[364] "describ whether sampl valid "
[365] "observ train can defin function"
[366] " return pvalu "
[367] "whether c generat binomi variabl generat"
[368] "fals probabl true probabl 1"
[369] " now can determin maxim bare"
[370] "support accept hypothesi c generat"
[371] " given signific level"
[372] ""
[373] ""
[374] ""
[375] ""
[376] " valu impress possibl nonmatch"
[377] "occurr token might occur futur"
[378] "similar can interpret upper bound confid"
[379] "interv empir observ thus can use"
[380] "valu threshold expect falseposit per token "
[381] "list type"
[382] ""
[383] " decid ncad mcad procedur first look"
[384] " structur featur name median length token sinc"
[385] " ncad detector base 2gram detector need least two"
[386] "charact calcul meaning mean distanc "
[387] "token pass structur procedur focus "
[388] "statist properti observ given centroid"
[389] "largest distanc bound d"
[390] " 1 sinc normal length"
[391] "one use kernel densiti estim valid see"
[392] " instanc detail can measur"
[393] "bound probabl maxim distanc ever attain"
[394] "formal px d 0"
[395] ""
[396] " ncad mcad need threshold oper sinc"
[397] "model focus specif token can choos relat"
[398] "relax threshold polici propos use maxim distanc"
[399] " ncad minim probabl mcad semiautomat"
[400] "outlier adjust valu valid "
[401] " order accord output detector descend"
[402] "distanc mean ncad ascend probabl mcad"
[403] " administr decid whether extrem valu real"
[404] "usergener sampl malici token valu exampl"
[405] "sort probabl figur clear show"
[406] " extrem three valu induc malici input"
[407] "therefor administr adjust threshold first"
[408] "usergener request"
[409] " procedur administr addit can check"
[410] "qualiti assign detector see whether chosen"
[411] "fit actual "
[412] ""
[413] " addit can address privaci secur issu refin"
[414] "assign action administr can manual adjust whether"
[415] "token heal drop complet instanc privaci"
[416] "relat cooki password must replac "
[417] "nearest counterpart instead drop complet"
[418] "prevent potenti abus like session password hijack"
[419] ""
[420] " system produc fals posit deploy can"
[421] "track token caus fals alarm thus"
[422] "administr can focus specif token can reconfigur"
[423] "system accord incid case websit"
[424] "restructur new servic deploy may "
[425] " adjust accord potenti lead retrain "
[426] "token model"
[427] ""
[428] ""
[429] ""
[430] ""
[431] " setup consol automat detector assign administr check calcul threshold outlier"
[432] ""
[433] ""
[434] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file015.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[13]]
$content
[1] ""
[2] ""
[3] ""
[4] "evalu intrus prevent system multifacet"
[5] "task sinc effect respons action inher depend"
[6] " accuraci malici content identif first evalu"
[7] " accuraci detector compar overal perform"
[8] " stateoftheart method check realtim readi"
[9] " runtim assess compar proxi"
[10] ""
[11] " perform"
[12] ""
[13] " evalu detect perform collect network"
[14] "trace two differ internet domain first first08"
[15] "compris 60 day traffic 1452122 http request record"
[16] " web server research institut"
[17] " 2008 server provid static content well dynam page"
[18] "use content manag system second"
[19] " blog09 cover 33day traffic 1181941 request"
[20] " obtain domain run various weblog 2009"
[21] "blog run popular publish platform"
[22] " evalu set split three equal size"
[23] "part train valid test due differ web"
[24] "applic amount monitor token well assign"
[25] " anomali detector differ set "
[26] "configur use set present tabl"
[27] ""
[28] ""
[29] ""
[30] ""
[31] ""
[32] ""
[33] " detector first08"
[34] ""
[35] " categori list lad"
[36] " mcad ncad"
[37] ""
[38] " header 14 14 5 10 43"
[39] " paramet 9 3 4 16"
[40] " path 1 1"
[41] ""
[42] " 23 17 10 10 60"
[43] " detector blog09"
[44] ""
[45] " categori list lad"
[46] " mcad ncad"
[47] ""
[48] " header 22 77 15 17 131"
[49] " paramet 14 166 28 7 215"
[50] " path 1 1"
[51] ""
[52] " 36 243 44 24 347"
[53] ""
[54] ""
[55] " configur use experi column categori summar token token origin header paramet queri path token introduc "
[56] ""
[57] ""
[58] ""
[59] " addit regular network traffic collect network"
[60] "attack base 35 exploit obtain framework"
[61] " well common secur archiv milw0rm packet"
[62] "storm bugtraq attack execut virtual"
[63] "environ thorough adapt characterist two"
[64] " set detail list consid attack exploit"
[65] " given tabl result variat"
[66] " record eg usag differ shellcod encod sql"
[67] "statement attack pool contain 89 attack instanc first08"
[68] " 97 attack blog09"
[69] ""
[70] ""
[71] ""
[72] " l l"
[73] ""
[74] " cve milworm token detector"
[75] ""
[76] " buffer overflow attack"
[77] ""
[78] " mcad near 282930"
[79] " mcad near 34353637"
[80] " protocol viol parser struct drop 313233"
[81] " protocol viol parser bodi drop 456"
[82] " param normal drop 0123"
[83] " protocol viol parser uri drop 2425"
[84] " protocol viol parser struct drop 2627"
[85] " protocol viol parser uri drop 71727374"
[86] " param normal drop 636465"
[87] " protocol viol parser uri drop 49505152"
[88] " mcad near 4546"
[89] " mcad near 686970"
[90] " list freq 20212223"
[91] " ncad near 2023"
[92] ""
[93] " code inject attack"
[94] ""
[95] " param normal drop 10111213"
[96] " param normal drop 56575859"
[97] " mcad drop 606162"
[98] " param normal drop 171819"
[99] " mcad drop 4748"
[100] " mcad near 789"
[101] " php param normal drop 8285"
[102] " sql param s mcad enc"
[103] ""
[104] " wordpress attack"
[105] ""
[106] " mcad drop 101"
[107] " param mcad near 102 103"
[108] " mcad drop 105"
[109] " param mcad near 104"
[110] " param normal drop 108"
[111] " param normal drop 110"
[112] " param lad drop 109"
[113] " param normal drop 107"
[114] " param lad drop 106"
[115] " ncad near 100"
[116] ""
[117] " miscellan attack"
[118] ""
[119] " httptunnel protocol viol parser struct drop 868788"
[120] " protocol viol parser uri drop 6667"
[121] " protocol viol parser struct drop 535455"
[122] " xsssql param mcad enc 75767778"
[123] ""
[124] ""
[125] " http attack"
[126] "tabl http exploit attack execut differ variant"
[127] " list token attack locat detector case token never seen normal request drop token"
[128] ""
[129] ""
[130] ""
[131] " learner request semant"
[132] ""
[133] "first want check whether differ detector model"
[134] "realli necessari construct special instanc"
[135] " just lad detector instead mcad ncad"
[136] "refer just ncad ie mcad"
[137] " replac ncad refer mcad"
[138] "ie ncad replac mcad refer "
[139] " evalu instanc "
[140] " first08 blog09 reject request manual"
[141] "check label fals true posit case doubt"
[142] "request replay origin server follow first"
[143] "use unmodifi request save repli server "
[144] "compar repli server send request modifi"
[145] " differ count request fals"
[146] "posit complet replay process observ"
[147] " sever drastic repli server indic malform"
[148] "even malici request prove inher request"
[149] "semant harm action "
[150] ""
[151] ""
[152] ""
[153] ""
[154] " detector fp tp fn"
[155] ""
[156] " first08 000002 0 000000"
[157] " 000000 0 002247"
[158] " 000001 0 000000"
[159] " 000002 0 022472"
[160] ""
[161] " blog09 000003 212 004124"
[162] " 000001 68 015464"
[163] " 000009 186 004124"
[164] " 000003 0 022680"
[165] ""
[166] ""
[167] " perform sever instanc fp falseposit rate tp attack found normal traffic fn falseneg rate"
[168] ""
[169] ""
[170] " result summar tabl first"
[171] "thing notic overal low falseposit rate "
[172] "direct result addit pars local decis make"
[173] " closer look reveal "
[174] " suffer high falseneg rate "
[175] " perform equal good first08 "
[176] "fall behind involv blog09 plain"
[177] " divers model method"
[178] "perform near ident set also capabl"
[179] "detect true posit taint blog09 "
[180] "trend confirm tabl"
[181] "present detector necessari disarm use attack note"
[182] " malici part attack spread throughout differ"
[183] "token render approach even"
[184] "valuabl addit heal action employ "
[185] "save rough 00001 discard fals"
[186] "posit blog09 first08 summari"
[187] "result show just full varieti model embodi"
[188] " lead overal good perform keep"
[189] "general request semant intact"
[190] ""
[191] " detector"
[192] " baselin detect perform consid two"
[193] "stateoftheart anomali detect techniqu use raw http request"
[194] "payload input chain detector use markov"
[195] "chain describ full content"
[196] " request anomali detect learn train"
[197] " similar calibr use valid"
[198] "partit second baselin implement variant"
[199] " detector store ngram"
[200] "benign http request bloom filter use ratio unknown"
[201] "ngram incom request anomali score detector"
[202] "calibr valid n fix 2"
[203] ""
[204] " result evalu summar"
[205] "tabl first08 blog09 report"
[206] " equal falseposit rate detector"
[207] "calibr trueposit rate "
[208] " rate miss regular attack detector"
[209] " calibr falseposit rate "
[210] ""
[211] ""
[212] ""
[213] ""
[214] " detector"
[215] ""
[216] " first08 000002 000000"
[217] " markov chain 002005 080899"
[218] " anagram 000004 016854"
[219] ""
[220] " blog09 000003 004124"
[221] " markov chain 016698 018557"
[222] " anagram 100000 039175"
[223] ""
[224] ""
[225] " perform payloadbas anomali detector falseposit rate detector calibr trueposit rate rate miss regular attack detector calibr falseposit rate "
[226] ""
[227] ""
[228] "focus first08 see anagram"
[229] "yield accept falseposit rate howev anagram much"
[230] "porous near 17 attack detect contrari"
[231] " capabl detect attack attain even"
[232] "lower falseposit rate anagram markov chain"
[233] "simpli overburden first08 even blog09"
[234] " falseposit rate rise unaccept"
[235] "17 surpris anagram break blog09 "
[236] " calibr trueposit anagram flag"
[237] " legitim request anomal due fact"
[238] " 23 attack anomali score 0 "
[239] "smallest possibl score attain therefor tag incom"
[240] "request anomal even calibr anagram 23"
[241] "falseneg rate rough 8 time higher "
[242] "setup yield still falseposit rate 000038 "
[243] "magnitud higher number clear demonstr"
[244] "outstand perform term fals posit"
[245] " negat even hard set like blog09"
[246] ""
[247] " perform"
[248] ""
[249] " deliv inlin intrus prevent system "
[250] "reason fast sinc everi client request pass revers"
[251] "proxi without intoler delay part subject"
[252] " prototyp stress see whether can use "
[253] "realtim scenario"
[254] ""
[255] " prototyp implement use "
[256] "framework framework provid matur interfac number"
[257] "network protocol reus proxi modul integr"
[258] "optim ngram c librari abl produc"
[259] "fullfledg prototyp system replay complet"
[260] "test slice first08 blog09 approxim 500k"
[261] "request get stabl estim process time "
[262] "baselin measur process time "
[263] "proxi second consid web applic firewal"
[264] " minim setup rule assess addit pars"
[265] "affect process time request furthermor use "
[266] "simpl forward proxi applic implement "
[267] "framework see much framework impos "
[268] "process runtim final "
[269] "environ median runtim setup present"
[270] "tabl"
[271] ""
[272] ""
[273] ""
[274] ""
[275] ""
[276] " proxi"
[277] ""
[278] " modsec"
[279] ""
[280] " first08 1387 1536 2552 2768"
[281] " blog09 1500 1694 2430 2902"
[282] ""
[283] ""
[284] " runtim millisecond differ proxi"
[285] " first08 blog09"
[286] ""
[287] ""
[288] ""
[289] "first can observ two set exhibit differ"
[290] "baselin general first08 seem simpler"
[291] "structur compar blog09 furthermor high"
[292] "optim rough 1 ms per request faster"
[293] "compar equival look "
[294] "interappl differ can observ increas 01 ms"
[295] " 02 ms 02 ms 05 ms "
[296] " proxi respect impli "
[297] "addit anomali detect method employ just add"
[298] " rough 01 ms 03 ms per request experi clear"
[299] "demonstr still room improv term"
[300] " runtim anomali detect method use suitabl"
[301] " run inlin system even current"
[302] "unoptim state alreadi can use intrus prevent"
[303] "system"
[304] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file016.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[14]]
$content
[1] " work"
[2] ""
[3] ""
[4] " automat protect web applic gain increas"
[5] "attent among secur research convent id "
[6] " reli specif attack"
[7] "signatur predefin attack characterist provid"
[8] "adequ time protect dynam chang web attack"
[9] "anomali detect techniqu base payload analysi exampl"
[10] ""
[11] "provid possibl detect previous unknown attack"
[12] " approach enabl protect differ network servic"
[13] "attain suffici throughput rate yet lack protocol context"
[14] " analysi restrict use intrus prevent"
[15] "simpl drop redirect packet"
[16] ""
[17] "first protocolawar method detect attack web traffic"
[18] "use anomali detect propos "
[19] "extend ensu work"
[20] " main idea"
[21] " method combin multipl anomali detector"
[22] " length check byte distribut hidden markov model appli"
[23] " individu uri paramet similar finit state automata"
[24] " multipl markov chain"
[25] " recent propos detect anomal http request"
[26] " approach differ method detect"
[27] "anomal content individu token instead combin"
[28] "tokenlevel anomali estim judg anomali complet"
[29] "request finegrain detect enabl us devis novel token"
[30] "heal action much less disrupt request drop"
[31] ""
[32] "anoth line research combin network anomali detect host"
[33] "monitor propos system "
[34] "anomal request execut special instrument shadow"
[35] "honeypot system feedback whether request actual harm"
[36] "system can use updat anomali detector similar"
[37] " work line idea"
[38] " combin sql attack detect revers proxi"
[39] "forward request web server manag differ level"
[40] "sensit inform depend anomali valu web"
[41] "request sql queri anomali detector host decid whether"
[42] " model web request anomali detector updat"
[43] " request result malici databas queri "
[44] "requir addit host instrument serv "
[45] "transpar proxi great simplifi practic deploy"
[46] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file017.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$content[[15]]
$content
[1] " conclus"
[2] ""
[3] ""
[4] " introduc protocolawar revers proxi "
[5] "capabl decid token level part request"
[6] " deem normal anomal sever intellig mangl"
[7] "strategi anomal token apart just drop "
[8] " describ experi realworld set demonstr"
[9] "use approach runtim measur show readi"
[10] " inlin intrus prevent"
[11] ""
[12] " prototyp show good perform especi term"
[13] "fals negat awar sever extens can improv"
[14] " extend system practic consider includ"
[15] "integr platform"
[16] " valuabl step toward runtim improv coupl"
[17] " system shadow system propos"
[18] " incorpor feedback loop "
[19] "similar vein combin learn"
[20] "techniqu anoth promis extens case "
[21] " abl leverag knowledg under protocol "
[22] "effici preprocess pars incom request"
[23] "accord http grammar complex web applic"
[24] "web server serv multitud servic might "
[25] "signific overlap differ use token type accord "
[26] "under state servic differ type servic"
[27] "host thus applic method chapter"
[28] " layer respons mechan benefici"
[29] " session track state servic accord "
[30] "infer abstract state machin anomali model sequenc "
[31] "depend state even finer distinct abnorm"
[32] "behavior achiev mechan also amount "
[33] "integr sessionawar long term memori "
[34] " use flag user session lot"
[35] "anomal token seen danger use"
[36] "reliabl close suspici connect even track attack"
[37] "distribut sever request"
[38] ""
[39] "overal system proven promis fullfledg"
[40] "web applic firewal present state capabl"
[41] "effect prevent heal wide rang recent"
[42] "webbas attack runtim perform make readili applic"
[43] " protect modern web applic see layer"
[44] "structur process toolchain pars request"
[45] " emb token specif chosen local represent"
[46] " enabl subsequ check statist signific"
[47] "aberr learn normal model"
[48] "approach even taken step sens map back"
[49] " outcom probabilist meaning"
[50] "healingreact act transform inform"
[51] " probabilist domain back real world layer"
[52] "structur approach permit stepwis develop"
[53] "success applic probabilist method realworld"
[54] "problem"
[55] ""
[56] ""
[57] ""
[58] ""
[59] ""
[60] ""
[61] " chapter summar find thesi start"
[62] "review main part analysi detect respons"
[63] "framework develop preced chapter inspect"
[64] " individu compon framework relat "
[65] "frequentist view statist gain insight "
[66] "precondit applic postcondit probabilist"
[67] "method network secur conclud chapter outlook"
[68] " implic find relat classic"
[69] "insight softwar engin potenti pitfal"
[70] "merit realworld applic"
[71] ""
$meta
$author
character(0)
$datetimestamp
[1] "2014-06-11 21:28:01 GMT"
$description
character(0)
$heading
character(0)
$id
[1] "file018.tex"
$language
[1] "en"
$origin
character(0)
attr(,"class")
[1] "TextDocumentMeta"
attr(,"class")
[1] "PlainTextDocument" "TextDocument"
$meta
list()
attr(,"class")
[1] "CorpusMeta"
$dmeta
data frame with 0 columns and 15 rows
attr(,"class")
[1] "VCorpus" "Corpus"
> thesis = corpusToPrisma(thesis, NULL, TRUE)
Loading required package: tm
Error: inherits(doc, "TextDocument") is not TRUE
Execution halted
Flavor: r-oldrel-windows-ix86+x86_64