Figure2:Theadaptiveimportan esamplingforBayesianNetworks(AIS-BN)algorithm. s orepro essinginStep10is w iSore =w k Pr(s i ;e) Pr k (s i ) : Note that inthisrespe t the algorithminFigure1 be omes a spe ial ase of AIS-BN whenw k = 1. Thereasonwhyweusew k isthat wewantto givedi�erentweightsto thesamplingresultsobtainedatdi�erentstagesofthealgorithm. Aseahstageupdates theimportan efun tion,theywillallhavedi�erentdistan efromtheoptimalimportan e fun tion.Were ommendthatw k /1= b � k ,where b � k isthestandarddeviationestimatedin stagekusingEquation3. 1 Inordertokeepw k monotoni allyin reasing,ifw k issmaller thanw k�1 ,weadjustitsvaluetow k�1 . Thisweightingshememayintrodu ebiasinto
Figure3:TheAIS-BNalgorithmforlearningtheoptimalimportan efun tion. 167
Figure4:Theprobabilitydistributionofeviden ePr(E=e)inourexperiments. Wegeneratedatotalof75test ases onsistingof�vesequen esof15test aseseah.We raneahtest ase10times,eahtimewithadi�erentsettingoftherandomnumberseed. Eahsequen ehadaprogressivelyhighernumberofeviden enodes: 15,20,25,30,and 35eviden enodesrespe tively. Theeviden enodeswere hosenrandomly(equiprobable samplingwithoutrepla ement)fromthosenodesthatdes ribedvariousplausiblemedi al 173
Figure5: Atypi alplotof onvergen eofthetestedsamplingalgorithmsinourexperiments |MeanSquareErrorasafun tionoftheexe utiontimeforasubsetofthe CPCSnetworkwith20eviden enodes hosenrandomlyamongplausiblemedi al observations(Pr(E=e)=3:33�10 �26 inthisparti ular ase)fortheAIS-BN, theSIS,andtheLWalgorithms. The urvefortheAIS-BNalgorithmisvery losetothehorizontalaxis. Figures5and6showatypi alplotof onvergen eofthetestedsamplingalgorithmsin ourexperiments.The aseillustratedinvolvesupdatingtheCPCSnetworkwith20eviden e nodes. We plottheMSEaftertheinitial15se ondsduringwhihthealgorithmsstart onverging. Inparti ular,thelearningstepoftheAIS-BNalgorithmisusually ompleted withinthe�rst9se onds. Weranthethreealgorithmsinthis asefor150se ondsrather thanthe60se ondsinthea tualexperimentinordertobeabletoobserveawiderrangeof onvergen e. TheplotoftheMSEfortheAIS-BNalgorithmalmosttouhestheXaxisin Figure5. Figure6showsthesameplotina�ners aleinordertoshowmoredetailinthe AIS-BN onvergen e urve. Itis learthattheAIS-BNalgorithmdramati allyimproves the onvergen erate.We analsoseethattheresultsofAIS-BN onvergetoexa tresults veryfastasthesamplingtimein reases.Inthe ase apturedinFigures5and6,atenfold in reaseinthesamplingtime(aftersubtra tingtheoverheadfortheAIS-BNalgorithm,it orrespondstoa21.5-foldin reaseinthenumberofsamples)resultsina4.55-foldde rease 174
Figure6: ThelowerpartoftheplotofFigure5showingthe onvergen eoftheAIS-BN algorithmto orre tposteriorprobabilities. oftheMSE(toMSE�0:00048). Theobserved onvergen eofbothSISandLWalgorithms waspoor.Atenfoldin reaseinsamplingtimehadpra ti allynoe�e tona ura y. Please notethatthisisaverytypi al aseobservedinourexperiments. Table1: Afragmentofthe onditionalprobabilitytableofanodeoftheCPCSnetwork (nodegasAute,parentshepAute=MildandwbTotTho=False)inFigure6. Figure7illustratestheICPTlearningpro essoftheAIS-BNalgorithmforthesample aseshowninFigure6.Thedisplayed onditionalprobabilitiesbelongtothenodegasAute whih is a parent of two eviden e nodes, difInfGasMu and abdPaiExaMea. The node gasAutehasfourstates: \absent," \mild,"\moderate," and\severe", andtwoparents. Werandomly hosea ombinationofitsparents'statesasourdisplayed on�guration.The originalCPTforthis on�gurationwithouteviden e,theexa tICPTwitheviden eand thelearnedICPTwitheviden earesummarizednumeri allyinTable1.Figure7illustrates thatthelearnedimportan e onditionalprobabilitiesbeginto onvergetotheexa tresults 175
Figure7: Convergen eofthe onditionalprobabilitiesduringtheexamplerunoftheAISBNalgorithm apturedinFigure6. Thedisplayedfragmentofthe onditional probabilitytablebelongstonodegasAutewhihisaparentofoneoftheeviden e nodes. stablyafterthreeupdatingsteps. ThelearnedprobabilitiesinStep10are loseto the exa tresults.Inthisexample,thedi�eren ebetweenPr(x i jpa(X i );e)andPr(x i jpa(X i ))is verylarge. SamplingfromPr(x i jpa(X i ))insteadofPr(x i jpa(X i );e)wouldintrodu elarge varian eintoourresults. Table2: Summaryofthesimulationresultsforallofthe75simulation asesontheCPCS network. Figure8showseahofthe75 asesgraphi ally. Figure8showstheMSEforall75test ases inourexperimentswiththesummary statisti sinTable2. Apairedone-tailedt-testresultedinstatisti allyhighlysigni� ant di�eren esbetweentheAIS-BNandSISalgorithms(p<3:1�10 �20 ),andalsobetween theSISandLWalgorithms(p<1:7�10 �8 ). Asfarasthemagnitudeofdi�eren eis 176
Figure8: Performan eoftheAIS-BN,SIS,andLWalgorithms: MeanSquareErrorfor eahofthe75individualtest asesplottedagainsttheprobabilityofeviden e. Thesamplingtimeis60se onds. on erned,AIS-BNwassigni� antlybetterthanSIS.SISwasbetterthanLW,butthe di�eren ewassmall. ThemeanMSEsofSISandLWalgorithmswerebothgreaterthan 0:1,whihsuggeststhatneitherofthesealgorithmsissuitableforlargeBayesiannetworks. ThegraphinFigure9showstheMSEratiobetweentheAIS-BNandSISalgorithms. We anseethattheper entageofthe aseswhoseratiowasgreaterthan100(twoorders ofmagnitudeimprovement!)is60%.Inotherwords,weobtainedtwoordersofmagnitude improvementinMSEinmorethanhalfofthe ases. In80% ases,theratiowasgreater than50. Thesmallestratioinourexperimentswas2.67,whihhappenedwhenposterior probabilitiesweredominatedbythepriorprobabilities.Inthat ase,eventhoughtheLW andSISalgorithms onvergedveryfast,theirMSEwasstillfarlargerthanthatofAIS-BN. Ournextexperimentaimedatshowinghow losetheAIS-BNalgorithm anapproah thebestpossiblesamplingresults. Ifweknowtheoptimalimportan esamplingfun tion, the onvergen eoftheAIS-BNalgorithmshouldbethesameasthatofforwardsampling withouteviden e. Inotherwords,theresultsoftheprobabilisti logi samplingalgorithm withouteviden eapproahthelimitofhowwellstohasti sampling anperform. Weran thelogi samplingalgorithmontheCPCSnetworkwithouteviden emimikingthetest runsoftheAIS-BNalgorithm,i.e., 5bloksof15runs,eahrepeated10timeswitha di�erentrandomnumberseed.Thenumberofsamplesgeneratedwasequaltotheaverage numberofsamplesgenerated bytheAIS-BNalgorithmforeah seriesof15 test runs. We obtained the average MSE � = 0:00057, with � = 0:000025, min = 0:00052, and 177
Figure9:TheratioofMSEbetweenSISandAIS-BNversusper entage. max = 0:00065. Thebest resultsshouldbearoundthisrange. From Table2, we an seethattheminimumMSEfortheAIS-BNalgorithmwas0:00049, withintherangeof theoptimalresult. ThemeanMSEinAIS-BNis0:00082, nottoofarfromtheoptimal results. Thestandarddeviation,�,issigni� antlylargerintheAIS-BNalgorithm,but thisisunderstandablegiventhatthepro essoflearningtheoptimalimportan efun tionis heuristi innature.ItisnotdiÆ ulttounderstandthatthereexistadi�eren ebetweenthe AIS-BNresultsandtheoptimalresults.First,theAIS-BNalgorithminourtestsupdated thesamplingdistributiononly10times, whihmaybetoofewtimesto letit onverge totheoptimalimportan edistribution. Se ond, evenifthealgorithmhas onvergedto theoptimalimportan edistribution,thesamplingalgorithmwillstilllettheparameter os illatearoundthisdistributionandtherewillbealwayssmalldi�eren esbetweenthetwo distributions. Figure 10 shows the onvergen e rate for all tested ases for a four-foldin rease in samplingtime (between 15 and 60 se onds). We adjustedthe onvergen e ratio of the AIS-BNalgorithmbydividingitbya onstant.A ordingtoEquation3,thetheoreti ally expe ted onvergen e ratio for a four-foldin rease inthe numberof samplesshouldbe around two. There are about 96% ases among the AIS-BN runs whose ratio lays in theinterval(1.75, 2.25℄, inasharp ontrastto11%and13% asesintheSISandLW algorithms.Theratiosoftheremaining4% asesinAIS-BNlayintheinterval[2.25,2.5℄. IntheSISandLWalgorithms,theper entageof aseswhoseratioweresmallerthan1.5 was71%and77%respe tively. Lessthan1.5meansthatthenumberofsampleswastoo smalltoestimatevarian eandtheresults annotbetrusted. Theratiogreaterthan2.25 178
Figure10: Thedistributionofthe onvergen eratiooftheAIS-BN,SIS,andLWalgorithmswhenthenumberofsamplesin reasesfourtimes. meanspossiblythat60se ondswaslongenoughtoestimatethevarian e,but15se onds wastooshort. 4.3 TheRoleofAIS-BNHeuristi sinPerforman eImprovement Fromtheaboveexperimentalresultswe anseethattheAIS-BNalgorithm animprove thesamplingperforman esigni� antly.Ournextseriesoftestsfo usedonstudyingtherole ofthetwoAIS-BNinitializationheuristi s.The�rstisinitializingtheICPTtablesofthe parentsofeviden etouniformdistributions,denotedbyU.These ondisadjustingsmall probabilities,denotedbyS.WedenoteAIS-BNwithoutanyheuristi initializationmethod tobetheAISalgorithm.AIS+U+SequalsAIS-BN.We omparedthefollowingversions ofthealgorithms:SIS,AIS,SIS+U,AIS+U,SIS+S,AIS+S,SIS+U+S,AIS+U+S.All algorithmswithSISusedthesamenumberofsamplesasSIS.AllalgorithmswithAISused thesamenumberofsamplesasAIS-BN.Wetestedthesealgorithmsonthesame75test asesusedinthepreviousexperiment. Figure11showstheMSEforeahofthesampling algorithmswiththesummarystatisti sinTable3.EventhoughtheAISalgorithmisbetter thantheSISalgorithm,thedi�eren eisnotaslargeasin aseoftheAIS+U,AIS+S,and AIS-BNalgorithms.Itseemsthatheuristi initializationmethodshelpmuh.Theresults fortheSIS+S;SIS+U,SIS+U+Salgorithmssuggestthatalthoughheuristi initialization methods animproveperforman e,theyalone annotimprovetoomuh. Itisfairtosay thatsigni� antperforman eimprovementintheAIS-BNalgorithmis omingfromthe ombinationofAISwithheuristi methods,notanymethodalone. ItisnotdiÆ ultto 179
Figure11: A omparisonofdi�erentalgorithmsintheCPCSnetwork. Eahbarisbased on75test ases. ThedottedbarshowstheMSEfortheSISalgorithmwhile thegraybarshowstheMSEfortheAISalgorithm. Table3: Summaryofthesimulationresultsfordi�erentalgorithmsintheCPCSnetwork. 4.4 ResultsforOtherNetworks InordertomakesurethattheAIS-BNalgorithmperformswellingeneral,wetestediton twootherlargenetworks. The�rstnetworkthatweusedinourtestsisthePathFindernetwork(Hekerman etal.,1990),whihisthe oreelementofanexpertsystemthatassistssurgi alpathologists 180