1. Guiding search in automated theorem proving
- Author
-
Nikolić, Mladen S., Janičić, Predrag, Marić, Filip, and Ognjanović, Zoran
- Subjects
search ,istraživanje podataka ,koherentna logika ,automatsko dokazivanje teorema ,SAT rešavači ,SAT solvers ,automated theorem proving ,data mining ,coherent logic ,pretraga - Abstract
U ovom radu se razmatra problem usmeravanja pretrage u automatskom dokazivanju teorema. Rad se sastoji od dva dela čija je dodirna tačka CDCL sistem pretrage, koji se intenzivno koristi kod modernih SAT rešavača. U prvom delu rada razmatran je problem jednostavnog usmeravanja pretrage | izborom rešavača, njegovih heuristika i njihovih parametara, a u zavisnosti od svojstava instance koju je potrebno rešiti. Osnova predloženih metoda za izbor algoritama je sintaksna sličnost formula koja se odražava na njihovu grafovsku strukturu. Ova sličnost je prvi put pouzdano ustanovljena i analizirana pomoću originalne mere sličnosti grafova (koja se pokazala korisnom i u drugim domenima). Praktični pristupi merenju sličnosti formula se zbog računske efikasnosti ipak zasnivaju na numeričkim atributima iskaznih formula. Predložene su dve jednostavne metode izbora algoritma zasnovane na algoritmu k najbližih suseda. Prva tehnika, ArgoSmArT se zasniva na klasifikaciji instance u jednu od unapred zadatih familija za koje su poznati algoritmi koji ih efikasno rešavaju. Instanca se rešava algoritmom koji odgovara familiji u koju je instanca klasifikovana. Druga tehnika, ArgoSmArT k-NN se zasniva na nalaženju nekoliko sličnih instanci u trening skupu za koje je poznato vreme rešavanja pomoću svih algoritama kojima sistem raspolaže. Instanca se rešava algoritmom koji se najbolje ponaša na pronađenim instancama. Tehnika ArgoSmArT je pogodnija za izbor konfiguracije SAT rešavača, a ArgoSmArT k-NN za izbor samog SAT rešavača. Tehnika ArgoSmArT k-NN se pokazala značajno efikasnijom od najvažnijeg i pritom vrlo složenog sistema za izbor SAT rešavača i sistema SATzilla. Pored problema izbora KNF SAT rešavača i njihovih heuristika, razmatran je i problem izbora ne-KNF SAT rešavača u kojem fokus nije bio na tehnikama izbora rešavača, pošto se predložene tehnike direktno primenjuju i na taj problem, već na atributima kojima se ne-KNF instance mogu opisati, a koji do sad nisu predloženi. Rezultati u ovom domenu su pozitivni, ali za sada ograničeni. Osnovni razlog za to je nedostatak veceg broja ne-KNF resavaca raznovrsnog ponasanja, sto ne iznenaduje s obzirom da je ova vrsta resavaca tek u svom povoju. Pored konstrukcije ekasnog sistema za izbor SAT resavaca, prikazana je i metodologija poredenja SAT resavaca zasnovana na statistickom testiranju hipoteza. Potreba za ovakvom metodologijom proizilazi iz velike varijacije vremena resavanja jedne formule od strane jednog SAT resavaca, sto moze dovesti do razlicitih redosleda SAT resavaca prilikom poredenja njihovih performansi ili rangiranja, sto je i eksperimentalno demonstrirano. Predlozena metodologija pruza ocenu statisti cke znacajnosti testiranja i ocenu velicine efekta, poput verovatnoce da jedan SAT resavac bude brzi od drugog... In this thesis the problem of guiding search in automated theorem proving is considered. The thesis consists of two parts that have the CDCL search system, the system intensively used by modern SAT solvers, as their common topic. In the rst part of the thesis a simple approach to guiding search is considered | guiding by the selection of the solver, its heuristics, and their parameters, based on the properties of an instance to be solved. The basis of the proposed methods for algorithm selection is syntactical similarity of formulae which is re ected in their graph structure. This graph similarity is established and analyzed by using an original graph similarity measure (which turned out to be useful in other contexts, too). Yet, practical approaches to measuring similarity of formulae are based on their numerical features due to the computational complexity issues. Two simple methods for algorithm selection, based on k nearest neighbors, were proposed. The rst technique, ArgoSmArT is based on classication of instance in one of the predened families for which the ecient algorithms are known. The instance is solved by algorithm corresponding to the family to which the instance was classied. The second technique, ArgoSmArT k-NN is based on nding several similar instances in the training set for which the solving times by all considered algorithms are known. The instance is solved by the algorithm that behaves the best on those instances. ArgoSmArT technique is better suited for conguration selection of a SAT solver, and ArgoSmArT k-NN for SAT solver selection. ArgoSmArT k-NN technique showed to be more ecient than the most important and very complex system for SAT solver selection | SATzilla system. Apart from CNF SAT solver selection, the problem of non-CNF SAT solver selection is considered. The focus was not on solver selection techniques, since the proposed techniques are directly applicable, but on the attributes that can be used to describe non-CNF SAT instances, which have not been proposed earlier. The results in this domain are positive, but still limited. The main reason for that is the lack of greater number of non-CNF SAT solver of dierent behaviour, which is not surprising, having in mind that this kind of solvers is in its early stage of development. Apart from construction of ecient SAT solver selection system, the methodology of SAT solver comparison, based on statistical hypothesis testing is proposed. The need for such a methodology comes from great run time variations of single instance solving by a solver, which can result in dierent SAT solver orderings when one tries to compare their performance or rank them, as experimentally demonstrated. The proposed methodology gives the estimate of statistical signicance of the performed test and the estimate of the eect size, for instance the probability of a solver being faster than another...
- Published
- 2013