Show simple item record

dc.contributor.advisorDiakonikolas, Iliasen
dc.contributor.advisorSanthanam, Rahulen
dc.contributor.authorNikishkin, Vladimiren
dc.date.accessioned2017-11-02T15:22:33Z
dc.date.available2017-11-02T15:22:33Z
dc.date.issued2016-11-29
dc.identifier.urihttp://hdl.handle.net/1842/25406
dc.description.abstractIn this doctoral thesis we consider various property testing problems for structured distributions. A distribution is said to be structured if it belongs to a certain class which can be simply described in approximation terms. Such distributions often arise in practice, e.g. log-concave distributions, easily approximated by polynomials (see [Bir87a]), often appear in econometric research. For structured distributions, testing a property often requires far less samples than for general unrestricted distributions. In this thesis we prove that this is indeed the case for several distance-related properties. Namely, we give explicit sub-linear time algorithms for L1 and L2 distance testing between two structured distributions for the cases when either one or both of them are available as a “black box”. We also prove that the given algorithms have the best possible asymptotic complexity by proving matching lower bounds in the form of explicit problem instances (albeit constructed using randomized techniques) demanding at least a specified amount of data to be tested successfully. As the main numerical result, we prove that testing that total variation distance to an explicitly given distribution is at least e requires O(√k/e²) samples, where k is an approximation parameter, dependent on the class of distribution being tested and independent of the support size. Testing that the total variation distance between two “black box” distributions is at least e requires O(k⁴/⁵e⁶/⁵). In some cases, when k ~ n, this result may be worse than using an unrestricted testing algorithm (which requires O( n²/3/e² ) samples where n is the domain size). To address this issue, we develop a third algorithm, which requires O(k²/³e⁴/³ log⁴/³(n/k) log log(n/k)) and serves as a bridge between the cases of small and large domain sizes.en
dc.language.isoen
dc.publisherThe University of Edinburghen
dc.relation.hasversionI. Diakonikolas, D. M. Kane, and V. Nikishkin. Optimal algorithms and lower bounds for testing closeness of structured distributions. In 56th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2015, 2015.en
dc.relation.hasversionI. Diakonikolas, D. Kane, and V. Nikishkin. Testing identity of structured distributions. In SODA, pages 1841-1854, 2015.en
dc.subjectinformation theoryen
dc.subjectstatisticsen
dc.subjectproperty testingen
dc.titleAlgorithms and lower bounds for testing properties of structured distributionsen
dc.typeThesis or Dissertationen
dc.type.qualificationlevelDoctoralen
dc.type.qualificationnamePhD Doctor of Philosophyen


Files in this item

This item appears in the following Collection(s)

Show simple item record