ecosmak.ru

Asymptotic notation for program execution time. Estimates from below, from above, asymptotically exact


Kolodzey Alexander Vladimirovich. Asymptotic properties of goodness-of-fit criteria for testing hypotheses in a scheme of sampling without replacement, based on cell occupancy in a generalized allocation scheme: dissertation ... candidate of physical and mathematical sciences: 01.01.05. Moscow, 2006. 110 pp.: ill. RSL OD, 61 07-1/496

Introduction

1 Entropy and information distance

1.1 Basic definitions and notation

1.2 Entropy of discrete distributions with bounded mathematical expectation

1.3 Logarithmic generalized metric on a set of discrete distributions

1.4 Compactness of functions with a countable set of arguments

1.5 Continuity of the Kullback-Leibler-Sanov information distance

1.6 Conclusions

2 Probabilities of large deviations

2.1 Probabilities of large deviations of functions of the numbers of cells with a given occupancy

2.1.1 Local limit theorem

2.1.2 Integral limit theorem

2.1.3 Information distance and probabilities of large deviations of separable statistics

2.2 Probabilities of large deviations of separable statistics that do not satisfy the Cramér condition

2.3 Conclusions

3 Asymptotic properties of goodness-of-fit criteria

3.1 Goodness-of-fit criteria for the scheme of sampling without replacement

3.2 Asymptotic relative efficiency of goodness-of-fit criteria

3.3 Criteria based on the numbers of cells in generalized allocation schemes

3.4 Conclusions

Conclusion

References

Introduction to the work

Object of research and relevance of the topic. In the theory of statistical analysis of discrete sequences, a special place is occupied by goodness-of-fit criteria for testing a possibly composite null hypothesis, namely that for a random sequence $(X_i)_{i=1}^{n}$ such that $X_i \in I_M$, $i = 1, \dots, n$, $I_M = \{0, 1, \dots, M\}$, for any $i = 1, \dots, n$ and any $k \in I_M$ the probability of the event $\{X_i = k\}$ does not depend on $i$. This means that the sequence $(X_i)_{i=1}^{n}$ is in some sense stationary.

In a number of applied problems, the sequence $(X_i)$ is taken to be the sequence of colors of balls drawn without replacement until exhaustion from an urn containing $n_k - 1 > 0$ balls of color $k$, $k \in I_M$. We denote the set of such selections by $\mathfrak{M}(n_0 - 1, \dots, n_M - 1)$. Let the urn contain $n - 1$ balls in total, $n - 1 = \sum_{k=0}^{M} (n_k - 1)$.

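Let us denote by $r^{(k)} = (r_1^{(k)}, \dots, r_{n_k-1}^{(k)})$ the sequence of positions of the balls of color $k$ in the sample. Consider the sequence $h^{(k)} = (h_1^{(k)}, \dots, h_{n_k}^{(k)})$, where $h_1^{(k)} = r_1^{(k)}$, $h_i^{(k)} = r_i^{(k)} - r_{i-1}^{(k)}$, $i = 2, \dots, n_k - 1$, $h_{n_k}^{(k)} = n - r_{n_k-1}^{(k)}$.

The sequence $h^{(k)}$ is determined by the distances between the positions of neighboring balls of color $k$ in such a way that $\sum_{i=1}^{n_k} h_i^{(k)} = n$.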

The set of sequences $h^{(k)}$ for all $k \in I_M$ uniquely determines the sequence $(X_i)$. The sequences $h^{(k)}$ for different $k$ depend on each other; in particular, any one of them is uniquely determined by all the others. If the cardinality of the set $I_M$ is 2, then the sequence of colors of balls is uniquely determined by the sequence $h$ of distances between the positions of neighboring balls of one fixed color. Let there be $N - 1$ balls of color 0 in an urn containing $n - 1$ balls of two different colors. We can establish a one-to-one correspondence between the set $\mathfrak{M}(N - 1, n - N)$ and the set $\mathfrak{R}_{n,N}$ of vectors $h(n, N) = (h_1, \dots, h_N)$ with positive integer components such that

$h_1 + \dots + h_N = n$. (0.1)

The set $\mathfrak{R}_{n,N}$ corresponds to the set of all distinct ordered partitions of the positive integer $n$ into $N$ terms.

By specifying a probability distribution on the set of vectors $\mathfrak{R}_{n,N}$, we obtain the corresponding probability distribution on the set $\mathfrak{M}(N - 1, n - N)$. The set $\mathfrak{R}_{n,N}$ is a subset of the set $\mathfrak{Z}_{n,N}$ of vectors with non-negative integer components satisfying (0.1). In the dissertation, the probability distributions considered on the set of vectors $\mathfrak{Z}_{n,N}$ are distributions of the form

$P(h(n, N) = (r_1, \dots, r_N)) = P\left(\xi_\nu = r_\nu, \nu = 1, \dots, N \mid \sum_{\nu=1}^{N} \xi_\nu = n\right)$, (0.2)

where $\xi_1, \dots, \xi_N$ are independent non-negative integer random variables.

Distributions of the form (0.2) are called in /24/ generalized schemes of allocating $n$ particles to $N$ cells. In particular, if the random variables $\xi_1, \dots, \xi_N$ in (0.2) have Poisson distributions with parameters $\lambda_1, \dots, \lambda_N$, respectively, then the vector $h(n, N)$ has a multinomial distribution with outcome probabilities

$p_\nu = \frac{\lambda_\nu}{\lambda_1 + \dots + \lambda_N}$, $\nu = 1, \dots, N$.
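The Poisson special case can be checked numerically. The sketch below is an illustration only: the function names and the sample parameters are assumptions, not taken from the dissertation. It compares the conditional probability of independent Poisson variables given their sum with the multinomial probability of the same outcome.

```python
import math

def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def conditional_poisson_pmf(r, lams):
    """P(xi_1 = r_1, ..., xi_N = r_N | xi_1 + ... + xi_N = n) for independent Poisson xi_v."""
    n = sum(r)
    joint = math.prod(poisson_pmf(rv, lv) for rv, lv in zip(r, lams))
    # The sum of independent Poisson variables is Poisson with the summed parameter.
    return joint / poisson_pmf(n, sum(lams))

def multinomial_pmf(r, probs):
    """Multinomial probability of the count vector r with outcome probabilities probs."""
    coef = math.factorial(sum(r))
    for rv in r:
        coef //= math.factorial(rv)
    return coef * math.prod(p**rv for p, rv in zip(probs, r))

# Hypothetical parameters chosen only for the check.
lams = [0.5, 1.5, 2.0]
probs = [lam / sum(lams) for lam in lams]
r = (1, 2, 3)
```

For any count vector `r`, the two functions should agree up to floating-point error, which is the statement that the conditioned Poisson scheme is multinomial.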

If the random variables $\xi_1, \dots, \xi_N$ in (0.2) are identically distributed according to the geometric law $P(\xi_1 = k) = p^{k-1}(1 - p)$, $k = 1, 2, \dots$, where $p$ is any number in the interval $0 < p < 1$, …

As noted in /14/, /38/, a special place in testing hypotheses about the distribution of the frequency vectors $h(n, N) = (h_1, \dots, h_N)$ in generalized schemes of allocating $n$ particles to $N$ cells is occupied by criteria based on statistics of the form

$L_N(h(n, N)) = \sum_{\nu=1}^{N} f_\nu(h_\nu)$, (0.3)

$\Phi_N(h(n, N)) = \varphi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right)$, (0.4)

where $f_\nu$, $\nu = 1, 2, \dots$, and $\varphi$ are some real-valued functions, and

$\mu_r = \sum_{\nu=1}^{N} I(h_\nu = r)$, $r = 0, 1, \dots$.

The quantities $\mu_r$ were called in /27/ the number of cells containing exactly $r$ particles.

Statistics of the form (0.3) are called in /30/ separable (additively separable) statistics. If the functions $f_\nu$ in (0.3) do not depend on $\nu$, such statistics were called in /31/ symmetric separable statistics.

For any $r$, the statistic $\mu_r$ is a symmetric separable statistic. From the equality

$\sum_{\nu=1}^{N} f(h_\nu) = \sum_{r=0}^{\infty} f(r)\mu_r$ (0.5)

it follows that the class of symmetric separable statistics of $h_\nu$ coincides with the class of linear functions of $\mu_r$. Moreover, the class of functions of the form (0.4) is wider than the class of symmetric separable statistics.
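The relation between a symmetric separable statistic and the occupancy counts can be sketched as follows. The helper names and the sample vector are hypothetical; the code only shows that summing $f$ over the cells gives the same value as summing $f(r)$ weighted by the number of cells holding exactly $r$ particles.

```python
from collections import Counter

def occupancy_counts(h):
    """Return a mapping r -> mu_r, the number of cells that hold exactly r particles."""
    return Counter(h)

def separable_statistic(h, f):
    """Symmetric separable statistic: the sum of f(h_v) over all cells v."""
    return sum(f(hv) for hv in h)

def linear_in_mu(h, f):
    """The same statistic written as a linear function of the occupancy counts mu_r."""
    mu = occupancy_counts(h)
    return sum(f(r) * count for r, count in mu.items())

# Hypothetical frequency vector: 6 cells holding 10 particles in total.
h = [0, 2, 1, 2, 0, 5]
f = lambda r: r * r
```

With this `h`, two cells are empty, one holds one particle, two hold two and one holds five, so both forms evaluate the same sum of squares.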

Let $H_0 = (H_0(n, N))$ be a sequence of simple null hypotheses that the distribution of the vector $h(n, N)$ is (0.2), where the random variables $\xi_1, \dots, \xi_N$ in (0.2) are identically distributed with $P(\xi_1 = k) = p_k$, $k = 0, 1, 2, \dots$, and the parameters $n$, $N$ vary in the central region.

Consider some $\beta \in (0, 1)$ and a sequence of, generally speaking, composite alternatives $H = (H(n, N))$ such that there exists a maximal number $a_{n,N}(\beta)$ for which, under the hypothesis $H(n, N)$, the inequality $P(\varphi_N > a_{n,N}(\beta)) \geq \beta$ holds.

We reject the hypothesis $H_0(n, N)$ if $\varphi_N > a_{n,N}(\beta)$. If the limit $\lim_{N \to \infty} \left(-\frac{1}{N} \ln P(\varphi_N > a_{n,N}(\beta))\right) = j_\varphi(\beta, H)$ exists, where the probability for each $N$ is calculated under the hypothesis $H_0(n, N)$, then the value $j_\varphi(\beta, H)$ is called in /38/ the index of the criterion $\varphi$ at the point $(\beta, H)$. This limit may, generally speaking, not exist. Therefore, in addition to the criterion index, the dissertation considers the value $\varliminf_{N \to \infty} \left(-\frac{1}{N} \ln P(\varphi_N > a_{n,N}(\beta))\right) = \underline{j}_\varphi(\beta, H)$, which the author, by analogy, calls the lower index of the criterion $\varphi$ at the point $(\beta, H)$. Here and below, $\varliminf_{N \to \infty} a_N$ and $\varlimsup_{N \to \infty} a_N$ denote, respectively, the lower and upper limits of the sequence $(a_N)$ as $N \to \infty$.

If the criterion index exists, the lower index of the criterion coincides with it. The lower index of the criterion always exists. The greater the value of the criterion index (lower index of the criterion), the better the statistical criterion in the sense under consideration. In /38/ the problem was solved of constructing goodness-of-fit criteria for generalized allocation schemes with the highest value of the criterion index in the class of criteria that reject the hypothesis $H_0(n, N)$ when

$\varphi\left(\frac{\mu_0}{N}, \dots, \frac{\mu_m}{N}\right) > c_N$, (0.6)

where $m > 0$ is some fixed number, the sequence of constants $c_N$ is chosen from the given value of the power of the criterion for the sequence of alternatives, and $\varphi$ is a real function of $m + 1$ arguments.

The criterion indices are determined by the probabilities of large deviations. As was shown in /38/, the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of separable statistics, when the Cramér condition for the random variable $f(\xi)$ is satisfied, is determined by the corresponding Kullback-Leibler-Sanov information distance (a random variable $\eta$ satisfies the Cramér condition if for some $H > 0$ the moment generating function $M e^{t\eta}$ is finite in the interval $|t| < H$).

The question of the probabilities of large deviations of statistics depending on an unbounded number of the $\mu_r$, as well as of arbitrary separable statistics that do not satisfy the Cramér condition, remained open. This made it impossible to give a final solution to the problem of constructing criteria for testing hypotheses in generalized allocation schemes with the highest rate of convergence to zero of the probability of a type I error under non-converging alternatives in the class of criteria based on statistics of the form (0.4). The relevance of the dissertation research is determined by the need to complete the solution of this problem.

The purpose of the dissertation is to construct goodness-of-fit criteria with the highest value of the criterion index (lower index of the criterion) for testing hypotheses in the scheme of sampling without replacement, in the class of criteria that reject the hypothesis $H_0(n, N)$ when

$\varphi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots, \frac{\mu_n}{N}, 0, 0, \dots\right) > c_N$, (0.7)

where $\varphi$ is a function of a countable number of arguments and the parameters $n$, $N$ vary in the central region.

In accordance with the purpose of the study, the following tasks were set: to investigate the properties of the entropy and of the Kullback-Leibler-Sanov information distance for discrete distributions with a countable number of outcomes; to study the probabilities of large deviations of statistics of the form (0.4); to study the probabilities of large deviations of symmetric separable statistics (0.3) that do not satisfy the Cramér condition; to find a statistic such that the goodness-of-fit criterion based on it for testing hypotheses in generalized allocation schemes has the highest index value in the class of criteria of the form (0.7).

Scientific novelty: the concept of a generalized metric is introduced, that is, a function that admits infinite values and satisfies the axioms of identity, symmetry and the triangle inequality. A generalized metric is found, and sets are indicated on which the entropy and information distance functions, defined on a family of discrete distributions with a countable number of outcomes, are continuous in this metric; in the generalized allocation scheme, a rough (up to logarithmic equivalence) asymptotics is found for the probabilities of large deviations of statistics of the form (0.4) satisfying the corresponding form of the Cramér condition; in the generalized allocation scheme, a rough (up to logarithmic equivalence) asymptotics is found for the probabilities of large deviations of symmetric separable statistics that do not satisfy the Cramér condition; in the class of criteria of the form (0.7), a criterion with the highest value of the criterion index is constructed.

Scientific and practical value. The work resolves a number of questions about the behavior of the probabilities of large deviations in generalized allocation schemes. The results can be used in the educational process in the specialties of mathematical statistics and information theory and in the study of statistical procedures for the analysis of discrete sequences; they were used in /3/, /21/ to justify the security of one class of information systems.

Provisions put forward for defense: reduction of the problem of testing, from a single sequence of colors of balls, the hypothesis that this sequence was obtained by sampling without replacement until exhaustion from an urn containing balls of two colors, with every such selection having the same probability, to the construction of goodness-of-fit criteria for testing hypotheses in the corresponding generalized allocation scheme; continuity of the entropy and Kullback-Leibler-Sanov information distance functions on an infinite-dimensional simplex with the introduced logarithmic generalized metric; a theorem on the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of symmetric separable statistics that do not satisfy the Cramér condition in the generalized allocation scheme in the semi-exponential case; a theorem on the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations for statistics of the form (0.4); construction of a goodness-of-fit criterion for testing hypotheses in generalized allocation schemes with the highest index value in the class of criteria of the form (0.7).

Approbation of the work. The results were presented at seminars of the Department of Discrete Mathematics of the V. A. Steklov Mathematical Institute of the RAS and of the information security department of the S. A. Lebedev ITM&VT RAS, and at: the Fifth All-Russian Symposium on Applied and Industrial Mathematics, spring session, Kislovodsk, May 2-8, 2004; the Sixth International Petrozavodsk Conference "Probabilistic Methods in Discrete Mathematics", June 10-16, 2004; the Second International Conference "Information Systems and Technologies (IST'2004)", Minsk, November 8-10, 2004; the International Conference "Modern Problems and New Trends in Probability Theory", Chernivtsi, Ukraine, June 19-26, 2005.

The main results of the work were used in the research project "Apology", carried out by the S. A. Lebedev ITM&VT RAS in the interests of the Federal Service for Technical and Export Control of the Russian Federation, and were included in the report on the completion of the research stage /21/. Some results of the dissertation were included in the research report "Development of mathematical problems of cryptography" of the Academy of Cryptography of the Russian Federation for 2004 /22/.

The author expresses deep gratitude to the scientific supervisor, Doctor of Physical and Mathematical Sciences A. F. Ronzhin, and to the scientific consultant, Doctor of Physical and Mathematical Sciences, Senior Researcher A. V. Knyazev. The author also thanks Doctor of Physical and Mathematical Sciences, Professor A. M. Zubkov and Candidate of Physical and Mathematical Sciences I. A. Kruglov for their attention to the work and a number of valuable comments.

Structure and content of the work.

The first chapter examines the properties of entropy and information distance for distributions on the set of non-negative integers.

In the first section of the first chapter, notation is introduced and the necessary definitions are given. In particular, the following notation is used: $x = (x_0, x_1, \dots)$ is an infinite-dimensional vector with a countable number of components;

$H(x) = -\sum_{\nu=0}^{\infty} x_\nu \ln x_\nu$; $\operatorname{trunc}_m(x) = (x_0, x_1, \dots, x_m, 0, 0, \dots)$; $\Omega^* = \{x : x_\nu \geq 0, \nu = 0, 1, \dots, \sum_{\nu=0}^{\infty} x_\nu$ … $\}$; $\Omega = \{x : x_\nu \geq 0, \nu = 0, 1, \dots, \sum_{\nu=0}^{\infty} x_\nu = 1\}$; $\Omega_\gamma = \{x \in \Omega, \sum_{\nu=0}^{\infty} \nu x_\nu = \gamma\}$; $\Omega_{[0,\gamma]} = \{x \in \Omega, \sum_{\nu=0}^{\infty} \nu x_\nu$ … $\}$; …

It is clear that the set $\Omega$ corresponds to the family of probability distributions on the set of non-negative integers, and $\Omega_\gamma$ to the family of probability distributions on the set of non-negative integers with mathematical expectation $\gamma$. If $y \in \Omega$, then for $\varepsilon > 0$ the set $O_\varepsilon(y)$ is defined as

$O_\varepsilon(y) = \{x \in \Omega, x_\nu$ … $\}$.

In the second section of the first chapter, a theorem on the boundedness of the entropy of discrete distributions with bounded mathematical expectation is proved.

Theorem 1. On the boundedness of the entropy of discrete distributions with bounded mathematical expectation. For any $x \in \Omega_{[0,\gamma]}$ …

If $x \in \Omega_\gamma$ corresponds to the geometric distribution with mathematical expectation $\gamma$, that is,

$x_\nu = (1 - p)p^\nu$, $\nu = 0, 1, \dots$, where $p = \frac{\gamma}{1 + \gamma}$,

then the equality $H(x) = F(\gamma)$ holds.
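The geometric case can be explored numerically. The sketch below is a hedged illustration: the explicit form of the bound in Theorem 1 did not survive in this text, so the closed form used here, $\ln(1 + \gamma) + \gamma \ln(1 + 1/\gamma)$, is simply the entropy of the geometric distribution with mean $\gamma$ computed directly from its definition, and all function names are hypothetical.

```python
import math

def geometric_distribution(gamma, terms=2000):
    """Truncated geometric distribution x_v = (1 - p) * p**v with p = gamma / (1 + gamma)."""
    p = gamma / (1.0 + gamma)
    return [(1.0 - p) * p**v for v in range(terms)]

def entropy(x):
    """H(x) = -sum x_v ln x_v, with the convention 0 ln 0 = 0."""
    return -sum(xv * math.log(xv) for xv in x if xv > 0.0)

def mean(x):
    """Mathematical expectation of the distribution x on 0, 1, 2, ..."""
    return sum(v * xv for v, xv in enumerate(x))

def geometric_entropy_closed_form(gamma):
    """Entropy of the geometric distribution with mean gamma, derived from the definition."""
    return math.log(1.0 + gamma) + gamma * math.log(1.0 + 1.0 / gamma)

gamma = 2.5  # hypothetical value of the mathematical expectation
x = geometric_distribution(gamma)
```

Comparing `entropy(x)` with the entropy of another distribution of the same mean, for example a Poisson distribution, gives a quick sanity check of the maximum-entropy property mentioned in the text.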

The statement of the theorem can be viewed as the result of a formal application of the method of Lagrange multipliers in the case of an infinite number of variables. The theorem that the only distribution on the set $\{k, k + 1, k + 2, \dots\}$ with a given mathematical expectation and maximum entropy is the geometric distribution with that expectation is given (without proof) in /47/. The author, however, gives a rigorous proof.

The third section of the first chapter defines a generalized metric, that is, a metric that admits infinite values.

For $x, y \in \Omega$ the function $\rho(x, y)$ is defined as the minimum $\varepsilon > 0$ with the property $y_\nu e^{-\varepsilon} \leq x_\nu \leq y_\nu e^{\varepsilon}$ for all $\nu = 0, 1, \dots$.

If no such $\varepsilon$ exists, then $\rho(x, y) = \infty$.
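A minimal sketch of such a logarithmic generalized metric for finitely supported vectors is given below. It rests on an assumption: the defining property is cut off in this text, and the code takes it to be the two-sided bound $y_\nu e^{-\varepsilon} \leq x_\nu \leq y_\nu e^{\varepsilon}$ for all $\nu$, which gives $\rho(x, y) = \sup_\nu |\ln x_\nu - \ln y_\nu|$. The function name is hypothetical.

```python
import math

def log_metric(x, y):
    """Assumed form of rho(x, y): sup over v of |ln x_v - ln y_v|.

    Returns math.inf when one vector vanishes at a coordinate where the other
    does not, since then no finite epsilon satisfies the two-sided bound.
    """
    length = max(len(x), len(y))
    x = list(x) + [0.0] * (length - len(x))  # pad the shorter vector with zeros
    y = list(y) + [0.0] * (length - len(y))
    rho = 0.0
    for xv, yv in zip(x, y):
        if xv == 0.0 and yv == 0.0:
            continue  # both coordinates vanish, the bound holds for any epsilon
        if xv == 0.0 or yv == 0.0:
            return math.inf
        rho = max(rho, abs(math.log(xv) - math.log(yv)))
    return rho
```

Under this reading, two distributions are at finite distance only if they have the same support, which is why the metric has to admit infinite values.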

It is proved that the function $\rho(x, y)$ is a generalized metric on the family of distributions on the set of non-negative integers, as well as on the entire set $\Omega^*$. Instead of $e$ in the definition of the metric $\rho(x, y)$, any other positive number other than 1 can be used; the resulting metrics differ by a multiplicative constant. Let us denote by $J(x, y)$ the information distance

$J(x, y) = \sum_{\nu=0}^{\infty} x_\nu \ln \frac{x_\nu}{y_\nu}$.

Here and below it is assumed that $0 \ln 0 = 0$ and $0 \ln \frac{0}{0} = 0$. The information distance is defined for those $x$, $y$ such that $x_\nu = 0$ for all $\nu$ for which $y_\nu = 0$. If this condition is not met, we set $J(x, y) = \infty$. Let $A \subseteq \Omega$. Then we denote $J(A, y) = \inf_{x \in A} J(x, y)$.

We set $J(\varnothing, y) = \infty$.
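The conventions for the information distance can be made concrete with a short sketch. It assumes the standard Kullback-Leibler form $J(x, y) = \sum_\nu x_\nu \ln(x_\nu / y_\nu)$, vectors of equal finite length, and a finite list of candidates standing in for the set $A$; the function names are hypothetical.

```python
import math

def information_distance(x, y):
    """J(x, y) = sum of x_v * ln(x_v / y_v) for two vectors of equal length."""
    total = 0.0
    for xv, yv in zip(x, y):
        if xv == 0.0:
            continue  # conventions 0 ln 0 = 0 and 0 ln(0/0) = 0
        if yv == 0.0:
            return math.inf  # x puts mass where y has none
        total += xv * math.log(xv / yv)
    return total

def distance_from_set(candidates, y):
    """J(A, y) as the smallest J(x, y) over a finite list A; infinite for the empty set."""
    return min((information_distance(x, y) for x in candidates), default=math.inf)
```

For a finite list the infimum is attained, so `min` is enough here; for the infinite sets considered in the text this is only an approximation device.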

In the fourth section of the first chapter, the definition of compactness of functions defined on the set $\Omega^*$ is given. Compactness of a function with a countable number of arguments means that the value of the function can be approximated to any degree of accuracy by the values of this function at points where only a finite number of arguments are non-zero. The compactness of the entropy and information distance functions is proved.

For any $0$ …

If for some $0$ … the function $\lambda_p(x) = J(x, p)$ is compact on the set $\Omega_{[0,\gamma]} \cap O_\varepsilon(p)$.

The fifth section of the first chapter discusses the properties of the information distance defined on an infinite-dimensional space. Compared to the finite-dimensional case, the situation with the continuity of the information distance function changes qualitatively. It is shown that the information distance function is not continuous on the set $\Omega$ in any of the metrics

$\rho_1(x, y) = \sum_{\nu=0}^{\infty} |x_\nu - y_\nu|$, $\rho_2(x, y) = \left(\sum_{\nu=0}^{\infty} (x_\nu - y_\nu)^2\right)^{1/2}$, $\rho_\infty(x, y) = \sup_{\nu} |x_\nu - y_\nu|$.

The following inequalities are proved for the entropy function $H(x)$ and the information distance $J(x, p)$:

1. For any $x, x' \in \Omega$, $|H(x) - H(x')|$ …

2. If for some $x, p \in \Omega$ there is $\varepsilon > 0$ such that $x \in O_\varepsilon(p)$, then for any $x' \in \Omega$, $|J(x, p) - J(x', p)|$ …

From these inequalities, taking into account Theorem 1, it follows that the entropy and information distance functions are uniformly continuous on the corresponding subsets of $\Omega$ in the metric $\rho(x, y)$, namely:

For any $\gamma$ such that $0$ …

If for some $\gamma_0$, $0$ … then for any $0$ … the function $\lambda_p(x) = J(x, p)$ is uniformly continuous on the set $\Omega_{[0,\gamma]} \cap O_\varepsilon(p)$ in the metric $\rho(x, y)$.

A definition of a non-extremal function is given. The non-extremality condition means that the function either has no local extrema or takes the same value at all its local minima (and the same value at all its local maxima). The non-extremality condition thus weakens the requirement that local extrema be absent. For example, the function $\sin x$ on the set of real numbers has local extrema but satisfies the non-extremality condition.

Let for some $\gamma > 0$ the region $A$ be given by the condition

$A = \{x \in \Omega_\gamma, \varphi(x) > a\}$, (0.9)

where $\varphi(x)$ is a real-valued function, $a$ is some real constant, $\inf \varphi(x)$ …

The following question is studied: under what conditions on the function $\varphi$, for parameters $n$, $N$ varying in the central region, $\frac{n}{N} \to \gamma$, do there exist, for all sufficiently large $n$, $N$, non-negative integers $k_0, k_1, \dots, k_n$ such that

$k_0 + k_1 + \dots + k_n = N$, $k_1 + 2k_2 + \dots + nk_n = n$, $\varphi\left(\frac{k_0}{N}, \frac{k_1}{N}, \dots, \frac{k_n}{N}, 0, 0, \dots\right) > a$.

It is proved that for this it suffices to require that the function $\varphi$ be non-extremal, compact and continuous in the metric $\rho(x, y)$, and also that for at least one point $x$ satisfying (0.9) there exist, for some $\varepsilon > 0$, a finite moment of order $1 + \varepsilon$, $M_{1+\varepsilon} = \sum_{\nu=0}^{\infty} \nu^{1+\varepsilon} x_\nu$, with $x_\nu > 0$ for any $\nu = 0, 1, \dots$.
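For small $n$ and $N$ the existence question can be explored by brute force. The sketch below is an illustration under stated assumptions: it reads the two constraints as $k_0 + k_1 + \dots + k_n = N$ (cells) and $k_1 + 2k_2 + \dots + nk_n = n$ (particles), enumerates every admissible occupancy vector through the partitions of $n$ into at most $N$ parts, and returns the first one at which a user-supplied function exceeds $a$. All names are hypothetical.

```python
def partitions(n, max_parts, max_part=None):
    """Yield the partitions of n into at most max_parts positive parts, in non-increasing order."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    if max_parts == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, max_parts - 1, first):
            yield [first] + rest

def find_occupancy_vector(n, N, phi, a):
    """Return k = [k_0, ..., k_n] with sum k_r = N, sum r * k_r = n and phi(k / N) > a, or None."""
    for parts in partitions(n, N):
        k = [0] * (n + 1)
        for part in parts:
            k[part] += 1          # one more cell holding `part` particles
        k[0] = N - len(parts)     # the remaining cells are empty
        if phi([kr / N for kr in k]) > a:
            return k
    return None
```

As a usage example, with `phi = lambda x: x[0]` (the fraction of empty cells), `n = 4` and `N = 3`, a threshold of 0.5 can only be met by putting all four particles into one cell. The search is exponential in $n$ and is meant only to make the constraints tangible, not to reflect the asymptotic argument of the dissertation.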

In the second chapter, the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of functions of $\mu = (\mu_0, \dots, \mu_n, 0, \dots)$, the numbers of cells with a given occupancy, is studied in the central region of variation of the parameters $N$, $n$. Rough asymptotics of the probabilities of large deviations are sufficient for studying the indices of goodness-of-fit criteria.

Let the random variables $\xi_\nu$ in (0.2) be identically distributed with

$P(\xi_1 = k) = p_k$, $k = 0, 1, \dots$, and let $P(z)$, the generating function of the random variable $\xi_1$, converge in a circle of radius $1$ …

Let us denote $p(z) = (P(\xi(z) = 0), P(\xi(z) = 1), \dots)$.

If a solution $z_\gamma$ of the equation

$M\xi(z) = \gamma$

exists, then it is unique /38/. Throughout what follows we assume that $p_k > 0$, $k = 0, 1, \dots$.
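The equation for $z_\gamma$ can be solved numerically. The sketch below relies on an assumption that is not recoverable from this text: the definition of $\xi(z)$ is cut off here, and the code takes it to be the usual tilted distribution $P(\xi(z) = k) = p_k z^k / P(z)$ of generalized allocation schemes. The function names, the truncation at 400 terms and the Poisson example are all hypothetical.

```python
import math

def tilted_mean(log_p, z):
    """Mean of xi(z), assuming P(xi(z) = k) is proportional to p_k * z**k."""
    log_w = [lp + k * math.log(z) for k, lp in enumerate(log_p)]
    shift = max(log_w)  # subtract the maximum to avoid overflow in exp
    w = [math.exp(lw - shift) for lw in log_w]
    return sum(k * wk for k, wk in enumerate(w)) / sum(w)

def solve_z(log_p, gamma, lo=1e-9, hi=1e3, iterations=200):
    """Bisection for tilted_mean(z) = gamma; the tilted mean is increasing in z."""
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)  # geometric midpoint, since z spans orders of magnitude
        if tilted_mean(log_p, mid) < gamma:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical example: p_k is Poisson(2), truncated to the first 400 terms.
lam = 2.0
log_p = [-lam + k * math.log(lam) - math.lgamma(k + 1) for k in range(400)]
z_gamma = solve_z(log_p, gamma=3.0)
```

For Poisson $p_k$ with parameter $\lambda$ the tilted variable is again Poisson with parameter $\lambda z$, so the root should be close to $\gamma / \lambda$. Monotonicity of the tilted mean in $z$ is also what makes the solution unique when it exists.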

In the first subsection of the first section of the second chapter, the asymptotics of the logarithms of probabilities of the form $\frac{1}{N} \ln P(\mu_0 = k_0, \dots, \mu_n = k_n)$ is found.

The following theorem is proved.

Theorem 2. Rough local theorem on the probabilities of large deviations. Let $n, N \to \infty$ such that $\frac{n}{N} \to \gamma > 0$ …

The statement of the theorem follows directly from the formula for the joint distribution of $\mu_0, \mu_1, \dots, \mu_n$ in /26/ and the following estimate: if non-negative integers $\mu_1, \mu_2, \dots, \mu_n$ satisfy the condition $\mu_1 + 2\mu_2 + \dots + n\mu_n = n$, then the number of non-zero values among them is $O(\sqrt{n})$. This is a rough estimate and does not claim to be new. The number of non-zero $\mu_r$ in generalized allocation schemes does not exceed the maximum cell occupancy, which in the central region, with probability tending to 1, does not exceed a value of order $O(\ln n)$ /25/, /27/. Nevertheless, the estimate $O(\sqrt{n})$ holds with probability 1 and is sufficient to obtain the rough asymptotics.
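The square-root estimate has a short justification: if $m$ different values of $r \geq 1$ have $\mu_r \geq 1$, then $n \geq 1 + 2 + \dots + m = m(m + 1)/2$, so $m < \sqrt{2n}$. The sketch below, with hypothetical function names, computes the exact maximum from this inequality and checks it against a brute-force enumeration for small $n$.

```python
import math

def max_nonzero_occupancies(n):
    """Largest m with 1 + 2 + ... + m <= n, i.e. the most non-zero mu_r (r >= 1) possible."""
    return (math.isqrt(8 * n + 1) - 1) // 2

def brute_force_max(n):
    """Maximize the number of non-zero mu_r over all solutions of mu_1 + 2*mu_2 + ... = n."""
    def search(r, remaining):
        if remaining == 0:
            return 0
        if r > remaining:
            return -math.inf  # no part of size >= r fits, so this branch is infeasible
        best = search(r + 1, remaining)  # mu_r = 0
        used = r
        while used <= remaining:  # mu_r = 1, 2, ...
            best = max(best, 1 + search(r + 1, remaining - used))
            used += r
        return best
    return search(1, n)
```

The brute-force search is exponential and is intended only as a check for small $n$.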

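In the second subsection of the first section of the second chapter, the value of the limit

$\lim_{N \to \infty} \left(-\frac{1}{N} \ln P\left(\varphi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right) > a_N\right)\right)$

is found, where $a_N$ is a sequence of real numbers converging to some $a \in \mathbb{R}$ and $\varphi(x)$ is a real-valued function. The following theorem is proved.

Theorem 3. Rough integral theorem on the probabilities of large deviations. Let the conditions of Theorem 2 be satisfied, and for some $r > 0$, $\zeta > 0$ let the real function $\varphi(x)$ be compact and uniformly continuous in the metric $\rho$ on the set

$A = O_{r+\zeta}(p(z_\gamma)) \cap \Omega_{[0,\gamma+\zeta]}$

and satisfy the non-extremality condition on the set $\Omega_\gamma$. If for some constant $a$ such that $\inf \varphi(x)$ … there is a vector $p_a \in \Omega_\gamma \cap O_r(p(z_\gamma))$ such that

$\varphi(p_a) > a$, $J(\{x : \varphi(x) > a, x \in \Omega_\gamma\}, p(z_\gamma)) = J(p_a, p(z_\gamma))$,

then for any sequence $a_N$ converging to $a$,

$\lim_{N \to \infty} \left(-\frac{1}{N} \ln P\left(\varphi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right) > a_N\right)\right) = J(p_a, p(z_\gamma))$. (0.11)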

With additional restrictions on the function $\varphi(x)$, the information distance $J(p_a, p(z_\gamma))$ in (2.3) can be calculated more explicitly. Namely, the following theorem is true.

Theorem 4. On information distance. Let for some $0$ …, for some $r > 0$, $\zeta > 0$ the real function $\varphi(x)$ and its first-order partial derivatives be compact and uniformly continuous in the generalized metric $\rho(x, y)$ on the set

$A = O_r(p) \cap \Omega_{[0,\gamma+\zeta]}$,

and let there exist $T > 0$, $R > 0$ such that for all $|t|$ … $\varphi(p(z_a, t_a)) = a$, …

Then $p(z_a, t_a) \in \Omega_\gamma$ and $J(\{x \in A, \varphi(x) = a\}, p) = J(p(z_a, t_a), p) =$ …

If the function $\varphi(x)$ is a linear function and the function $f(x)$ is defined by equality (0.5), then condition (0.12) turns into the Cramér condition for the random variable $f(\xi(z))$. Condition (0.13) is a form of condition (0.10) and is used to prove that domains of the form $\{x \in \Omega, \varphi(x) > a\}$ contain at least one point from $\Omega(n, N)$ for all sufficiently large $n$, $N$.

Let $h(n, N) = (h_1, \dots, h_N)$ be the frequency vector in the generalized allocation scheme (0.2). As a corollary of Theorems 3 and 4, the following theorem is formulated.

Theorem 5. Rough integral theorem on the probabilities of large deviations of symmetric separable statistics in a generalized allocation scheme.

Let $n, N \to \infty$ such that $\frac{n}{N} \to \gamma$, … there exist $T > 0$, $R > 0$ such that for all $|t|$ … Then for any sequence $a_N$ converging to $a$, …

This theorem was first proven by A.F. Ronzhin in /38/ using the saddle point method.

In the second section of the second chapter, the probabilities of large deviations of separable statistics in the generalized allocation scheme are studied in the case when the Cramér condition for the random variable $f(\xi(z))$ is not satisfied. The Cramér condition for $f(\xi(z))$ fails, in particular, if $\xi(z)$ is a Poisson random variable and $f(x) = x^2$. Note that the Cramér condition for the separable statistics themselves in generalized allocation schemes is always satisfied, since for any fixed $n$, $N$ the number of possible outcomes in these schemes is finite.
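The Poisson example can be made tangible by looking at the individual terms of the series $M e^{t\xi^2} = \sum_k e^{tk^2} e^{-\lambda} \lambda^k / k!$. The sketch below, with a hypothetical function name and parameters, returns the logarithm of the $k$-th term. Since $tk^2$ grows faster than $\ln k!$ (which is of order $k \ln k$), the terms grow without bound for every $t > 0$, so the series diverges and the moment generating function is infinite to the right of zero.

```python
import math

def log_mgf_term(k, t, lam):
    """Logarithm of the k-th term of the series for M exp(t * xi**2), xi Poisson(lam)."""
    return t * k * k - lam + k * math.log(lam) - math.lgamma(k + 1)
```

For comparison, replacing `k * k` by `k` (the case $f(x) = x$) gives terms whose logarithm tends to minus infinity, in line with the Poisson variable itself satisfying the Cramér condition.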

As noted in /2/, if the Cramér condition is not satisfied, then finding the asymptotics of the probabilities of large deviations of sums of identically distributed random variables requires additional conditions of regular variation on the distribution of the summand. The present work considers the case corresponding to condition (3) in /2/, that is, the semi-exponential case. Let $P(\xi_1 = k) > 0$ for all $k = 0, 1, \dots$, and let the function $p(k) = -\ln P(\xi_1 = k)$ be extendable to a function of a continuous argument that is regularly varying of order $p$, $0$ …, that is, $\lim_{t \to \infty} \frac{p(tx)}{p(t)} = x^p$.

Let the function $f(x)$, for sufficiently large values of the argument, be a positive, strictly increasing, regularly varying function of order $d > 1$, … On the rest of the number axis …

Then the random variable $f(\xi_1)$ has moments of any order and does not satisfy the Cramér condition, $\psi(x) = o(x)$ as $x \to \infty$, and the following theorem is valid.

Theorem 6. Let the function $\psi(x)$ be monotonically non-decreasing for sufficiently large $x$, the function … be monotonically non-increasing, $n, N \to \infty$ so that $\frac{n}{N} \to \lambda$, $0$ … $c > b(z_\lambda)$, where $b(z) = M f(\xi(z))$, there is a limit …

It follows from Theorem 6 that if the Cramér condition is not satisfied, then

$\lim_{N \to \infty} \frac{1}{N} \ln P(L_N(h(n, N)) > cN) = 0$,

which proves the validity of the hypothesis expressed in /39/. Thus, the value of the index of a goodness-of-fit criterion in generalized allocation schemes is always equal to zero when the Cramér condition is not met. At the same time, in the class of criteria that satisfy the Cramér condition, criteria with a non-zero index value are constructed. From this we can conclude that using criteria whose statistics do not satisfy the Cramér condition, for example the chi-square test in a multinomial scheme, to construct goodness-of-fit tests for testing hypotheses against non-converging alternatives is asymptotically inefficient in the indicated sense. A similar conclusion was reached in /54/ based on a comparison of the chi-square and maximum likelihood ratio statistics in a multinomial scheme.

The third chapter solves the problem of constructing goodness-of-fit criteria with the largest value of the criterion index (the largest value of the lower index of the criterion) for testing hypotheses in generalized allocation schemes. Based on the results of the first and second chapters on the properties of the entropy function, the information distance and the probabilities of large deviations, a function of the form (0.4) is found in the third chapter such that the goodness-of-fit criterion based on it has the largest value of the lower index in the class of criteria under consideration. The following theorem is proved.

Theorem 7. On the existence of an index. Let the conditions of Theorem 3 be satisfied, let $H = (H(n, N))$, … be a sequence of alternative distributions, let $a_\varphi(\beta, N)$ be the maximum number for which, under the hypothesis $H(n, N)$, the inequality

$P\left(\varphi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right) > a_\varphi(\beta, N)\right) \geq \beta$

holds, and let the limit $\lim_{N \to \infty} a_\varphi(\beta, N) = a$ exist. Then at the point $(\beta, H)$ the criterion index exists and

$j_\varphi(\beta, H) = J(\{\varphi(x) > a, x \in \Omega_\gamma\}, p(z_\gamma))$.

In this case, $j_\varphi(\beta, H)$ …

The Conclusion relates the results obtained to the general goal and the specific tasks set in the dissertation, formulates conclusions based on the results of the research, indicates the scientific novelty and the theoretical and practical value of the work, and lists specific scientific problems identified by the author whose solution appears relevant.

Brief review of the literature on the research topic.

The thesis examines the problem of constructing goodness-of-fit criteria in generalized allocation schemes with the highest value of the criterion index in the class of functions of the form (0.4) under non-converging alternatives.

Generalized allocation schemes were introduced by V. F. Kolchin in /24/. The quantities $\mu_r$ in the multinomial scheme were called the numbers of cells with $r$ particles and were studied in detail in the monograph by V. F. Kolchin, B. A. Sevastyanov and V. P. Chistyakov /27/. The quantities $\mu_r$ in generalized allocation schemes were studied by V. F. Kolchin in /25/, /26/. Statistics of the form (0.3) were first considered by Yu. I. Medvedev in /30/ and were called separable (additively separable) statistics. If the functions $f_\nu$ in (0.3) do not depend on $\nu$, such statistics were called in /31/ symmetric separable statistics. The asymptotic behavior of the moments of separable statistics in generalized allocation schemes was obtained by G. I. Ivchenko in /9/. Limit theorems for the generalized allocation scheme were also considered in /23/. Reviews of results on limit theorems and goodness-of-fit criteria in discrete probabilistic schemes of type (0.2) were given by V. A. Ivanov, G. I. Ivchenko and Yu. I. Medvedev in /8/ and by G. I. Ivchenko, Yu. I. Medvedev and A. F. Ronzhin in /14/. Goodness-of-fit criteria for generalized allocation schemes were considered by A. F. Ronzhin in /38/.

In these works, the properties of statistical criteria were compared from the point of view of relative asymptotic efficiency. Two cases were considered: converging (contiguous) hypotheses, with efficiency in the sense of Pitman, and non-converging hypotheses, with efficiency in the sense of Bahadur, Hodges-Lehmann and Chernoff. The connection between the various types of relative efficiency of statistical tests is discussed, for example, in /49/. As follows from the results of Yu. I. Medvedev in /31/ on the distribution of separable statistics in a multinomial scheme, the criterion based on the chi-square statistic has the greatest asymptotic power under converging hypotheses in the class of separable statistics of the frequencies of outcomes in a multinomial scheme. This result was generalized by A. F. Ronzhin to schemes of type (0.2) in /38/. I. I. Viktorova and V. P. Chistyakov in /4/ constructed an optimal criterion for a multinomial scheme in the class of linear functions of $\mu_r$. A. F. Ronzhin in /38/ constructed a criterion that, for a sequence of alternatives that do not approach the null hypothesis, minimizes the logarithmic rate at which the probability of an error of the first kind tends to zero, in the class of statistics of the form (0.6). A comparison of the relative performance of the chi-square and maximum likelihood ratio statistics under converging and non-converging hypotheses was carried out in /54/. The thesis considers the case of non-converging hypotheses. Studying the relative statistical efficiency of criteria under non-converging hypotheses requires studying the probabilities of extremely large deviations, of order $O(\sqrt{n})$. Such a problem was first solved for a multinomial distribution with a fixed number of outcomes by I. N. Sanov in /40/. The asymptotic optimality of goodness-of-fit tests for testing simple and composite hypotheses for a multinomial distribution with a finite number of outcomes under non-converging alternatives was considered in /48/. The properties of the information distance were previously considered by Kullback and Leibler /29/, /53/ and I. N. Sanov /40/, as well as by Hoeffding /48/. In these works, the continuity of the information distance was considered on finite-dimensional spaces in the Euclidean metric. A number of authors considered a sequence of spaces of increasing dimension, for example Yu. V. Prokhorov /37/ or V. I. Bogachev and A. V. Kolesnikov /1/. Rough (up to logarithmic equivalence) theorems on the probabilities of large deviations of separable statistics in generalized allocation schemes under the Cramér condition were obtained by A. F. Ronzhin in /38/. A. N. Timashev in /42/, /43/ obtained exact (up to equivalence) multidimensional integral and local limit theorems on the probabilities of large deviations of the vector $\mu_{r_1}(n, N), \dots, \mu_{r_s}(n, N)$, where $s, r_1, \dots, r_s$ are fixed integers, …

Statistical problems of hypothesis testing and parameter estimation in a scheme of sampling without replacement, in a slightly different formulation, were considered by G. I. Ivchenko, V. V. Levin and E. E. Timonina /10/, /15/, where estimation problems were solved for a finite population whose size is unknown, and the asymptotic normality of multivariate S-statistics from s independent samples in a scheme of sampling without replacement was proved. Random variables associated with repetitions in sequences of independent trials were studied by A. M. Zubkov, V. G. Mikhailov and A. M. Shoitov in /6/, /7/, /32/, /33/, /34/. An analysis of the main statistical problems of estimation and hypothesis testing within the general Markov-Pólya model was carried out by G. I. Ivchenko and Yu. I. Medvedev in /13/; a probabilistic analysis of this model was given in /11/. A method for specifying non-uniform probability measures on a set of combinatorial objects that is not reducible to the generalized allocation scheme (0.2) was described by G. I. Ivchenko and Yu. I. Medvedev in /12/. A number of problems in probability theory in which the answer can be obtained by calculations using recurrence formulas are indicated by A. M. Zubkov in /5/.

Inequalities for the entropy of discrete distributions were obtained in /50/ (cited from the abstract by A. M. Zubkov in RZhMat). If (p_n), n ≥ 0, is a probability distribution,

Рп = Е Рк, к=п A = supp^Pn+i

I + (In -f-) (X Rn - R n+1)

$$p_n = \frac{\lambda^n}{(\lambda + 1)^{n+1}}, \qquad n \ge 0. \tag{0.15}$$

Note that the extremal distribution (0.15) is a geometric distribution with mathematical expectation λ, and the function F(λ) of the parameter in (0.14) coincides with the function of the mathematical expectation in Theorem 1.
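As a numerical illustration of the geometric distribution mentioned above, the following minimal sketch sums the series for the mean and the entropy of a geometric distribution on n = 0, 1, 2, ... with mean `lam` and compares the entropy with the standard closed form (λ + 1) ln(λ + 1) − λ ln λ. The function names, the truncation point `n_max` and the use of this closed form are assumptions made for the example; the source's own expression (0.14) is not reproduced here.

```python
import math


def geometric_log_pmf(n, lam):
    """Natural log of p_n = lam**n / (lam + 1)**(n + 1), n = 0, 1, 2, ..."""
    return n * math.log(lam) - (n + 1) * math.log(lam + 1.0)


def truncated_mean_and_entropy(lam, n_max=2000):
    """Sum the first n_max + 1 terms of the series for the mean and the entropy."""
    mean = 0.0
    entropy = 0.0
    for n in range(n_max + 1):
        log_p = geometric_log_pmf(n, lam)
        p = math.exp(log_p)       # underflows harmlessly to 0.0 far in the tail
        mean += n * p
        entropy -= p * log_p
    return mean, entropy


def closed_form_entropy(lam):
    """Entropy (in nats) of the geometric distribution with mean lam."""
    return (lam + 1.0) * math.log(lam + 1.0) - lam * math.log(lam)
```

Working in log space avoids overflow in `lam**n` for large n; the truncation error is negligible when the tail decays geometrically.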

Entropy of discrete distributions with bounded mathematical expectation

If the index of a criterion exists, the lower index of the criterion coincides with it. The lower index of a criterion always exists. The larger the index (lower index) of a criterion, the better the statistical criterion in this sense. In /38/, the problem of constructing goodness-of-fit criteria for generalized allocation schemes with the largest index in the class of criteria that reject the hypothesis H_0(n, N) was solved for where m > 0 is some fixed number, the sequence of constants is chosen from the given value of the power of the criterion for a sequence of alternatives, and ft is a real function of m + 1 arguments.

The indices of criteria are determined by the probabilities of large deviations. As was shown in /38/, the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of separable statistics, when the Cramer condition is satisfied for the random variable f(ξ), is determined by the corresponding Kullback-Leibler-Sanov information distance. (A random variable η satisfies the Cramer condition if, for some H > 0, the moment generating function M e^{tη} is finite in the interval |t| < H /28/.)
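The link between rough (logarithmic) asymptotics and information distance can be seen in the simplest case of a binomial scheme. This is a hedged illustration and not the generalized allocation scheme of the dissertation: for n independent trials with success probability p, the quantity −(1/n) ln P(S_n = k) with k/n ≈ q approaches the Kullback-Leibler distance D(q || p) as n grows. All function names below are hypothetical.

```python
import math


def kl_divergence(q, p):
    """Kullback-Leibler information distance D(q || p) in nats for finite distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def log_binomial_pmf(k, n, p):
    """ln P(S_n = k) for a binomial(n, p) variable, computed via log-gamma."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1.0 - p)


def empirical_rate(n, p, q):
    """-(1/n) ln P(S_n = round(n * q)): the observed logarithmic decay rate."""
    k = round(n * q)
    return -log_binomial_pmf(k, n, p) / n
```

For example, `empirical_rate(10_000, 0.5, 0.7)` can be compared with `kl_divergence([0.7, 0.3], [0.5, 0.5])`; the two differ only by a correction of order (ln n)/n.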

The question of the probabilities of large deviations of statistics depending on an unbounded number of the μ_r, as well as of arbitrary separable statistics that do not satisfy the Cramer condition, remained open. This made it impossible to completely solve the problem of constructing criteria for testing hypotheses in generalized allocation schemes with the highest rate of convergence to zero of the probability of an error of the first kind under approaching alternatives in the class of criteria based on statistics of the form (0.4). The relevance of the dissertation research is determined by the need to complete the solution of this problem.

The purpose of the dissertation is to construct goodness-of-fit criteria with the largest index (lower index) for testing hypotheses in a scheme of sampling without replacement, in the class of criteria that reject the hypothesis H(n, N) for where φ is a function of a countable number of arguments and the parameters n, N vary in the central region.

In accordance with the purpose of the study, the following tasks were set:
- to study the properties of the entropy and of the Kullback-Leibler-Sanov information distance for discrete distributions with a countable number of outcomes;
- to study the probabilities of large deviations of statistics of the form (0.4);
- to study the probabilities of large deviations of symmetric separable statistics (0.3) that do not satisfy the Cramer condition;
- to find a statistic such that the goodness-of-fit criterion built on it for testing hypotheses in generalized allocation schemes has the largest index in the class of criteria of the form (0.7).

Scientific novelty:
- the concept of a generalized metric is introduced: a function that admits infinite values and satisfies the axioms of identity, symmetry and the triangle inequality. A generalized metric is found, and sets are indicated on which the entropy and information distance functions, defined on a family of discrete distributions with a countable number of outcomes, are continuous in this metric;
- in the generalized allocation scheme, a rough (up to logarithmic equivalence) asymptotics is found for the probabilities of large deviations of statistics of the form (0.4) that satisfy the corresponding form of the Cramer condition;
- in the generalized allocation scheme, a rough (up to logarithmic equivalence) asymptotics is found for the probabilities of large deviations of symmetric separable statistics that do not satisfy the Cramer condition;
- in the class of criteria of the form (0.7), a criterion with the largest index is constructed.

Scientific and practical value. The work answers a number of questions about the behavior of the probabilities of large deviations in generalized allocation schemes. The results obtained can be used in teaching in the specialties of mathematical statistics and information theory and in the study of statistical procedures for the analysis of discrete sequences; they were used in /3/, /21/ to justify the security of one class of information systems.

Provisions submitted for defense:
- reduction of the problem of testing, from a single sequence of ball colors, the hypothesis that this sequence was obtained by sampling without replacement until exhaustion from an urn containing balls of two colors, with every such sample equally probable, to the construction of goodness-of-fit criteria for testing hypotheses in the corresponding generalized allocation scheme;
- continuity of the entropy and Kullback-Leibler-Sanov information distance functions on an infinite-dimensional simplex with the introduced logarithmic generalized metric;
- a theorem on the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of symmetric separable statistics that do not satisfy the Cramer condition in the generalized allocation scheme in the semi-exponential case;

Continuity of Kullback - Leibler - Sanov information distance

Generalized allocation schemes were introduced by V. F. Kolchin in /24/. The quantities μ_r in the multinomial scheme were called the numbers of cells containing r pellets and were studied in detail in the monograph by V. F. Kolchin, B. A. Sevastyanov and V. P. Chistyakov /27/. The quantities μ_r in generalized allocation schemes were studied by V. F. Kolchin in /25/, /26/. Statistics of the form (0.3) were first considered by Yu. I. Medvedev in /30/ and were called separable (additively separable) statistics. If the functions f_ν in (0.3) do not depend on ν, such statistics were called symmetric separable statistics in /31/. The asymptotic behavior of the moments of separable statistics in generalized allocation schemes was obtained by G. I. Ivchenko in /9/. Limit theorems for a generalized allocation scheme were also considered in /23/. Reviews of results on limit theorems and goodness-of-fit criteria in discrete probabilistic schemes of type (0.2) were given by V. A. Ivanov, G. I. Ivchenko and Yu. I. Medvedev in /8/ and by G. I. Ivchenko, Yu. I. Medvedev and A. F. Ronzhin in /14/. Goodness-of-fit criteria for generalized allocation schemes were considered by A. F. Ronzhin in /38/.


The study of the probabilities of large deviations when the Cramer condition is not met for the case of independent random variables was carried out in the works of A. V. Nagaev /35/. The method of conjugate distributions is described by Feller /45/.


Information distance and large deviation probabilities of separable statistics

When Cramer's condition is not satisfied, large deviations of separable statistics in the generalized allocation scheme in the semi-exponential case under consideration are determined by the probability of a deviation of a single independent term. When Cramer's condition is satisfied this, as emphasized in /39/, is not the case.

Remark 10. The function φ(x) is such that the mathematical expectation of Its АН) is finite for 0 < t < 1 and infinite for t > 1.

Remark 11. For separable statistics that do not satisfy the Cramer condition, the limit (2.14) is equal to 0, which proves the conjecture stated in /39/.

Remark 12. For the chi-square statistic in a multinomial scheme, as n, N → ∞ so that n/N → λ, the theorem immediately yields a result that was obtained directly in /54/.

In this chapter, in the central region of variation of the parameters of generalized schemes of allocating particles to cells, rough (up to logarithmic equivalence) asymptotics were found for the probabilities of large deviations of additively separable statistics and of functions of the numbers of cells with a given filling.

If Cramer's condition is satisfied, the rough asymptotics of the probabilities of large deviations is determined by the rough asymptotics of the probabilities of hitting a sequence of points with rational coordinates that converges, in the above sense, to the point at which the extremum of the corresponding information distance is attained.

The semi-exponential case of failure of Cramer's condition was considered for the random variables f(ξ_1), ..., f(ξ_N), where ξ_1, ..., ξ_N are independent random variables generating the generalized allocation scheme (0.2) and f(k) is the function in the definition of the symmetric additively separable statistic (0.3). That is, it was assumed that the functions p(k) = -ln P(ξ_1 = k) and f(k) can be extended to regularly varying functions of a continuous argument of orders p > 0 and q > 0, respectively, with p < q. It turned out that the main contribution to the rough asymptotics of the probabilities of large deviations of separable statistics in generalized allocation schemes is similarly made by the rough asymptotics of the probability of hitting the corresponding sequence of points. It is interesting to note that previously the theorem on the probabilities of large deviations for separable statistics was proved by the saddle point method, with the main contribution to the asymptotics made by a single saddle point. The case where, if the Cramer condition is not met, the 2-kN condition is not satisfied remains unexplored.

If Cramer's condition is not satisfied, then the specified condition may not be satisfied only in the case of p 1. As follows directly from taking the logarithm of the corresponding probabilities, p = 1 for the Poisson distribution and for the geometric distribution. From the result on the asymptotics of the probabilities of large deviations when the Cramer condition is not met, we can conclude that criteria whose statistics do not satisfy the Cramer condition have a significantly lower rate of convergence to zero of the probabilities of errors of the second kind, with a fixed probability of an error of the first kind and non-converging alternatives, than criteria whose statistics satisfy the Cramer condition.

Let balls be drawn without replacement, until complete exhaustion, from an urn containing N - 1 ≥ 1 white and n - N ≥ 1 black balls. We associate with the positions 1 ≤ i_1 < ... < i_{N-1} ≤ n - 1 of the white balls in the sample the sequence of distances between neighboring white balls h_1, ..., h_N, so that h_ν ≥ 1, ν = 1, ..., N, and h_1 + ... + h_N = n. Let us define a probability distribution on the set of vectors h = (h_1, ..., h_N) by means of independent non-negative integer random variables (r.v.) ξ_1, ..., ξ_N, that is, consider the generalized allocation scheme (0.2). The distribution of the vector h depends on n, N, but the corresponding indices will be omitted where possible to simplify notation.

Remark 14. If each way of selecting the balls from the urn is assigned the same probability, then for any r_1, ..., r_N such that r_ν ≥ 1, ν = 1, ..., N, and r_1 + ... + r_N = n, the probability that the distances between adjacent white balls in the sample will take these values

Criteria based on the number of cells in general layouts

The purpose of the dissertation was to construct goodness-of-fit criteria for testing hypotheses in a scheme of sampling without replacement from an urn containing balls of two colors. The author chose to study statistics based on the frequencies of distances between balls of the same color. In this formulation, the problem reduces to testing hypotheses in a suitable generalized allocation scheme.
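The statistics described above can be sketched in code. The snippet below is a minimal illustration under stated assumptions: a sample is a sequence of n - 1 ball colors (`"W"` for white, `"B"` for black), the distances are measured between consecutive white balls with artificial boundary positions 0 and n, and the frequency of each distance value is counted. The boundary convention and all function names are assumptions made for the example, not taken from the source.

```python
import random
from collections import Counter


def white_ball_gaps(colors):
    """Distances h_1, ..., h_N between neighboring white balls.

    colors: sequence of "W"/"B" of length n - 1.
    Positions 0 and n are used as artificial boundaries (an assumption),
    so every gap is >= 1 and the gaps sum to n.
    """
    n = len(colors) + 1
    positions = [i for i, c in enumerate(colors, start=1) if c == "W"]
    boundaries = [0] + positions + [n]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]


def gap_frequencies(gaps):
    """Number of gaps equal to r, for each observed r."""
    return Counter(gaps)


def random_sample(num_white, num_black, rng):
    """Draw all balls without replacement; every ordering is equally likely."""
    balls = ["W"] * num_white + ["B"] * num_black
    rng.shuffle(balls)
    return balls
```

A sample with N - 1 white balls always yields N gaps, so `gap_frequencies(white_ball_gaps(random_sample(3, 4, random.Random(0))))` returns counts that add up to 4.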

In the dissertation:
- the properties of the entropy and information distance of discrete distributions with an unbounded number of outcomes and a bounded mathematical expectation were studied;
- a rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of a wide class of statistics in a generalized allocation scheme was obtained;
- based on these results, a criterion function was constructed with the highest logarithmic rate of convergence to zero of the probability of an error of the first kind with a fixed probability of an error of the second kind and non-converging alternatives;
- it was proved that statistics that do not satisfy the Cramer condition have a lower rate of convergence to zero of the probabilities of large deviations than statistics that satisfy this condition.

The scientific novelty of the work is as follows:
- the concept of a generalized metric is introduced: a function that admits infinite values and satisfies the axioms of identity, symmetry and the triangle inequality. A generalized metric is found, and sets are indicated on which the entropy and information distance functions, defined on a family of discrete distributions with a countable number of outcomes, are continuous in this metric;
- in the generalized allocation scheme, a rough (up to logarithmic equivalence) asymptotics is found for the probabilities of large deviations of statistics of the form (0.4) that satisfy the corresponding form of the Cramer condition;
- in the generalized allocation scheme, a rough (up to logarithmic equivalence) asymptotics is found for the probabilities of large deviations of symmetric separable statistics that do not satisfy the Cramer condition;
- in the class of criteria of the form (0.7), a criterion with the largest index is constructed.

The work answers a number of questions about the behavior of the probabilities of large deviations in generalized allocation schemes. The results obtained can be used in teaching in the specialties of mathematical statistics and information theory and in the study of statistical procedures for the analysis of discrete sequences; they were used in /3/, /21/ to justify the security of one class of information systems.

However, a number of questions remain open. The author restricted himself to the central zone of variation of the parameters n, N of generalized schemes of allocating n particles to N cells. If the support of the distribution of the random variables generating the generalized allocation scheme (0.2) is not a set of the form r, r + 1, r + 2, ..., then in proving the continuity of the information distance function and in studying the probabilities of large deviations one must take into account the arithmetic structure of this support, which was not considered in the author's work. For the practical application of criteria built on the proposed function with the maximum index, its distribution must be studied both under the null hypothesis and under alternatives, including converging ones. It is also of interest to transfer the developed methods and generalize the results obtained to probabilistic schemes other than generalized allocation schemes.

If μ_1, μ_2, ... are the frequencies of the distances between the positions of outcome 0 in a binomial scheme with outcome probabilities p_0, 1 - p_0, then a distribution of the form (3.3) can be shown to hold in this case. From the analysis of the formula for the joint distribution of the quantities μ_r in a generalized allocation scheme, proved in /26/, it follows that distribution (3.3), generally speaking, cannot be represented as the joint distribution of the quantities μ_r in any generalized scheme of allocating particles to cells. This distribution is a special case of the distributions on the set of combinatorial objects introduced in /12/. It seems an urgent task to transfer the results of the dissertation for generalized allocation schemes to this case, which was discussed in /52/.

To describe asymptotic estimates there is a notation system:

§ We say that f(n) = O(g(n)) if there exist a constant c > 0 and a number n0 such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0. More formally:

$$O(g(n)) = \{\, f(n) \mid \exists c > 0,\ \exists n_0 > 0,\ \forall n > n_0:\ 0 \le f(n) \le c\, g(n) \,\}$$

O(g(n)) denotes functions that exceed g(n) by at most a constant factor; this notation is used to describe upper bounds (in the sense of "no worse than"). When we analyze the time complexity of a specific algorithm for a specific problem, the goal is to obtain an estimate of its worst-case or average-case running time: usually an asymptotic upper bound O(g(n)), if possible also an asymptotic lower bound Ω(g(n)), and, better still, an asymptotically tight bound Θ(g(n)).
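The definition of O(g(n)) can be explored numerically. The sketch below checks, over a finite range only, whether a candidate pair of witnesses (c, n0) satisfies 0 ≤ f(n) ≤ c*g(n); the sample functions f(n) = 3n² + 10n and g(n) = n² and the helper name `bound_holds` are assumptions chosen for illustration. A finite check does not prove the bound, but it quickly exposes wrong witnesses.

```python
def bound_holds(f, g, c, n0, n_max):
    """Check 0 <= f(n) <= c * g(n) for every integer n with n0 <= n <= n_max."""
    return all(0 <= f(n) <= c * g(n) for n in range(n0, n_max + 1))


def f(n):
    """Sample running-time function: 3n^2 + 10n."""
    return 3 * n * n + 10 * n


def g(n):
    """Candidate growth rate: n^2."""
    return n * n
```

Here c = 4 and n0 = 10 work, because 3n² + 10n ≤ 4n² is equivalent to n ≥ 10, while c = 3 fails for every positive n.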

But the question remains: could there be even better algorithms for this problem? This question poses the problem of finding a lower bound on the time complexity of the problem itself (over all possible algorithms for solving it, not just one of the known ones). Obtaining nontrivial lower bounds is very difficult. To date there are not many such results, but nontrivial lower bounds have been proved for some restricted models of computation, and some of them play an important role in practical programming. One of the problems for which a lower bound on time complexity is known is the sorting problem:

§ Given a sequence of n elements a1, a2, ..., an chosen from a set on which a linear order is defined.

§ It is required to find a permutation π of these n elements that maps the given sequence into a non-decreasing sequence aπ(1), aπ(2), ..., aπ(n), i.e. aπ(i) ≤ aπ(i+1) for 1 ≤ i < n.

Reduction method. Suppose we have two problems A and B that are related in such a way that problem A can be solved as follows:

1) The input data for problem A are converted into the corresponding input data for problem B.

2) Problem B is solved.

3) The result of solving problem B is converted into a correct solution of problem A.

In this case we say that problem A is reducible to problem B. If steps (1) and (3) above can be completed in time O(t(n)), where, as usual, n is the "size" of problem A, then we say that A is t(n)-reducible to B and write A ∝t(n) B. Generally speaking, reducibility is not a symmetric relation; in the special case when A and B are mutually reducible, we call them equivalent. The following two self-evident statements characterize the power of the reduction method under the assumption that the reduction preserves the order of the "size" of the problem.
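The three steps of a reduction can be sketched on a standard textbook pair of problems, chosen here as an assumption for illustration: problem A is "does the sequence contain two equal elements?" and problem B is sorting. Steps (1) and (3) both take O(n) time, so in the terminology above A is n-reducible to B. The function names are hypothetical.

```python
def solve_sorting(values):
    """Problem B: return the elements in non-decreasing order."""
    return sorted(values)


def has_duplicates_via_sorting(values):
    """Problem A solved by reduction to problem B."""
    # Step 1: convert the input of A into an input of B (a plain copy, O(n)).
    b_input = list(values)
    # Step 2: solve problem B.
    b_output = solve_sorting(b_input)
    # Step 3: convert the answer of B into an answer of A (one linear scan, O(n)):
    # equal elements must be adjacent after sorting.
    return any(x == y for x, y in zip(b_output, b_output[1:]))
```

For instance, `has_duplicates_via_sorting([3, 1, 2, 3])` reports a repeated element, while an empty or one-element input has none.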

"O" big And "o" small( and ) - mathematical notations for comparing the asymptotic behavior of functions. They are used in various branches of mathematics, but most actively in mathematical analysis, number theory and combinatorics, as well as in computer science and the theory of algorithms.

o(f), "little o of f", denotes a quantity that is "infinitesimal relative to f", negligible when f is considered. The meaning of the term "big O" depends on its field of application, but O(f), "big O of f", always grows no faster than f.

In particular:

the phrase "the complexity of the algorithm is O(n!)" means that, as the parameter n characterizing the amount of input to the algorithm increases, the running time of the algorithm grows no faster than a constant multiple of n!;

the phrase "the function f is 'little o' of the function g in a neighborhood of the point p" means that as x approaches p, f(x) decreases faster than g(x) (the ratio |f(x)|/|g(x)| tends to zero).

Sum rule: Let a finite set M be divided into two disjoint subsets M1 and M2 (whose union is the whole set M). Then the cardinality |M| = |M1| + |M2|.

Product rule: Let an object a be chosen from some set in n ways, and after that (that is, after a has been chosen) let an object b be chosen in m ways. Then the pair ab can be chosen in n*m ways.

Remark: Both rules admit inductive generalization. If a finite set M admits a partition into r pairwise disjoint subsets M1, M2, ..., Mr, then the cardinality |M| = |M1| + |M2| + ... + |Mr|. If object A1 can be chosen in k1 ways, then (after A1 is chosen) object A2 can be chosen in k2 ways, and so on, and finally object Ar can be chosen in kr ways, then the object A1A2...Ar can be chosen in k1k2...kr ways.
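Both counting rules are easy to check by brute-force enumeration. The following minimal sketch uses hypothetical helper names and small sample sets chosen for illustration; `itertools.product` enumerates all ordered selections so that the product rule can be compared with an explicit count.

```python
from itertools import product


def count_by_sum_rule(parts):
    """Cardinality of a union of pairwise disjoint finite sets."""
    return sum(len(part) for part in parts)


def count_by_product_rule(choice_counts):
    """Number of ways to make successive choices with k1, k2, ..., kr options."""
    total = 1
    for k in choice_counts:
        total *= k
    return total


# Sample data (assumed for the example).
evens = {0, 2, 4}
odds = {1, 3}
letters = "ab"
digits = "123"
pairs = list(product(letters, digits))  # every (letter, digit) selection
```

Here `len(evens | odds)` agrees with `count_by_sum_rule([evens, odds])`, and `len(pairs)` agrees with `count_by_product_rule([len(letters), len(digits)])`.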

asymptotically optimal

  • - a concept that states that the estimate is unbiased in the limit. Let be a sequence of random variables on a probability space, where R is one of the measures of the family...

    Mathematical Encyclopedia

  • - a concept that asserts the unbiasedness of the criterion in the limit...

    Mathematical Encyclopedia

  • - a solution to a differential system that is Lyapunov stable and attracts all other solutions with sufficiently close initial values...

    Mathematical Encyclopedia

  • - a concept that extends the idea of ​​efficient estimation to the case of large samples. An unambiguous definition of A. e. O. does not have. For example, in the classic option we are talking about asymptotic...

    Mathematical Encyclopedia

  • - desirable, expedient...

    Reference commercial dictionary

  • - 1. best, most favorable, most appropriate to certain conditions and tasks 2...

    Large economic dictionary

  • - the most favorable, the best possible...

    Great Soviet Encyclopedia

  • - the best, most appropriate for certain conditions and tasks...

    Modern encyclopedia

  • - the best, most appropriate for certain conditions and tasks...

    Large encyclopedic dictionary




Introduction of the dissertation (part of the abstract) on the topic “Asymptotic properties of goodness-of-fit criteria for testing hypotheses in a selection scheme without returning, based on filling cells in a generalized placement scheme”

Object of research and relevance of the topic. In the theory of statistical analysis of discrete sequences, a special place is occupied by goodness-of-fit criteria for testing a possibly composite null hypothesis, which states that for a random sequence $(X_i)_{i=1}^{n}$ such that

$X_i \in I_M$, $i = 1, \dots, n$, where $I_M = \{0, 1, \dots, M\}$,

for any $i = 1, \dots, n$ and any $k \in I_M$ the probability of the event $\{X_i = k\}$ does not depend on $i$. This means that the sequence is, in a certain sense, stationary.

In a number of applied problems, the sequence $(X_i)$ is taken to be the sequence of colors of balls drawn without return, until exhaustion, from an urn containing $n_k - 1 > 0$ balls of color $k$, $k \in I_M$. We denote the set of such selections by $\mathfrak{M}(n_0 - 1, \dots, n_M - 1)$. Let the urn contain $n - 1$ balls in total:

$\sum_{k=0}^{M} (n_k - 1) = n - 1.$

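Let $r^{(k)} = (r^{(k)}_1, \dots, r^{(k)}_{n_k - 1})$ denote the sequence of positions of the balls of color $k$ in the sample. Consider the sequence $h^{(k)} = (h^{(k)}_1, \dots, h^{(k)}_{n_k})$, where

$h^{(k)}_1 = r^{(k)}_1$, $h^{(k)}_v = r^{(k)}_v - r^{(k)}_{v-1}$ for $v = 2, \dots, n_k - 1$, $h^{(k)}_{n_k} = n - r^{(k)}_{n_k - 1}$.

The sequence $h^{(k)}$ is thus defined by the distances between the positions of adjacent balls of color $k$, in such a way that

$\sum_{v=1}^{n_k} h^{(k)}_v = n.$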

The set of sequences $h^{(k)}$ for all $k \in I_M$ uniquely determines the sequence $(X_i)$. The sequences $h^{(k)}$ for different $k$ depend on one another; in particular, any one of them is uniquely determined by all the others. If the cardinality of the set $I_M$ is 2, then the sequence of ball colors is uniquely determined by the sequence of distances between the positions of neighboring balls of one fixed color. Let an urn containing $n - 1$ balls of two different colors hold $N - 1$ balls of color 0. A one-to-one correspondence can be established between the set $\mathfrak{M}(N - 1, n - N)$ and the set $\mathfrak{R}_{n,N}$ of vectors $h(n, N) = (h_1, \dots, h_N)$ with positive integer components such that

$\sum_{v=1}^{N} h_v = n. \qquad (0.1)$

The set $\mathfrak{R}_{n,N}$ corresponds to the set of all distinct partitions of the positive integer $n$ into $N$ ordered terms.
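The correspondence between selections and distance vectors can be illustrated with a minimal Python sketch. It assumes one particular convention that is not spelled out in the text: the first component is the position of the first ball of color 0, and the last component is $n$ minus the position of the last such ball. The function names are hypothetical.

```python
from itertools import combinations

def positions_to_distances(positions, n):
    """Map positions r_1 < ... < r_{N-1} in {1, ..., n-1} to h = (h_1, ..., h_N).

    Assumed convention: h_1 = r_1, h_v = r_v - r_{v-1}, h_N = n - r_{N-1}.
    """
    pts = [0] + list(positions) + [n]
    return tuple(b - a for a, b in zip(pts, pts[1:]))

def distances_to_positions(h):
    """Inverse map: partial sums of h without the last one (which equals n)."""
    out, total = [], 0
    for step in h[:-1]:
        total += step
        out.append(total)
    return tuple(out)

n, N = 7, 3
# Every way of placing N - 1 balls of color 0 among n - 1 draws.
selections = list(combinations(range(1, n), N - 1))
vectors = [positions_to_distances(s, n) for s in selections]
```

Each selection yields a vector of $N$ positive integers summing to $n$, distinct selections yield distinct vectors, and the inverse map recovers the selection, which is what a one-to-one correspondence requires.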

Specifying a probability distribution on the set of vectors $\mathfrak{R}_{n,N}$ yields the corresponding probability distribution on the set $\mathfrak{M}(N - 1, n - N)$. The set $\mathfrak{R}_{n,N}$ is a subset of the set of vectors with non-negative integer components satisfying (0.1). In the dissertation, the probability distributions considered on the set of vectors are distributions of the form

$P\{h(n, N) = (r_1, \dots, r_N)\} = P\{\xi_v = r_v,\ v = 1, \dots, N \mid \sum_{v=1}^{N} \xi_v = n\}, \qquad (0.2)$

where $\xi_1, \dots, \xi_N$ are independent non-negative integer-valued random variables.

Distributions of the form (0.2) are called in /24/ generalized schemes for placing $n$ particles in $N$ cells. In particular, if the random variables $\xi_1, \dots, \xi_N$ in (0.2) have Poisson distributions with parameters $\lambda_1, \dots, \lambda_N$ respectively, then the vector $h(n, N)$ has a multinomial distribution with outcome probabilities

$p_v = \frac{\lambda_v}{\lambda_1 + \dots + \lambda_N}, \quad v = 1, \dots, N.$
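The Poisson case can be checked numerically. The sketch below is only an illustration with arbitrarily chosen parameters: it compares the conditional probability on the right-hand side of (0.2) with the multinomial probability, using the standard fact that a sum of independent Poisson variables is Poisson with the summed parameter.

```python
from itertools import product
from math import exp, factorial, isclose

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

def conditional_pmf(r, lams):
    """P{xi_v = r_v for all v | sum of xi_v = n} for independent Poisson xi_v."""
    joint = 1.0
    for rv, lam in zip(r, lams):
        joint *= poisson_pmf(rv, lam)
    # The sum of independent Poisson variables is Poisson(sum of parameters).
    return joint / poisson_pmf(sum(r), sum(lams))

def multinomial_pmf(r, probs):
    coef = factorial(sum(r))
    for rv in r:
        coef //= factorial(rv)  # stays an integer at every step
    value = float(coef)
    for rv, p in zip(r, probs):
        value *= p**rv
    return value

lams = (0.5, 1.5, 2.0)          # illustrative parameters, not from the text
probs = [lam / sum(lams) for lam in lams]
n = 5
max_gap = max(
    abs(conditional_pmf(r, lams) - multinomial_pmf(r, probs))
    for r in product(range(n + 1), repeat=len(lams))
    if sum(r) == n
)
```

For every vector $r$ with components summing to $n$ the two probabilities agree up to rounding, so `max_gap` is of the order of machine precision.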

If the random variables $\xi_1, \dots, \xi_N$ in (0.2) are identically distributed according to a geometric law with parameter $p$, where $p$ is any number in the interval $0 < p < 1$, then, as noted in /25/, /26/, the resulting generalized placement scheme corresponds to the uniform distribution on the set $\mathfrak{R}_{n,N}$. By the one-to-one correspondence between the set $\mathfrak{M}(N - 1, n - N)$ and the set $\mathfrak{R}_{n,N}$, we obtain the uniform distribution on the set of selections without return. Here the vector of distances between the positions of balls of one color corresponds one-to-one to the frequency vector in the generalized placement scheme, and, accordingly, the number of distances of length $r$ corresponds to the number of cells containing exactly $r$ particles. To test, from a single sequence, the hypothesis that it was obtained as a result of selection without return and that every such sample has the same probability, one can test the hypothesis that the vector of distances between the positions of balls of color 0 is distributed as the frequency vector in the corresponding generalized scheme of placing $n$ particles in $N$ cells.
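The uniformity claim can be verified by direct enumeration for small $n$ and $N$. The sketch below assumes the geometric law is taken on $\{1, 2, \dots\}$, that is, $P\{\xi = k\} = (1 - p)p^{k-1}$; this parametrization is an assumption made for the illustration, chosen because the components of the distance vector are positive.

```python
from itertools import combinations
from math import comb, isclose

def compositions(n, N):
    """All vectors of N positive integers summing to n."""
    for cuts in combinations(range(1, n), N - 1):
        pts = (0,) + cuts + (n,)
        yield tuple(b - a for a, b in zip(pts, pts[1:]))

def geometric_pmf(k, p):
    # Assumed parametrization: support {1, 2, ...}.
    return (1 - p) * p ** (k - 1)

def conditional_law(n, N, p):
    """Law of (xi_1, ..., xi_N) given that the sum equals n."""
    weights = {}
    for h in compositions(n, N):
        w = 1.0
        for hv in h:
            w *= geometric_pmf(hv, p)
        weights[h] = w
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

law = conditional_law(n=8, N=3, p=0.3)
```

Every vector receives the joint weight $(1 - p)^N p^{n - N}$, which does not depend on the vector, so the conditional law assigns probability $1 / \binom{n-1}{N-1}$ to each of them regardless of $p$.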

As noted in /14/, /38/, a special place in testing hypotheses about the distribution of the frequency vectors $h(n, N) = (h_1, \dots, h_N)$ in generalized schemes for placing $n$ particles in $N$ cells is occupied by criteria based on statistics of the form

$L_N(h(n, N)) = \sum_{v=1}^{N} f_v(h_v), \qquad (0.3)$

$\Phi_N = \phi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right), \qquad (0.4)$

where $f_v$, $v = 1, 2, \dots$, and $\phi$ are some real-valued functions, and

$\mu_r = \sum_{v=1}^{N} I\{h_v = r\}, \quad r = 0, 1, \dots$

The quantities $\mu_r$ were called in /27/ the number of cells containing exactly $r$ particles.

Statistics of the form (0.3) are called in /30/ separable (additively separable) statistics. If the functions $f_v$ in (0.3) do not depend on $v$, such statistics were called in /31/ symmetric separable statistics.

For any $r$, the statistic $\mu_r$ is a symmetric separable statistic. From the equality

$\sum_{v=1}^{N} f(h_v) = \sum_{r=0}^{\infty} f(r)\,\mu_r \qquad (0.5)$

it follows that the class of symmetric separable statistics of $h_v$ coincides with the class of linear functions of $\mu_r$. Moreover, the class of functions of the form (0.4) is wider than the class of symmetric separable statistics.
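Equality (0.5) is a regrouping of the sum by cell content, and a short Python sketch makes this concrete. The frequency vector and the function $f$ below are arbitrary illustrative choices.

```python
from collections import Counter

def separable_statistic(h, f):
    """Left-hand side of (0.5): the sum of f(h_v) over cells."""
    return sum(f(hv) for hv in h)

def via_cell_counts(h, f):
    """Right-hand side of (0.5): the sum of f(r) * mu_r over cell contents r."""
    mu = Counter(h)  # mu[r] = number of cells containing exactly r particles
    return sum(f(r) * count for r, count in mu.items())

h = (0, 2, 1, 0, 3, 2, 0)   # illustrative frequency vector, N = 7 cells
f = lambda r: r * r
lhs = separable_statistic(h, f)
rhs = via_cell_counts(h, f)
```

Both sides give the same number because each cell with content $r$ contributes $f(r)$ exactly once, and there are $\mu_r$ such cells.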

Let $H_0 = (H_0(n, N))$ be a sequence of simple null hypotheses stating that the distribution of the vector $h(n, N)$ is (0.2), where the random variables $\xi_1, \dots, \xi_N$ in (0.2) are identically distributed with $P\{\xi_1 = k\} = p_k$, $k = 0, 1, 2, \dots$, and the parameters $n, N$ vary in the central region.

Consider some $\beta \in (0, 1)$ and a sequence of, generally speaking, composite alternatives

$H = (H(n, N))$ such that there exists $a_{n,N}(\beta)$, the maximum number for which, for any simple hypothesis $H_1 \in H(n, N)$, the inequality

$P\{\Phi_N > a_{n,N}(\beta)\} > \beta$

holds.

We reject the hypothesis $H_0(n, N)$ if $\Phi_N > a_{n,N}(\beta)$. If the limit

$\lim_{N \to \infty} \left(-\frac{1}{N} \ln P\{\Phi_N > a_{n,N}(\beta)\}\right) = \varkappa_\phi(\beta, H)$

exists, where the probability for each $N$ is calculated under the hypothesis $H_0(n, N)$, then the value $\varkappa_\phi(\beta, H)$ is called in /38/ the index of the criterion $\phi$ at the point $(\beta, H)$. This limit may, generally speaking, fail to exist. Therefore, in addition to the criterion index, the dissertation considers the lower index of the criterion,

$\liminf_{N \to \infty} \left(-\frac{1}{N} \ln P\{\Phi_N > a_{n,N}(\beta)\}\right),$

where $\liminf_{N \to \infty} a_N$ and $\limsup_{N \to \infty} a_N$ denote, respectively, the lower and upper limits of a sequence $(a_N)$ as $N \to \infty$.

If the criterion index exists, the lower index of the criterion coincides with it. The lower index of the criterion always exists. The larger the value of the criterion index (lower index of the criterion), the better the statistical criterion in this sense. In /38/, the problem was solved of constructing goodness-of-fit criteria for generalized placement schemes with the largest value of the criterion index in the class of criteria that reject the hypothesis $H_0(n, N)$ when

$\phi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots, \frac{\mu_m}{N}\right) > a_N, \qquad (0.6)$

where $m > 0$ is some fixed number, the sequence of constants $a_N$ is chosen from the given value of the power of the criterion against the sequence of alternatives, and $\phi$ is a real function of $m + 1$ arguments.

The criterion indices are determined by the probabilities of large deviations. As shown in /38/, the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of separable statistics, when the Cramér condition holds for the random variable $f(\xi)$, is determined by the corresponding Kullback-Leibler-Sanov information distance. (A random variable $\eta$ satisfies the Cramér condition if, for some $H > 0$, the moment generating function $\mathsf{M} e^{t\eta}$ is finite in the interval $|t| < H$ /28/; here $\mathsf{M}$ denotes mathematical expectation.)

The question of the probabilities of large deviations of statistics depending on an unbounded number of the $\mu_r$, as well as of arbitrary separable statistics that do not satisfy the Cramér condition, remained open. This prevented a final solution of the problem of constructing criteria for testing hypotheses in generalized placement schemes with the highest rate of decay to zero of the probability of an error of the first kind under non-converging alternatives in the class of criteria based on statistics of the form (0.4). The relevance of the dissertation research is determined by the need to complete the solution of this problem.

The purpose of the dissertation is to construct goodness-of-fit criteria with the largest value of the criterion index (lower index of the criterion) for testing hypotheses in a selection scheme without return, in the class of criteria that reject the hypothesis $H_0(n, N)$ when

$\phi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right) > a_N, \qquad (0.7)$

where $\phi$ is a function of a countable number of arguments, and the parameters $n, N$ vary in the central region.

In accordance with the purpose of the study, the following tasks were set:

Investigate the properties of the entropy and of the Kullback-Leibler-Sanov information distance for discrete distributions with a countable number of outcomes;

Investigate the probabilities of large deviations of statistics of the form (0.4);

Investigate the probabilities of large deviations of symmetric separable statistics (0.3) that do not satisfy the Cramer condition;

Find a statistic such that the goodness-of-fit criterion constructed on its basis for testing hypotheses in generalized placement schemes has the highest index value in the class of criteria of the form (0.7).

Scientific novelty:

Scientific and practical value. The work solves a number of questions about the behavior of the probabilities of large deviations in generalized placement schemes. The results obtained can be used in teaching in the specialties of mathematical statistics and information theory and in the study of statistical procedures for the analysis of discrete sequences; they were used in /3/, /21/ to justify the security of one class of information systems.

Provisions for defense:

Reducing the problem of testing, based on a single sequence of ball colors, the hypothesis that this sequence is obtained as a result of a choice without returning until the balls are exhausted from an urn containing balls of two colors, and each such choice has the same probability, to the construction of goodness-of-fit criteria for testing hypotheses in the corresponding generalized layout;

Continuity of the entropy and Kullback-Leibler-Sanov information distance functions on an infinite-dimensional simplex with the introduced logarithmic generalized metric;

A theorem on the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of symmetric separable statistics that do not satisfy the Cramer condition in the generalized placement scheme in the semi-exponential case;

Theorem on rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations for statistics of the form (0.4);

Construction of a goodness-of-fit criterion for testing hypotheses in generalized layouts with the highest index value in the class of criteria of the form (0.7).

Approbation of work. The results were presented at seminars of the Department of Discrete Mathematics of the V. A. Steklov Mathematical Institute of the RAS, of the information security department of the S. A. Lebedev ITM&VT of the RAS, and at:

Fifth All-Russian Symposium on Applied and Industrial Mathematics. Spring session, Kislovodsk, May 2 - 8, 2004;

Sixth International Petrozavodsk Conference "Probabilistic methods in discrete mathematics" June 10 - 16, 2004;

Second International Conference "Information Systems and Technologies (IST'2004)", Minsk, November 8 - 10, 2004;

International conference "Modern Problems and new Trends in Probability Theory", Chernivtsi, Ukraine, June 19 - 26, 2005.

The main results of the work were used in the research work "Apology", carried out by the S. A. Lebedev ITM&VT of the RAS in the interests of the Federal Service for Technical and Export Control of the Russian Federation, and were included in the report on the implementation of the research stage /21/. Some results of the dissertation were included in the research report "Development of mathematical problems of cryptography" of the Academy of Cryptography of the Russian Federation for 2004 /22/.

The author expresses deep gratitude to the scientific supervisor, Doctor of Physical and Mathematical Sciences A. F. Ronzhin, and to the scientific consultant, Doctor of Physical and Mathematical Sciences, Senior Researcher A. V. Knyazev. The author is also grateful to Doctor of Physical and Mathematical Sciences, Professor A. M. Zubkov and Candidate of Physical and Mathematical Sciences I. A. Kruglov for their attention to the work and a number of valuable comments.

Structure and content of the work.

The first chapter examines the properties of entropy and information distance for distributions on the set of non-negative integers.

In the first section of the first chapter, notation is introduced and the necessary definitions are given. In particular, the following notation is used: $x = (x_0, x_1, \dots)$ is an infinite-dimensional vector with a countable number of components;

$H(x) = -\sum_{v=0}^{\infty} x_v \ln x_v$; $\mathrm{trunc}_m(x) = (x_0, x_1, \dots, x_m, 0, 0, \dots)$; $\Omega^* = \{x : x_v \ge 0,\ v = 0, 1, \dots,\ \sum_{v=0}^{\infty} x_v \le 1\}$; $\Omega = \{x : x_v \ge 0,\ v = 0, 1, \dots,\ \sum_{v=0}^{\infty} x_v = 1\}$; $\Omega_\gamma = \{x \in \Omega : \sum_{v=0}^{\infty} v x_v = \gamma\}$;

Ml = o Ue>1|5 € o< Ml - 7МГ1 < 00}. Clearly, the set $\Omega$ corresponds to the family of probability distributions on the set of non-negative integers, and $\Omega_\gamma$ to the family of probability distributions on the set of non-negative integers with mathematical expectation $\gamma$.

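If $y \in \Omega$, then for $\varepsilon > 0$ we denote by $O_\varepsilon(y)$ the set

$O_\varepsilon(y) = \{x \in \Omega : y_v e^{-\varepsilon} \le x_v \le y_v e^{\varepsilon} \text{ for all } v = 0, 1, \dots\}.$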

In the second section of the first chapter, a theorem on the boundedness of the entropy of discrete distributions with bounded mathematical expectation is proved.

Theorem 1. On the boundedness of the entropy of discrete distributions with bounded mathematical expectation.

For any $x \in \Omega_\gamma$

$H(x) \le F(\gamma).$

If $x \in \Omega_\gamma$ corresponds to the geometric distribution with mathematical expectation $\gamma$, that is,

$x_v = (1 - p) p^v, \quad v = 0, 1, \dots, \quad \text{where } p = \frac{\gamma}{1 + \gamma},$

then the equality

$H(x) = F(\gamma)$

holds.

The statement of the theorem can be viewed as the result of a formal application of the method of Lagrange multipliers in the case of an infinite number of variables. The theorem that the only distribution on the set $\{k, k + 1, k + 2, \dots\}$ with a given mathematical expectation and maximum entropy is the geometric distribution with that mathematical expectation is given (without proof) in /47/. The author, however, gives a rigorous proof.
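A numerical illustration of the theorem, under one stated assumption: the explicit form of $F$ does not survive in the text above, so the sketch takes $F(\gamma)$ to be the entropy of the geometric distribution with mean $\gamma$, which a direct calculation gives as $(1 + \gamma)\ln(1 + \gamma) - \gamma \ln \gamma$. The Poisson distribution is used only as an example of another member of $\Omega_\gamma$.

```python
from math import exp, lgamma, log

def entropy(x):
    """H(x) = -sum x_v ln x_v, with the convention 0 ln 0 = 0."""
    return -sum(xv * log(xv) for xv in x if xv > 0)

def geometric(gamma, size):
    p = gamma / (1 + gamma)
    return [(1 - p) * p**v for v in range(size)]

def poisson(gamma, size):
    return [exp(-gamma + v * log(gamma) - lgamma(v + 1)) for v in range(size)]

def F(gamma):
    # Assumed closed form: entropy of the geometric law with mean gamma.
    return (1 + gamma) * log(1 + gamma) - gamma * log(gamma)

gamma = 2.0
size = 400   # truncation point; the neglected tail mass is negligible here
h_geometric = entropy(geometric(gamma, size))
h_poisson = entropy(poisson(gamma, size))
```

With these choices `h_geometric` agrees with `F(gamma)` up to rounding, while `h_poisson` is strictly smaller, in line with the maximum entropy property of the geometric distribution.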

The third section of the first chapter defines a generalized metric, that is, a metric that is allowed to take infinite values.

For $x, y \in \Omega$ the function $\rho(x, y)$ is defined as the minimal $\varepsilon \ge 0$ with the property $y_v e^{-\varepsilon} \le x_v \le y_v e^{\varepsilon}$ for all $v = 0, 1, \dots$ If no such $\varepsilon$ exists, we set $\rho(x, y) = \infty$.

It is proved that the function $\rho(x, y)$ is a generalized metric on the family of distributions on the set of non-negative integers, as well as on the entire set $\Omega^*$. Instead of $e$ in the definition of the metric $\rho(x, y)$, any other positive number different from 1 can be used; the resulting metrics differ by a multiplicative constant. Let us denote by $J(x, y)$ the information distance

$J(x, y) = \sum_{v=0}^{\infty} x_v \ln \frac{x_v}{y_v}.$

Here and below it is assumed that $0 \ln 0 = 0$ and $0 \ln \frac{0}{0} = 0$. The information distance is defined for those $x, y$ for which $x_v = 0$ for all $v$ such that $y_v = 0$. If this condition is not met, we set $J(x, y) = \infty$. Let $A \subseteq \Omega$. Then we denote

$J(A, y) = \inf_{x \in A} J(x, y).$
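The two definitions translate directly into code for vectors with finitely many non-zero components. The following is a minimal sketch with hypothetical function names; finite tuples stand in for infinite sequences whose remaining components are zero.

```python
from math import inf, log

def rho(x, y):
    """Smallest eps >= 0 with y_v e^{-eps} <= x_v <= y_v e^{eps} for all v."""
    eps = 0.0
    for xv, yv in zip(x, y):
        if xv == 0 and yv == 0:
            continue          # the constraint holds for any eps
        if xv == 0 or yv == 0:
            return inf        # no finite eps can satisfy the constraint
        eps = max(eps, abs(log(xv / yv)))
    return eps

def info_distance(x, y):
    """J(x, y) = sum x_v ln(x_v / y_v), with 0 ln 0 = 0 and 0 ln(0/0) = 0."""
    total = 0.0
    for xv, yv in zip(x, y):
        if xv == 0:
            continue
        if yv == 0:
            return inf        # x puts mass where y has none
        total += xv * log(xv / yv)
    return total

x = (0.5, 0.5)
y = (0.25, 0.75)
d_rho = rho(x, y)             # max(|ln 2|, |ln(2/3)|) = ln 2
d_info = info_distance(x, y)
```

Unlike $J$, the function `rho` is symmetric in its arguments, and it becomes infinite as soon as the two vectors have different supports.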

The fourth section of the first chapter defines compactness of functions defined on the set $\Omega^*$. Compactness of a function with a countable number of arguments means that the value of the function can be approximated, to any degree of accuracy, by the values of this function at points where only a finite number of arguments are non-zero. The compactness of the entropy and information distance functions is proved:

1. For any $0 < \gamma < \infty$ the function $H(x)$ is compact on $\Omega_\gamma$.

2. If $p \in \Omega_{\gamma_0}$ for some $0 < \gamma_0 < \infty$, then for any $0 < \gamma < \infty$, $r > 0$ the function $x \mapsto J(x, p)$ is compact on the set $\Omega_\gamma \cap O_r(p)$.

The fifth section of the first chapter discusses the properties of the information distance defined on an infinite-dimensional space. Compared with the finite-dimensional case, the situation with the continuity of the information distance function changes qualitatively. It is shown that the information distance function is not continuous on the set $\Omega$ in any of the metrics

$\rho_1(x, y) = \sum_{v=0}^{\infty} |x_v - y_v|,$

$\rho_2(x, y) = \left(\sum_{v=0}^{\infty} (x_v - y_v)^2\right)^{1/2},$

$\rho_3(x, y) = \sup_{v} |x_v - y_v|.$
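The loss of continuity can be seen on a simple example, which is an illustration constructed here rather than the example used in the dissertation. Take a geometric distribution $p$ and move a small mass $\varepsilon$ to a remote point $m \approx 1/\varepsilon$: the perturbed distribution approaches $p$ in the metric $\rho_1$, while its information distance from $p$ stays bounded away from zero.

```python
from math import log

def geometric(q, size):
    return [(1 - q) * q**v for v in range(size)]

def l1_distance(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def info_distance(x, y):
    return sum(a * log(a / b) for a, b in zip(x, y) if a > 0)

def perturbed(p, eps, m):
    """(1 - eps) * p plus an extra mass eps placed at the point m."""
    x = [(1 - eps) * pv for pv in p]
    x[m] += eps
    return x

q = 0.5
results = []
for eps in (0.1, 0.01, 0.002):
    m = round(1 / eps)
    p = geometric(q, m + 100)   # truncated; the neglected tail is negligible
    x = perturbed(p, eps, m)
    results.append((eps, l1_distance(x, p), info_distance(x, p)))
```

In `results` the second component (the $\rho_1$ distance, about $2\varepsilon$) shrinks with $\varepsilon$, while the third (the information distance) grows towards $\ln 2$ instead of vanishing. Since $\rho_1$ dominates the other two metrics up to constants on this example, the same sequence also converges in $\rho_2$ and $\rho_3$.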

The following inequalities are proved for the entropy function $H(x)$ and the information distance $J(x, p)$:

1. For any $x, x' \in \Omega$

$|H(x) - H(x')| \le (e^{\rho(x, x')} - 1)(H(x) + H(x')).$

2. If for some $x, p \in \Omega$ there exists $\varepsilon > 0$ such that $x \in O_\varepsilon(p)$, then for any $x' \in \Omega$

$|J(x, p) - J(x', p)| \le (e^{\rho(x, x')} - 1)(H(x) + H(x') + e^{\varepsilon} H(p)).$

From these inequalities, taking into account Theorem 1, it follows that the entropy and information distance functions are uniformly continuous on the corresponding subsets of $\Omega$ in the metric $\rho(x, y)$, namely:

1. For any $\gamma$ such that $0 < \gamma < \infty$, the function $H(x)$ is uniformly continuous on $\Omega_\gamma$ in the metric $\rho(x, y)$;

2. If $p \in \Omega_{\gamma_0}$ for some $\gamma_0$, $0 < \gamma_0 < \infty$, then for any $0 < \gamma < \infty$ and $\varepsilon > 0$ the function $x \mapsto J(x, p)$ is uniformly continuous on the set $\Omega_\gamma \cap O_\varepsilon(p)$ in the metric $\rho(x, y)$.

A definition of a non-extremal function is given. The non-extremality condition means that the function either has no local extrema or takes the same value at all local minima (and the same value at all local maxima). The non-extremality condition thus weakens the requirement that there be no local extrema. For example, the function $\sin x$ on the set of real numbers has local extrema but satisfies the non-extremality condition.

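Suppose that for some $\gamma > 0$ the region $A$ is given by the condition

$A = \{x \in \Omega_\gamma : \phi(x) > a\}, \qquad (0.9)$

where $\phi(x)$ is a real-valued function and $a$ is a real constant such that $\inf_{x \in \Omega_\gamma} \phi(x) < a < \sup_{x \in \Omega_\gamma} \phi(x)$.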

The question studied was under what conditions on the function $\phi$, when the parameters $n, N$ vary in the central region, $\frac{n}{N} \to \gamma$, there exist, for all sufficiently large values of $n, N$, non-negative integers $k_0, k_1, \dots, k_n$ such that $k_0 + k_1 + \dots + k_n = N$, $k_1 + 2k_2 + \dots + n k_n = n$ and

$\phi\left(\frac{k_0}{N}, \frac{k_1}{N}, \dots, \frac{k_n}{N}, 0, 0, \dots\right) > a.$

It is proved that for this it suffices to require that the function $\phi$ be non-extremal, compact and continuous in the metric $\rho(x, y)$, and also that at least one point $x$ satisfying (0.9) have, for some $\varepsilon > 0$, a finite moment of order $1 + \varepsilon$,

$\sum_{v=0}^{\infty} v^{1+\varepsilon} x_v < \infty, \qquad (0.10)$

and $x_v > 0$ for all $v = 0, 1, \dots$

In the second chapter, we study the rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of functions of $(\mu_0, \mu_1, \dots, \mu_n, 0, \dots)$, the numbers of cells with a given filling, in the central region of variation of the parameters $N, n$. Rough asymptotics of the probabilities of large deviations suffice for studying the indices of goodness-of-fit criteria.

Let the random variables $\xi_v$ in (0.2) be identically distributed,

$P\{\xi_1 = k\} = p_k, \quad k = 0, 1, \dots,$

and let $P(z)$, the generating function of the random variable $\xi_1$, converge in a disc of radius $1 < R < \infty$. Following /38/, for $0 < z < R$ we denote by $\xi(z)$ a random variable such that

$P\{\xi(z) = k\} = \frac{p_k z^k}{P(z)}, \quad k = 0, 1, \dots$

Let us denote $p_k(z) = P\{\xi(z) = k\}$ and $p(z) = (p_0(z), p_1(z), \dots)$.

If a solution $z_\gamma$ of the equation $\mathsf{M}\xi(z) = \gamma$ exists, then it is unique /38/. Throughout what follows we assume that $p_k > 0$, $k = 0, 1, \dots$
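The root $z_\gamma$ can be found numerically. The sketch below assumes the usual conjugate (power-series) form $P\{\xi(z) = k\} = p_k z^k / P(z)$ and uses plain bisection, which is applicable because the mean of $\xi(z)$ increases with $z$. The Poisson choice for $p_k$ is only an example: there the conjugate law is again Poisson, with parameter $\lambda z$, so the root is known in closed form and can serve as a check.

```python
from math import exp, lgamma, log

def conjugate(p, z):
    """Distribution of xi(z): weights p_k z^k, normalized (assumed form)."""
    weights = [pk * z**k for k, pk in enumerate(p)]
    total = sum(weights)
    return [w / total for w in weights]

def mean(x):
    return sum(k * xk for k, xk in enumerate(x))

def solve_z(p, gamma, lo=1e-6, hi=10.0, iters=200):
    """Bisection for the root of M xi(z) = gamma on the interval [lo, hi].

    hi is kept moderate so that z**k does not overflow for k < len(p).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(conjugate(p, mid)) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam = 1.5
p = [exp(-lam + k * log(lam) - lgamma(k + 1)) for k in range(200)]
z_gamma = solve_z(p, gamma=3.0)   # for Poisson p_k the exact root is gamma / lam
```

Here `z_gamma` comes out equal to $3.0 / 1.5 = 2$ up to numerical error.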

The first subsection of the first section of the second chapter contains the asymptotics of logarithms of probabilities of the form

$\ln P\{\mu_0 = k_0, \dots, \mu_n = k_n\}.$

The following theorem is proved.

Theorem 2. Rough local theorem on the probabilities of large deviations. Let $n, N \to \infty$ so that $\frac{n}{N} \to \gamma$, $0 < \gamma < \infty$, let there exist $z_\gamma$, a root of the equation $\mathsf{M}\xi(z) = \gamma$, and let the random variable $\xi(z_\gamma)$ have positive variance. Then for any $k \in \Omega(n, N)$

$\ln P\{(\mu_0, \dots, \mu_n) = k\} = -N J\left(\frac{k}{N}, p(z_\gamma)\right) + O(\sqrt{n} \ln N).$

The statement of the theorem follows directly from the formula for the joint distribution of $\mu_0, \dots, \mu_n$ in /26/ and the following estimate: if non-negative integers $\mu_1, \mu_2, \dots, \mu_n$ satisfy the condition

$\mu_1 + 2\mu_2 + \dots + n\mu_n = n,$

then the number of non-zero values among them is $O(\sqrt{n})$. This is a rough estimate and does not claim to be new. The number of non-zero $\mu_r$ in generalized placement schemes does not exceed the maximum cell filling, which in the central region does not exceed $O(\ln n)$ with probability tending to 1 /25/, /27/. Nevertheless, the estimate $O(\sqrt{n})$ holds with probability 1 and is sufficient for obtaining rough asymptotics.
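The $O(\sqrt{n})$ estimate has an elementary justification: if $s$ of the values $\mu_r$ are non-zero, the weighted sum is at least $1 + 2 + \dots + s = s(s+1)/2$, so $s(s+1)/2 \le n$ and $s < \sqrt{2n}$. The sketch below checks by brute force, for small $n$, that this bound is exact; it enumerates integer partitions of $n$, where the multiplicity of the part $r$ plays the role of $\mu_r$.

```python
from math import isqrt

def partitions(n, max_part=None):
    """All partitions of n as non-increasing tuples of positive parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest

def max_nonzero(n):
    """Largest possible number of non-zero mu_r, i.e. of distinct part sizes."""
    return max(len(set(parts)) for parts in partitions(n))

def bound(n):
    """Largest s with s(s + 1)/2 <= n."""
    return (isqrt(8 * n + 1) - 1) // 2

table = {n: (max_nonzero(n), bound(n)) for n in range(1, 26)}
```

For every $n$ in the table the brute-force maximum coincides with `bound(n)`, which never exceeds $\sqrt{2n}$.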

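In the second subsection of the first section of the second chapter, the value of the limit

$\lim_{N \to \infty} \left(-\frac{1}{N} \ln P\left\{\phi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right) > a_N\right\}\right)$

is found, where $a_N$ is a sequence of real numbers converging to some $a \in \mathbb{R}$ and $\phi(x)$ is a real-valued function. The following theorem is proved.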

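Theorem 3. Rough integral theorem on the probabilities of large deviations. Let the conditions of Theorem 2 be satisfied, and suppose that for some $r > 0$, $\varsigma > 0$ the real function $\phi(x)$ is compact and uniformly continuous in the metric $\rho$ on the set

$A = O_{r+\varsigma}(p(z_\gamma)) \cap \Omega_{[\gamma+\varsigma]}$

and satisfies the non-extremality condition on the set $\Omega_\gamma$. If for some constant $a$ such that

$\inf_{x \in \Omega_\gamma} \phi(x) < a < \sup_{x \in \Omega_\gamma} \phi(x)$

there exists a vector $p_a \in \Omega_\gamma \cap O_r(p(z_\gamma))$ such that

$\phi(p_a) > a$ and $J(\{x : \phi(x) > a,\ x \in \Omega_\gamma\}, p(z_\gamma)) = J(p_a, p(z_\gamma)),$

then for any sequence $a_N$ converging to $a$,

$\lim_{N \to \infty} \left(-\frac{1}{N} \ln P\left\{\phi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right) > a_N\right\}\right) = J(p_a, p(z_\gamma)). \qquad (0.11)$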

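With additional restrictions on the function $\phi(x)$, the information distance $J(p_a, p(z_\gamma))$ in (0.11) can be calculated more explicitly. Namely, the following theorem is true.

Theorem 4. On information distance. Suppose that for some $0 < \gamma < \infty$ and some $r > 0$, $\varsigma > 0$ the real function $\phi(x)$ and its first-order partial derivatives are compact and uniformly continuous in the generalized metric $\rho(x, y)$ on the set

$A = O_r(p) \cap \Omega_{[\gamma+\varsigma]}, \quad p \in \Omega_\gamma;$

that there exist $T > 0$, $R > 0$ such that for all $|t| < T$, $0 < z < R$, $x \in A$

$\sum_{v=0}^{\infty} p_v z^v \exp\left\{t \frac{\partial}{\partial x_v} \phi(x)\right\} < \infty, \qquad (0.12)$

$\sum_{v=0}^{\infty} p_v z^v \left|\frac{\partial}{\partial x_v} \phi(x)\right| \exp\left\{t \frac{\partial}{\partial x_v} \phi(x)\right\} < \infty,$

and, for some $\varepsilon > 0$,

$\sum_{v=0}^{\infty} p_v v^{1+\varepsilon} z^v \exp\left\{t \frac{\partial}{\partial x_v} \phi(x)\right\} < \infty; \qquad (0.13)$

that there exists a unique vector $x(z, t)$ satisfying the system of equations

$x_v(z, t) = p_v z^v \exp\left\{t \frac{\partial}{\partial x_v} \phi(x(z, t))\right\}, \quad v = 0, 1, \dots;$

that the function $\phi(x)$ satisfies the non-extremality condition on the set $A$; that $a$ is a constant with $\phi(p) < a < \sup \phi(x(z, t))$; and that there exist $z_a$, $t_a$ such that

$\sum_{v=0}^{\infty} v\, p_v(z_a, t_a) = \gamma, \quad \phi(p(z_a, t_a)) = a,$

where $p(z, t)$ is the distribution corresponding to $x(z, t)$. Then $p(z_a, t_a) \in \Omega_\gamma$ and

$J(\{x \in A,\ \phi(x) = a\}, p) = J(p(z_a, t_a), p) = \gamma \ln z_a + t_a \sum_{v=0}^{\infty} p_v(z_a, t_a) \frac{\partial}{\partial x_v} \phi(x(z_a, t_a)) - \ln \sum_{v=0}^{\infty} p_v z_a^v \exp\left\{t_a \frac{\partial}{\partial x_v} \phi(p(z_a, t_a))\right\}.$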

If the function $\phi(x)$ is linear and the function $f(x)$ is defined by equality (0.5), then condition (0.12) turns into the Cramér condition for the random variable $f(\xi(z))$. Condition (0.13) is a form of condition (0.10) and is used to prove that regions of the form $\{x \in \Omega_\gamma : \phi(x) > a\}$ contain at least one point from $\Omega(n, N)$ for all sufficiently large $n, N$.

Let $h(n, N) = (h_1, \dots, h_N)$ be the frequency vector in the generalized placement scheme (0.2). As a corollary of Theorems 3 and 4, the following theorem is formulated.

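Theorem 5. Rough integral theorem on the probabilities of large deviations of symmetric separable statistics in a generalized placement scheme.

Let $n, N \to \infty$ so that $\frac{n}{N} \to \gamma$, $0 < \gamma < \infty$; let there exist $z_\gamma$, a root of the equation $\mathsf{M}\xi(z) = \gamma$; let the random variable $\xi(z_\gamma)$ have positive variance and maximal span of the distribution equal to 1; let $a$ be a constant and $f(x)$ a real function with $a > \mathsf{M} f(\xi(z_\gamma))$; let there exist $T > 0$, $R > 0$ such that for all $|t| < T$, $0 < z < R$,

$\sum_{v=0}^{\infty} p_v z^v e^{t f(v)} < \infty,$

and let there exist $z_a$, $t_a$ such that

$\sum_{v=0}^{\infty} v\, p_v(z_a, t_a) = \gamma, \quad \sum_{v=0}^{\infty} f(v)\, p_v(z_a, t_a) = a,$

where $p_v(z, t) = p_v(z_\gamma) z^v e^{t f(v)} \big/ \sum_{u=0}^{\infty} p_u(z_\gamma) z^u e^{t f(u)}$. Then for any sequence $a_N$ converging to $a$,

$\lim_{N \to \infty} \left(-\frac{1}{N} \ln P\left\{\frac{1}{N} \sum_{v=1}^{N} f(h_v) > a_N\right\}\right) = J(p(z_a, t_a), p(z_\gamma)) = \gamma \ln z_a + t_a a - \ln \sum_{v=0}^{\infty} p_v(z_\gamma) z_a^v e^{t_a f(v)}.$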

This theorem was first proven by A.F. Ronzhin in /38/ using the saddle point method.
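The closed form for the information distance in this theorem rests on a purely algebraic identity for conjugate (tilted) distributions, which can be checked numerically. The sketch below makes several assumptions for the sake of illustration: the base distribution `q` plays the role of $p(z_\gamma)$ and is taken to be Poisson, the tilted distribution is proportional to $q_v z^v e^{t f(v)}$, and $f$ is the indicator of an empty cell, so that the statistic is $\mu_0$. The identity holds for any $z$ and $t$ once $\gamma$ and $a$ are defined as the corresponding moments of the tilted distribution.

```python
from math import exp, lgamma, log

def poisson(lam, size):
    return [exp(-lam + v * log(lam) - lgamma(v + 1)) for v in range(size)]

def tilt(q, f, z, t):
    """Tilted law proportional to q_v z^v exp(t f(v)), and its normalizer."""
    weights = [qv * z**v * exp(t * f(v)) for v, qv in enumerate(q)]
    normalizer = sum(weights)
    return [w / normalizer for w in weights], normalizer

def info_distance(x, y):
    return sum(a * log(a / b) for a, b in zip(x, y) if a > 0)

q = poisson(1.0, 60)                      # stands in for p(z_gamma)
f = lambda v: 1.0 if v == 0 else 0.0      # indicator of an empty cell
z, t = 1.3, 0.7                           # arbitrary tilting parameters
p_tilted, normalizer = tilt(q, f, z, t)

gamma = sum(v * pv for v, pv in enumerate(p_tilted))
a = sum(f(v) * pv for v, pv in enumerate(p_tilted))

lhs = info_distance(p_tilted, q)
rhs = gamma * log(z) + t * a - log(normalizer)
```

The two sides agree up to rounding, because $\ln(p_v(z,t)/q_v) = v \ln z + t f(v) - \ln(\text{normalizer})$ and averaging this over the tilted law gives exactly `rhs`.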

In the second section of the second chapter, the probabilities of large deviations of separable statistics in generalized placement schemes are studied in the case where the Cramér condition fails for the random variable $f(\xi(z))$. The Cramér condition for $f(\xi(z))$ fails, in particular, if $\xi(z)$ is a Poisson random variable and $f(x) = x^2$. Note that the Cramér condition for the separable statistics themselves in generalized placement schemes always holds, since for any fixed $n, N$ the number of possible outcomes in these schemes is finite.

As noted in /2/, if the Cramér condition is not satisfied, then finding the asymptotics of the probabilities of large deviations of sums of identically distributed random variables requires additional conditions of regular variation on the distribution of the summand. The work considers the case corresponding to the fulfilment of condition (3) in /2/, that is, the semi-exponential case. Let $P\{\xi_1 = k\} > 0$ for all $k = 0, 1, \dots$, and suppose that the function $p(k) = -\ln P\{\xi_1 = k\}$ can be extended to a function of a continuous argument that is regularly varying of order $\rho$, $0 < \rho < \infty$ /45/, that is, a positive function such that $p(tx)/p(t) \to x^{\rho}$ as $t \to \infty$.

Let the function $f(x)$, for sufficiently large values of the argument, be a positive, strictly increasing, regularly varying function. Let us define the function $\psi(x)$ by setting, for sufficiently large $x$, $\psi(x) = p(f^{-1}(x))$.

On the rest of the real axis, $\psi(x)$ can be specified in an arbitrary bounded measurable way.

Then the random variable $f(\xi_1)$ has moments of any order and does not satisfy the Cramér condition, $\psi(x) = o(x)$ as $x \to \infty$, and the following theorem is valid.

Theorem 6. Let the function $\psi(x)$ be monotonically non-decreasing for sufficiently large $x$, let the function $\psi(x)/x$ be monotonically non-increasing, and let $n, N \to \infty$ so that $\frac{n}{N} \to \lambda$, $0 < \lambda < \infty$; let $z_\lambda$ be the unique root of the equation $\mathsf{M}\xi_1(z) = \lambda$. Then for any $c > b(z_\lambda)$, where $b(z) = \mathsf{M} f(\xi_1(z))$, there is a limit CN) = -(c - b(z\))4.

It follows from Theorem 6 that if the Cramér condition fails, the limit

$\lim_{N \to \infty} \frac{1}{N} \ln P\{L_N(h(n, N)) > cN\} = 0,$

which proves the hypothesis stated in /39/. Thus the value of the index of a goodness-of-fit criterion in generalized placement schemes is always zero when the Cramér condition fails. At the same time, in the class of criteria for which the Cramér condition holds, criteria with a non-zero index value are constructed. From this we can conclude that using criteria whose statistics do not satisfy the Cramér condition, for example the chi-square test in a multinomial scheme, to construct goodness-of-fit tests for testing hypotheses under non-converging alternatives is asymptotically ineffective in the indicated sense. A similar conclusion was made in /54/ from a comparison of the chi-square and maximum likelihood ratio statistics in a multinomial scheme.

The third chapter solves the problem of constructing goodness-of-fit criteria with the largest value of the criterion index (the largest value of the lower index of the criterion) for testing hypotheses in generalized placement schemes. Based on the results of the first and second chapters on the properties of the entropy function, the information distance and the probabilities of large deviations, the third chapter finds a function of the form (0.4) such that the goodness-of-fit criterion constructed from it has the largest value of the lower index in the class of criteria under consideration. The following theorem is proved.

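Theorem 7. On the existence of an index. Let the conditions of Theorem 3 be satisfied, let $0 < \beta < 1$, let $H = H_{p(1)}, H_{p(2)}, \dots$ be a sequence of alternative distributions, let $a_\phi(\beta, N)$ be the maximum number for which, under the hypothesis $H_{p(N)}$, the inequality

$P\left\{\phi\left(\frac{\mu_0}{N}, \frac{\mu_1}{N}, \dots\right) > a_\phi(\beta, N)\right\} > \beta$

holds, and let the limit $\lim_{N \to \infty} a_\phi(\beta, N) = a$ exist. Then the index of the criterion $\phi$ exists at the point $(\beta, H)$:

$\varkappa_\phi(\beta, H) = J(\{\phi(x) > a,\ x \in \Omega_\gamma\}, p(z_\gamma)).$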

Shy)<ШН)>where w/fo fh h v^l ^

The Conclusion sets out the results obtained in relation to the general goal and the specific tasks posed in the dissertation, formulates conclusions based on the results of the research, indicates the scientific novelty and the theoretical and practical value of the work, and lists specific scientific problems identified by the author whose solution appears relevant.

Brief review of the literature on the research topic. The thesis examines the problem of constructing goodness-of-fit criteria in generalized placement schemes with the largest value of the criterion index in the class of functions of the form (0.4) under non-converging alternatives.

Generalized placement schemes were introduced by V. F. Kolchin in /24/. The quantities $\mu_r$ in the multinomial scheme were called the number of cells with $r$ particles and were studied in detail in the monograph by V. F. Kolchin, B. A. Sevastyanov and V. P. Chistyakov /27/. The quantities $\mu_r$ in generalized placement schemes were studied by V. F. Kolchin in /25/, /26/. Statistics of the form (0.3) were first considered by Yu. I. Medvedev in /30/ and were called separable (additively separable) statistics. If the functions $f_v$ in (0.3) do not depend on $v$, such statistics were called in /31/ symmetric separable statistics. The asymptotic behavior of the moments of separable statistics in generalized placement schemes was obtained by G. I. Ivchenko in /9/. Limit theorems for a generalized placement scheme were also considered in /23/. Reviews of results on limit theorems and goodness-of-fit criteria in discrete probabilistic schemes of type (0.2) were given by V. A. Ivanov, G. I. Ivchenko and Yu. I. Medvedev in /8/ and by G. I. Ivchenko, Yu. I. Medvedev and A. F. Ronzhin in /14/. Goodness-of-fit criteria for generalized placement schemes were considered by A. F. Ronzhin in /38/.

In these works, the properties of statistical criteria were compared from the point of view of relative asymptotic efficiency. Both the case of converging (contiguous) hypotheses, that is, efficiency in the sense of Pitman, and the case of non-converging hypotheses, that is, efficiency in the sense of Bahadur, Hodges-Lehmann and Chernoff, were considered. The relationship between different types of relative efficiency of statistical tests is discussed, for example, in /49/. As follows from the results of Yu. I. Medvedev in /31/ on the distribution of separable statistics in a multinomial scheme, the criterion based on the chi-square statistic has the greatest asymptotic power under converging hypotheses in the class of separable statistics of the outcome frequencies in a multinomial scheme. This result was generalized by A. F. Ronzhin to schemes of type (0.2) in /38/. I. I. Viktorova and V. P. Chistyakov in /4/ constructed an optimal criterion for a multinomial scheme in the class of linear functions of $\mu_r$. A. F. Ronzhin in /38/ constructed a criterion that, given a sequence of alternatives that do not approach the null hypothesis, minimizes the logarithmic rate at which the probability of an error of the first kind tends to zero, in the class of statistics of the form (0.6). A comparison of the relative efficiency of the chi-square and maximum likelihood ratio statistics under converging and non-converging hypotheses was carried out in /54/.

The thesis considers the case of non-converging hypotheses. Studying the relative statistical efficiency of criteria under non-converging hypotheses requires studying the probabilities of extremely large deviations, of order $O(\sqrt{n})$. Such a problem was first solved for a multinomial distribution with a fixed number of outcomes by I. N. Sanov in /40/. The asymptotic optimality of goodness-of-fit tests for simple and composite hypotheses for a multinomial distribution with a finite number of outcomes and non-converging alternatives was considered in /48/. The properties of information distance were previously considered by Kullback and Leibler /29/, /53/ and I. N. Sanov /40/, as well as Hoeffding /48/. In these works, the continuity of information distance was considered on finite-dimensional spaces in the Euclidean metric. A number of authors considered sequences of spaces of increasing dimension, for example, Yu. V. Prokhorov /37/ or V. I. Bogachev and A. V. Kolesnikov /1/. Rough (up to logarithmic equivalence) theorems on the probabilities of large deviations of separable statistics in generalized allocation schemes under the Cramér condition were obtained by A. F. Ronzhin in /38/. A. N. Timashev in /42/, /43/ obtained exact (up to equivalence) multidimensional integral and local limit theorems on the probabilities of large deviations of the vector $(\mu_{r_1}(n, N), \ldots, \mu_{r_s}(n, N))$, where $s, r_1, \ldots, r_s$ are fixed integers,

$0 \le r_1 < \ldots < r_s$.

The study of the probabilities of large deviations when the Cramer condition is not met for the case of independent random variables was carried out in the works of A. V. Nagaev /35/. The method of conjugate distributions is described by Feller /45/.

Statistical problems of hypothesis testing and parameter estimation in a selection scheme without return were considered, in a slightly different formulation, by G. I. Ivchenko, V. V. Levin, and E. E. Timonina in /10/, /15/, where estimation problems were solved for a finite population whose number of elements is unknown, and the asymptotic normality of multivariate S-statistics from s independent samples in a selection scheme without return was proved. Random variables associated with repetitions in sequences of independent trials were studied by A. M. Zubkov, V. G. Mikhailov, and A. M. Shoitov in /6/, /7/, /32/, /33/, /34/. The main statistical problems of estimation and hypothesis testing within the general Markov-Pólya model were analyzed by G. I. Ivchenko and Yu. I. Medvedev in /13/; a probabilistic analysis of this model was given in /11/. A method for specifying non-uniform probability measures on a set of combinatorial objects, which is not reducible to the generalized allocation scheme (0.2), was described by G. I. Ivchenko and Yu. I. Medvedev in /12/. A number of problems in probability theory in which the answer can be obtained by calculations using recurrence formulas were indicated by A. M. Zubkov in /5/.

Inequalities for the entropy of discrete distributions were obtained in /50/ (cited from the abstract by A. M. Zubkov in RZhMat). If $(p_n)_{n=0}^{\infty}$ is a probability distribution,

$$P_n = \sum_{k=n}^{\infty} p_k, \qquad A = \sup_{n \ge 0} \frac{P_{n+1}}{p_n} < \infty \qquad (0.14)$$

and

$$F(x) = (x + 1)\ln(x + 1) - x \ln x,$$

then for the entropy $H$ of this probability distribution,

$$H = -\sum_{k=0}^{\infty} p_k \ln p_k,$$

the inequalities

$$H + (\ln \ldots) \sum_{n=0}^{\infty} (A p_n - P_{n+1}) \le F(A) \le H + \sum_{n=0}^{\infty} (A p_n - P_{n+1})(\ln \ldots)$$

are valid, and the inequalities turn into equalities if

$$p_n = \frac{A^n}{(A + 1)^{n+1}}, \qquad n \ge 0. \qquad (0.15)$$

Note that the extremal distribution (0.15) is a geometric distribution with mathematical expectation A, and the function F(A) of the parameter (0.14) coincides with the function of the mathematical expectation in Theorem 1.
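The remark about the extremal distribution can be checked numerically. The sketch below assumes the usual form of the geometric distribution on 0, 1, 2, ... with mean A, namely p_n = A^n / (A + 1)^(n + 1), and compares its truncated entropy with F(A). The function names, the value A = 2.5, and the truncation length are illustrative choices, not taken from the dissertation.

```python
import math


def F(x: float) -> float:
    """F(x) = (x + 1) ln(x + 1) - x ln x, as defined in the text."""
    return (x + 1) * math.log(x + 1) - x * math.log(x)


def geometric_pmf(A: float, n: int) -> float:
    """p_n = A^n / (A + 1)^(n + 1): geometric distribution on 0, 1, 2, ... with mean A."""
    return (A / (A + 1)) ** n / (A + 1)


def entropy_truncated(A: float, terms: int = 2000) -> float:
    """Entropy -sum p_n ln p_n over the first `terms` outcomes (the tail is negligible)."""
    total = 0.0
    for n in range(terms):
        p = geometric_pmf(A, n)
        if p > 0.0:  # guard against underflow to zero for very large n
            total -= p * math.log(p)
    return total


A = 2.5  # illustrative value of the mathematical expectation
mean = sum(n * geometric_pmf(A, n) for n in range(2000))

print("mean    :", mean)
print("entropy :", entropy_truncated(A))
print("F(A)    :", F(A))
```

Up to floating-point error, the truncated mean equals A and the truncated entropy equals F(A), in line with the remark above.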

Conclusion of the dissertation on the topic “Probability Theory and Mathematical Statistics”, Kolodzei, Alexander Vladimirovich

3.4. Conclusions

In this chapter, based on the results of the previous chapters, a goodness-of-fit criterion was constructed for testing hypotheses in generalized allocation schemes that has the highest logarithmic rate of convergence to zero of the probability of an error of the first kind, given a fixed probability of an error of the second kind and non-converging alternatives.

Conclusion

The purpose of the dissertation work was to construct goodness-of-fit criteria for testing hypotheses in a selection scheme without return from an urn containing balls of two colors. The author chose to study statistics based on the frequencies of distances between balls of the same color. In this formulation, the problem reduces to testing hypotheses in a suitable generalized allocation scheme.

The following was done in the dissertation work:

the properties of entropy and information distance of discrete distributions with an unlimited number of outcomes and bounded mathematical expectation were studied;

a rough (up to logarithmic equivalence) asymptotics of the probabilities of large deviations of a wide class of statistics in a generalized allocation scheme was obtained;

based on the results obtained, a criterion function was constructed with the highest logarithmic rate of convergence to zero of the probability of an error of the first kind, given a fixed probability of an error of the second kind and non-converging alternatives;

it was proven that statistics that do not satisfy the Cramér condition have a lower rate of convergence to zero of the probabilities of large deviations than statistics that satisfy this condition.

The scientific novelty of the work is as follows.

The concept of a generalized metric is introduced: a function that admits infinite values and satisfies the axioms of identity, symmetry, and the triangle inequality. A generalized metric is found, and sets are indicated on which the entropy and information distance functions, defined on a family of discrete distributions with a countable number of outcomes, are continuous in this metric;

In a generalized placement scheme, a rough (up to logarithmic equivalence) asymptotics was found for the probabilities of large deviations of statistics of the form (0.4) satisfying the corresponding form of Cramer’s condition;

In a generalized placement scheme, a rough (up to logarithmic equivalence) asymptotics is found for the probabilities of large deviations of symmetric separable statistics that do not satisfy the Cramer condition;

In the class of criteria of the form (0.7), a criterion with the highest value of the criterion index is constructed.

The work solves a number of questions about the behavior of the probabilities of large deviations in generalized placement schemes. The results obtained can be used in the educational process in the specialties of mathematical statistics and information theory, in the study of statistical procedures for the analysis of discrete sequences, and were used in /3/, /21/ to justify the security of one class of information systems.

However, a number of questions remain open. The author limited himself to the central zone of variation of the parameters n, N of generalized schemes for allocating n particles to N cells. If the support of the distribution of the random variables generating the generalized allocation scheme (0.2) is not a set of the form r, r + 1, r + 2, ..., then proving the continuity of the information distance function and studying the probabilities of large deviations requires taking into account the arithmetic structure of the support, which was not considered in the author's work. For the practical application of criteria built on the proposed function with the maximum index value, its distribution must be studied both under the null hypothesis and under alternatives, including converging ones. It is also of interest to transfer the developed methods and generalize the results to probabilistic schemes other than generalized allocation schemes.

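If $\mu_1, \ldots, \mu_n$ are the frequencies of distances between the positions of outcome 0 in a binomial scheme with outcome probabilities $p_0$, $1 - p_0$, then it can be shown that in this case

$$P\{\mu_1 = k_1, \ldots, \mu_n = k_n\} = I\Big\{\sum_{i=1}^{n} i k_i = n\Big\} \, \frac{(k_1 + \ldots + k_n)!}{k_1! \cdots k_n!} \ldots, \qquad (3.3)$$

where

$$\theta_\nu = p_0^{-1}(1 - p_0)^{\nu}, \qquad \nu = \ldots$$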

From the analysis of the formula for the joint distribution of the quantities $\mu_r$ in a generalized allocation scheme, proven in /26/, it follows that distribution (3.3), generally speaking, cannot be represented as a joint distribution of the quantities $\mu_r$ in any generalized scheme of allocating particles to cells. This distribution is a special case of the distributions on the set of combinatorial objects introduced in /12/. Transferring the results of the dissertation for generalized allocation schemes to this case, which was discussed in /52/, appears to be a pressing task.

If the number of outcomes in a selection scheme without return or in a multinomial scheme is greater than two, then the joint distribution of the frequencies of distances between adjacent identical outcomes can no longer be represented in such a simple way. So far it has only been possible to calculate the mathematical expectation and variance of the number of such distances /51/.

List of references for dissertation research Candidate of Physical and Mathematical Sciences Kolodzei, Alexander Vladimirovich, 2006

1. Bogachev V.I., Kolesnikov A.V. Nonlinear transformations of convex measures and entropy of Radon-Nikodym densities // Reports of the Academy of Sciences. - 2004. - T. 207. - 2. - P. 155 - 159.

2. Vidyakin V.V., Kolodzei A.V. Statistical detection of covert channels in data transmission networks // Proc. report II International conf. "Information systems and technologies IST" 2004" (Minsk, October 8-10, 2004) Minsk: BSU, 2004. - Part 1. - pp. 116 - 117.

3. Viktorova I. I., Chistyakov V. P. Some generalizations of the empty box criterion // Theory Probab. and its applications. - 1966. - T. XI. - 2. P. 306-313.

4. Zubkov A. M. Recurrent formulas for calculating functionals of discrete random variables // Review of Appl. and industrial math. 1996. - T. 3. - 4. - P. 567 - 573.

5. Zubkov A. M., Mikhailov V. G. Limit distributions of random variables associated with long repetitions in a sequence of independent trials // Theory Probab. and its applications. - 1974. - T. XIX. 1. - pp. 173 - 181.

6. Zubkov A. M., Mikhailov V. G. On repetitions of s - chains in a sequence of independent quantities // Theory Probab. and its application. - 1979. T. XXIV. - 2. - P. 267 - 273.

7. Ivanov V. A., Ivchenko G. I., Medvedev Yu. I. Discrete problems in probability theory // Results of Science and Technology. Ser. theory of probability, mathematics. stat., theor. cybern. T. 23. - M.: VINITI, 1984. P. 3 -60.

8. Ivchenko G. I. On moments of separable statistics in a generalized allocation scheme // Mat. notes. 1986. - T. 39. - 2. - P. 284 - 293.

9. Ivchenko G. I., Levin V. V. Asymptotic normality in a selection scheme without return // Theory Probab. and it is applied. - 1978.- T. XXIII. 1. - pp. 97 - 108.

10. Ivchenko G.I., Medvedev Yu.I. On the Markov-Polya urn scheme: from 1917 to the present day // Review applied. and industrial math. - 1996.- T. 3. 4. - P. 484-511.

11. Ivchenko G.I., Medvedev Yu.I. Random combinatorial objects // Reports of the Academy of Sciences. 2004. - T. 396. - 2. - P. 151 - 154.

12. Ivchenko G. I., Medvedev Yu. I. Statistical problems associated with the organization of control over the processes of generating discrete random sequences // Diskretn. math. - 2000. - T. 12. - 2. S. 3 - 24.

13. Ivchenko G. I., Medvedev Yu. I., Ronzhin A. F. Separable statistics and goodness-of-fit criteria for polynomial samples // Proceedings of Mathematics. Institute of the USSR Academy of Sciences. 1986. - T. 177. - P. 60 - 74.

14. Ivchenko G. I., Timonina E. E. On estimation when choosing from a finite population // Mat. notes. - 1980. - T. 28. - 4. - P. 623 - 633.

15. Kolodzei A. V. Theorem on the probabilities of large deviations for separable statistics that do not satisfy the Cramer condition // Diskretn. math. 2005. - T. 17. - 2. - P. 87 - 94.

16. Kolodzei A. V. Entropy of discrete distributions and the probability of large deviations of functions from filling cells in generalized layouts // Review of Appl. and industrial math. - 2005. - T. 12. 2. - P. 248 - 252.

17. Kolodzey A. V. Statistical criteria for identifying hidden channels based on changing the order of messages // Research work "Apology": Report / FSTEC of the Russian Federation, Head A. V. Knyazev. Inv. 7 chipboards - M., 2004. - P. 96 - 128.

18. Kolodzei A.V., Ronzhin A.F. About some statistics related to checking the homogeneity of random discrete sequences // Research work "Development of mathematical problems of cryptography" N 4 2004.: Report / AK RF, - M., 2004 .

19. Kolchin A. V. Limit theorems for a generalized layout scheme // Diskretn. math. 2003. - T. 15. - 4. - P. 148 - 157.

20. Kolchin V.F. One class of limit theorems for conditional distributions // Lit. math. Sat. - 1968. - T. 8. - 1. - P. 111 - 126.

21. Kolchin V. F. Random graphs. 2nd ed. - M.: FIZMATLIT, 2004. - 256 p.

22. Kolchin V. F. Random mappings. - M.: Nauka, 1984. - 208 p.

23. Kolchin V.F., Sevastyanov B.A., Chistyakov V.P. Random placements. M.: Nauka, 1976. - 223 p.

24. Cramér H. // Uspekhi Mat. Nauk. - 1944. - Issue 10. - pp. 166 - 178.

25. Kullback S. Information theory and statistics. - M.: Nauka, 1967. - 408 p.

26. Medvedev Yu. I. Some theorems on the asymptotic distribution of the chi-square statistic // Dokl. Academy of Sciences of the USSR. - 1970. - T. 192. 5. - P. 997 - 989.

27. Medvedev Yu. I. Separable statistics in a polynomial scheme I; II. // Theory Prob. and its use. - 1977. - T. 22. - 1. - P. 3 - 17; 1977. T. 22. - 3. - P. 623 - 631.

28. Mikhailov V. G. Limit distributions of random variables associated with multiple long repetitions in a sequence of independent tests // Theory Probab. and its applications. - 1974. T. 19. - 1. - P. 182 - 187.

29. Mikhailov V. G. Central limit theorem for the number of incomplete long repetitions // Theory Probab. and its applications. - 1975. - T. 20. 4. - P. 880 - 884.

30. Mikhailov V. G., Shoitov A. M. Structural equivalence of s - chains in random discrete sequences // Discrete. math. 2003. - T. 15, - 4. - P. 7 - 34.

31. Nagaev A.V. Integral limit theorems taking into account probabilities of large deviations. I. // Theory Probab. and it is applied. -1969. T. 14. 1. - pp. 51 - 63.

32. Petrov V. V. Sums of independent random variables. - M.: Nauka, 1972. 416 p.

33. Prokhorov Yu. V. Limit theorems for sums of random vectors whose dimension tends to infinity // Theory Probab. and its applications. 1990. - T. 35. - 4. - P. 751 - 753.

34. Ronzhin A.F. Criteria for generalized particle placement schemes // Theory Probab. and its applications. - 1988. - T. 33. - 1. - P. 94 - 104.

35. Ronzhin A.F. Theorem on the probabilities of large deviations for separable statistics and its statistical application // Mat. notes. 1984. - T. 36. - 4. - P. 610 - 615.

36. Sanov I. N. On the probabilities of large deviations of random variables // Mat. Sb. 1957. - T. 42. - 1 (84). - P. 11 - 44.

37. Seneta E. Regularly varying functions. M.: Nauka, 1985. - 144 p.

38. Timashev A. N. Multidimensional integral theorem on large deviations in an equiprobable placement scheme // Diskret, Mat. - 1992. T. 4. - 4. - P. 74 - 81.

39. Timashev A. N. Multidimensional local theorem on large deviations in an equiprobable placement scheme // Diskretn. math. - 1990. T. 2. - 2. - P. 143 - 149.

40. Fedoryuk M. V. The saddle-point method. M.: Nauka, 1977. 368 p.

41. Feller V. Introduction to probability theory and its applications. T. 2. - M.: Mir, 1984. 738 p.

42. Shannon C. Mathematical theory of communication // Works on information theory and cybernetics: Trans. from English / M., IL, 1963, p. 243 - 332.

43. Conrad K. Probability Distribution and Maximum Entropy // http://www.math.uconn.edu/~kconrad/blurbs/entropypost.pdf

44. Hoeffding W. Asymptotically optimal tests for multinomial distribution // Ann. Math. Statist. 1965. - T. 36. - pp. 369 - 408.

45. Inglot T., Kallenberg W. C. M., Ledwina T. Vanishing shortcoming and asymptotic relative efficiency // Ann. Statist. - 2000. - T. 28. - P. 215 - 238.

46. Jardas C., Pecaric J., Roki R., Sarapa N. On an inequality for the entropy of a probability distribution // Math. Inequal. and Appl. - 2001. T. 4. - 2. - P. 209 - 214. (RZhMat. - 2005. - 05.07-13B.16).

47. Kolodzey A. V., Ronzhin A. F., Goodness of Fit Tests for Random Combinatoric Objects // Proc. report intl. conf. Modern Problems and new Trends in Probability Theory, (Chernivtsi, June 19 - 26, 2005) - Kyiv: Institute of Mathematics, 2005. Part 1. P. 122.

48. Kullback S. and Leibler R. A. On information and sufficiency // Ann. Math. Statist. 1951. - T. 22. - pp. 79 - 86.

49. Quine M.P., Robinson J. Efficiency of chi-square and likelihood ratio goodness of fit tests // Ann. Statist. 1985. - T. 13. - 2. - P. 727 -742.

Please note that the above scientific texts are posted for information purposes and were obtained through recognition of the original dissertation text (OCR). They may therefore contain errors associated with imperfect recognition algorithms. There are no such errors in the PDF files of dissertations and abstracts that we deliver.

Definition. The direction determined by a non-zero vector is called an asymptotic direction relative to a second-order line if any straight line of this direction (that is, parallel to the vector) either has at most one common point with the line or is contained in it.

Question: how many common points can a second-order line have with a straight line of asymptotic direction relative to it?

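In the general theory of second-order lines it is proven that if the line $\gamma$ is given by the equation

$$a_{11}x^2 + 2a_{12}xy + a_{22}y^2 + 2a_{13}x + 2a_{23}y + a_{33} = 0,$$

then the non-zero vector $(\alpha, \beta)$ specifies an asymptotic direction relative to $\gamma$ if and only if

$$a_{11}\alpha^2 + 2a_{12}\alpha\beta + a_{22}\beta^2 = 0$$

(general criterion for asymptotic direction).

For second-order lines, with $\delta = a_{11}a_{22} - a_{12}^2$:

if $\delta > 0$, then there are no asymptotic directions,

if $\delta < 0$, then there are two asymptotic directions,

if $\delta = 0$, then there is only one asymptotic direction.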
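The classification above can be sketched in code. The example assumes the standard notation in which the quadratic part of the line's equation is a11 x^2 + 2 a12 xy + a22 y^2, a vector (alpha, beta) is asymptotic when a11 alpha^2 + 2 a12 alpha beta + a22 beta^2 = 0, and delta = a11 a22 - a12^2. The function names are illustrative, and exact comparisons with zero are used, so integer or exactly representable coefficients are assumed.

```python
import math


def quadratic_form(a11, a12, a22, v):
    """Value of a11*alpha^2 + 2*a12*alpha*beta + a22*beta^2 at v = (alpha, beta)."""
    alpha, beta = v
    return a11 * alpha**2 + 2 * a12 * alpha * beta + a22 * beta**2


def asymptotic_directions(a11, a12, a22):
    """Return the asymptotic directions (each up to scaling) as a list of vectors."""
    if a11 == 0 and a12 == 0 and a22 == 0:
        raise ValueError("all quadratic coefficients are zero: not a second-order line")
    delta = a11 * a22 - a12**2
    if delta > 0:
        return []  # elliptic type: no asymptotic directions
    s = math.sqrt(-delta)
    if a22 != 0:
        # Roots of the form with alpha fixed to a22: beta = -a12 +/- sqrt(-delta).
        candidates = [(a22, -a12 + s), (a22, -a12 - s)]
    else:
        # With a22 = 0 the form factors as alpha * (a11*alpha + 2*a12*beta).
        candidates = [(0, 1), (2 * a12, -a11)]
    if delta == 0:
        return candidates[:1]  # parabolic type: the two candidates are collinear
    return candidates  # hyperbolic type: two distinct directions


print(asymptotic_directions(1, 0, 1))    # circle x^2 + y^2 = 1
print(asymptotic_directions(1, 0, -1))   # hyperbola x^2 - y^2 = 1
print(asymptotic_directions(0, 0, 1))    # parabola y^2 = 2x
```

Each returned vector can be substituted back into `quadratic_form` to confirm that it satisfies the criterion.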

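The following lemma turns out to be useful (criterion for the asymptotic direction of a line of parabolic type).

Lemma. Let $\gamma$, given by the equation $a_{11}x^2 + 2a_{12}xy + a_{22}y^2 + 2a_{13}x + 2a_{23}y + a_{33} = 0$, be a line of parabolic type.

The non-zero vector $(\alpha, \beta)$ has an asymptotic direction relative to $\gamma$ if and only if

$$a_{11}\alpha + a_{12}\beta = 0, \qquad a_{12}\alpha + a_{22}\beta = 0. \qquad (5)$$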

(Problem: Prove the lemma.)

Definition. A straight line of asymptotic direction is called an asymptote of a second-order line if it either does not intersect the line or is contained in it.

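Theorem. If the vector $(\alpha, \beta)$ has an asymptotic direction relative to the line $a_{11}x^2 + 2a_{12}xy + a_{22}y^2 + 2a_{13}x + 2a_{23}y + a_{33} = 0$, then the asymptote parallel to this vector is determined by the equation

$$(a_{11}\alpha + a_{12}\beta)x + (a_{12}\alpha + a_{22}\beta)y + (a_{13}\alpha + a_{23}\beta) = 0.$$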
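A minimal sketch of how the theorem can be applied, assuming the standard form of the asymptote equation, (a11 alpha + a12 beta) x + (a12 alpha + a22 beta) y + (a13 alpha + a23 beta) = 0, for a line written as a11 x^2 + 2 a12 xy + a22 y^2 + 2 a13 x + 2 a23 y + a33 = 0. The function name and the two test hyperbolas are illustrative.

```python
from fractions import Fraction


def asymptote(coeffs, direction):
    """Return (A, B, C) of the asymptote A*x + B*y + C = 0 parallel to `direction`.

    coeffs = (a11, a12, a22, a13, a23, a33) for the line
    a11 x^2 + 2 a12 xy + a22 y^2 + 2 a13 x + 2 a23 y + a33 = 0.
    """
    a11, a12, a22, a13, a23, a33 = (Fraction(c) for c in coeffs)
    alpha, beta = (Fraction(d) for d in direction)
    if a11 * alpha**2 + 2 * a12 * alpha * beta + a22 * beta**2 != 0:
        raise ValueError("the vector does not have an asymptotic direction")
    A = a11 * alpha + a12 * beta
    B = a12 * alpha + a22 * beta
    C = a13 * alpha + a23 * beta
    if A == 0 and B == 0:
        # Happens for lines of parabolic type: the equation degenerates.
        raise ValueError("the equation degenerates: no asymptote of this direction")
    return A, B, C


# Hyperbola x^2 - y^2 - 1 = 0 with asymptotic direction (1, 1).
print(asymptote((1, 0, -1, 0, 0, -1), (1, 1)))
# Hyperbola xy - 1 = 0, i.e. a12 = 1/2, with asymptotic direction (1, 0).
print(asymptote((0, Fraction(1, 2), 0, 0, 0, -1), (1, 0)))
```

For the first hyperbola the result is the line x - y = 0, and for the second the line y = 0 (up to a constant factor), which are the familiar asymptotes.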

Let's fill out the table.

TASKS.

1. Find the vectors of asymptotic directions for the following second order lines:

4 - hyperbolic type two asymptotic directions.

Let's use the asymptotic direction criterion:

Has an asymptotic direction relative to this line 4.

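If one coordinate of the vector were zero, then the other would be zero as well, that is, the vector would be zero. So we may divide by the square of that coordinate and obtain a quadratic equation in t, where t is the ratio of the coordinates. We solve this quadratic equation and find two solutions: t = 4 and t = 1. These give the two asymptotic directions of the line.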

(Two methods can be considered, since the line is of a parabolic type.)

2. Find out whether the coordinate axes have asymptotic directions relative to the second-order lines:

3. Write the general equation of the second order line for which

a) the x-axis has an asymptotic direction;

b) Both coordinate axes have asymptotic directions;

c) the coordinate axes have asymptotic directions and O is the center of the line.

4. Write the equations of the asymptotes for the lines:

5. Prove that if a second-order line has two non-parallel asymptotes, then their intersection point is the center of this line.

Note: Since there are two non-parallel asymptotes, there are two asymptotic directions, so $\delta = a_{11}a_{22} - a_{12}^2 < 0$, and therefore the line is central.

Write the equations of the asymptotes in general form and the system for finding the center. The rest is obvious.

6.(No. 920) Write the equation of a hyperbola passing through point A(0, -5) and having asymptotes x – 1 = 0 and 2x – y + 1 = 0.

Note. Use the statement from the previous problem.
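One way to check an answer to problem 6 is sketched below. It does not follow the hint literally; instead it relies on a standard fact that is an assumption here: a hyperbola with asymptotes l1 = 0 and l2 = 0 can be written as l1 * l2 = k with a non-zero constant k, which is then fixed by the given point. The function names are illustrative, and the returned coefficients are those of the expanded equation c_xx x^2 + c_xy xy + c_yy y^2 + c_x x + c_y y + c_0 = 0 (no factors of 2).

```python
from fractions import Fraction


def line_value(line, point):
    """Value of p*x + q*y + r at the given point."""
    p, q, r = line
    x, y = point
    return p * x + q * y + r


def hyperbola_from_asymptotes(l1, l2, point):
    """Coefficients (c_xx, c_xy, c_yy, c_x, c_y, c_0) of l1*l2 - k = 0 through `point`."""
    p1, q1, r1 = (Fraction(c) for c in l1)
    p2, q2, r2 = (Fraction(c) for c in l2)
    k = line_value((p1, q1, r1), point) * line_value((p2, q2, r2), point)
    if k == 0:
        raise ValueError("the point lies on an asymptote")
    return (
        p1 * p2,            # x^2
        p1 * q2 + q1 * p2,  # xy
        q1 * q2,            # y^2
        p1 * r2 + r1 * p2,  # x
        q1 * r2 + r1 * q2,  # y
        r1 * r2 - k,        # constant term
    )


def conic_value(c, point):
    """Value of the expanded second-order polynomial at the given point."""
    x, y = point
    return c[0] * x * x + c[1] * x * y + c[2] * y * y + c[3] * x + c[4] * y + c[5]


# Data of problem 6: asymptotes x - 1 = 0 and 2x - y + 1 = 0, point A(0, -5).
result = hyperbola_from_asymptotes((1, 0, -1), (2, -1, 1), (0, -5))
print(result)
print(conic_value(result, (0, -5)))
```

With these data the product of the two linear forms at A equals -6, so the sketch yields the equation 2x^2 - xy - x + y + 5 = 0.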

Homework . , No. 915 (c, e, f), No. 916 (c, d, e), No. 920 (if you didn’t have time);

Cribs;

Silaev, Timoshenko. Practical tasks in geometry,

1st semester. P.67, questions 1-8, p.70, questions 1-3 (oral).

DIAMETERS OF SECOND ORDER LINES.

CONJUGATE DIAMETERS.

An affine coordinate system is given.

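Definition. The diameter of a second-order line $\gamma$ conjugate to a vector $(\alpha, \beta)$ of non-asymptotic direction with respect to $\gamma$ is the set of midpoints of all chords of the line parallel to this vector.

During the lecture it was proven that the diameter is a straight line, and its equation was obtained: for the line $a_{11}x^2 + 2a_{12}xy + a_{22}y^2 + 2a_{13}x + 2a_{23}y + a_{33} = 0$,

$$\alpha(a_{11}x + a_{12}y + a_{13}) + \beta(a_{12}x + a_{22}y + a_{23}) = 0, \qquad (7)$$

or, equivalently,

$$(a_{11}\alpha + a_{12}\beta)x + (a_{12}\alpha + a_{22}\beta)y + (a_{13}\alpha + a_{23}\beta) = 0. \qquad (7')$$

Recommendations: Show (on an ellipse) how the diameter is constructed: set a non-asymptotic direction; draw [two] straight lines of this direction intersecting the line; find the midpoints of the chords cut off; draw a straight line through the midpoints. This is the diameter.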
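The construction on an ellipse can also be checked numerically. The sketch below assumes the standard diameter equation (a11 alpha + a12 beta) x + (a12 alpha + a22 beta) y + (a13 alpha + a23 beta) = 0 for the vector (alpha, beta); the ellipse x^2 + 4y^2 - 4 = 0, the direction (1, 1), and the function names are illustrative choices.

```python
from fractions import Fraction


def diameter(coeffs, direction):
    """Coefficients (A, B, C) of the diameter conjugate to `direction`."""
    a11, a12, a22, a13, a23, a33 = coeffs
    alpha, beta = direction
    return (a11 * alpha + a12 * beta, a12 * alpha + a22 * beta, a13 * alpha + a23 * beta)


def chord_midpoint(coeffs, point, direction):
    """Midpoint of the chord cut from the line by {point + t*direction}, or None."""
    a11, a12, a22, a13, a23, a33 = coeffs
    x0, y0 = point
    alpha, beta = direction
    # Substituting x = x0 + t*alpha, y = y0 + t*beta gives qa*t^2 + qb*t + qc = 0.
    qa = a11 * alpha**2 + 2 * a12 * alpha * beta + a22 * beta**2
    qb = 2 * ((a11 * x0 + a12 * y0 + a13) * alpha + (a12 * x0 + a22 * y0 + a23) * beta)
    qc = a11 * x0**2 + 2 * a12 * x0 * y0 + a22 * y0**2 + 2 * a13 * x0 + 2 * a23 * y0 + a33
    if qa == 0 or qb**2 - 4 * qa * qc <= 0:
        return None  # asymptotic direction, or no chord on this straight line
    t_mid = Fraction(-qb) / (2 * qa)  # mean of the two roots
    return (x0 + t_mid * alpha, y0 + t_mid * beta)


ellipse = (1, 0, 4, 0, 0, -4)   # x^2 + 4y^2 - 4 = 0
direction = (1, 1)
A, B, C = diameter(ellipse, direction)

# Parallel straight lines through (0, c) with the chosen direction.
midpoints = [chord_midpoint(ellipse, (0, c), direction) for c in (-2, -1, 0, 1, 2)]
for m in midpoints:
    print(m, A * m[0] + B * m[1] + C)
```

Every printed residual is zero, so all the midpoints lie on the single straight line x + 4y = 0.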

Discuss:

1. Why is a vector of non-asymptotic direction taken in the definition of the diameter? If the students cannot answer, ask them to construct the diameter, for example, for a parabola.

2. Does any second-order line have at least one diameter? Why?

3. During the lecture it was proven that diameter is a straight line. The midpoint of which chord is point M in the figure?


4. Look at the parentheses in equation (7). What do they remind you of?

Conclusion: 1) each center belongs to each diameter;

2) if there is a line of centers, then there is a single diameter.

5. What direction do the diameters of a parabolic line have? (Asymptotic)

Proof (probably in lecture).

Let the diameter d, given by equation (7'), be conjugate to a vector $(\alpha, \beta)$ of non-asymptotic direction. Then its direction vector is

$(-(a_{12}\alpha + a_{22}\beta),\ a_{11}\alpha + a_{12}\beta)$. Let us show that this vector has an asymptotic direction. We use the criterion for a vector of asymptotic direction for a line of parabolic type (see (5)). Substitute and verify (do not forget that $\delta = a_{11}a_{22} - a_{12}^2 = 0$).
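The substitution can be written out explicitly. The notation below is an assumption: $a_{ij}$ are the coefficients of the general equation of the line, $\delta = a_{11}a_{22} - a_{12}^2$, and $(p, q)$ denotes the direction vector of the diameter conjugate to $(\alpha, \beta)$.

```latex
\begin{aligned}
p &= -(a_{12}\alpha + a_{22}\beta), \qquad q = a_{11}\alpha + a_{12}\beta, \\
a_{11}p + a_{12}q
  &= -a_{11}a_{12}\alpha - a_{11}a_{22}\beta + a_{11}a_{12}\alpha + a_{12}^{2}\beta
   = -\delta\,\beta = 0, \\
a_{12}p + a_{22}q
  &= -a_{12}^{2}\alpha - a_{12}a_{22}\beta + a_{11}a_{22}\alpha + a_{12}a_{22}\beta
   = \delta\,\alpha = 0.
\end{aligned}
```

Both expressions vanish because $\delta = 0$ for a line of parabolic type. The vector $(p, q)$ is non-zero, since otherwise $(\alpha, \beta)$ itself would satisfy the parabolic-type criterion and would have an asymptotic direction.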

6. How many diameters does a parabola have? Their relative position? How many diameters do the remaining parabolic lines have? Why?

7. How can the common diameter of some pairs of second-order lines be constructed (see questions 30, 31 below)?

8. Fill out the table and be sure to make drawings.

1. . Write an equation for the set of midpoints of all chords parallel to the vector

2. Write the equation for the diameter d passing through the point K(1,-2) for the line.

Solution steps:

1st method.

1. Determine the type (to know how the diameters of this line behave).

In this case, the line is central, then all diameters pass through center C.

2. We compose the equation of a straight line passing through two points K and C. This is the desired diameter.

2nd method.

1. We write the equation for diameter d in the form (7`).

2. Substituting the coordinates of point K into this equation, we find the relationship between the coordinates of the vector conjugate to the diameter d.

3. We set this vector, taking into account the found dependence, and compose an equation for diameter d.

In this problem, it is easier to calculate using the second method.
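Both methods can be compared on a concrete line. The equation of the line in this problem is not preserved in the text, so the sketch uses a hypothetical central line, x^2 - 2xy + 3y^2 - 4x + 6y - 1 = 0, together with the point K(1, -2) from the problem statement. The function names and the standard formulas for the center and for the diameter are assumptions of this sketch.

```python
from fractions import Fraction


def center(coeffs):
    """Center of a central line: a11 x + a12 y + a13 = 0, a12 x + a22 y + a23 = 0."""
    a11, a12, a22, a13, a23, a33 = coeffs
    delta = a11 * a22 - a12**2
    if delta == 0:
        raise ValueError("the line is not central")
    # Cramer's rule for the 2x2 system above.
    return (Fraction(a12 * a23 - a22 * a13, delta), Fraction(a12 * a13 - a11 * a23, delta))


def line_through(p, q):
    """Coefficients (A, B, C) of the straight line through the points p and q."""
    A = q[1] - p[1]
    B = p[0] - q[0]
    return (A, B, -(A * p[0] + B * p[1]))


def diameter_through_point(coeffs, k):
    """Method 2: choose the conjugate vector so that the diameter passes through k."""
    a11, a12, a22, a13, a23, a33 = coeffs
    f1 = a11 * k[0] + a12 * k[1] + a13
    f2 = a12 * k[0] + a22 * k[1] + a23
    if f1 == 0 and f2 == 0:
        raise ValueError("k is a center: every diameter passes through it")
    alpha, beta = f2, -f1  # guarantees alpha*f1 + beta*f2 = 0
    if a11 * alpha**2 + 2 * a12 * alpha * beta + a22 * beta**2 == 0:
        raise ValueError("the required vector has an asymptotic direction")
    return (a11 * alpha + a12 * beta, a12 * alpha + a22 * beta, a13 * alpha + a23 * beta)


def same_line(l1, l2):
    """True if the two coefficient triples are proportional."""
    return (l1[0] * l2[1] == l2[0] * l1[1]
            and l1[0] * l2[2] == l2[0] * l1[2]
            and l1[1] * l2[2] == l2[1] * l1[2])


conic = (1, -1, 3, -2, 3, -1)      # hypothetical line, not from the source
K = (1, -2)
method1 = line_through(K, center(conic))
method2 = diameter_through_point(conic, K)
print(center(conic), method1, method2, same_line(method1, method2))
```

For this line both methods give the same diameter, 3x - y - 5 = 0, up to a constant factor.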

3. . Write an equation for the diameter parallel to the x-axis.

4. Find the midpoint of the chord cut off by the line

on the straight line x + 3y – 12 =0.

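Directions to the solution: Of course, you can find the points of intersection of the given straight line and the given second-order line, and then the midpoint of the resulting segment. The desire to do this disappears if we take, for example, a straight line with the equation x + 3y – 2009 = 0. Instead, note that the midpoint of the chord lies on the diameter conjugate to the direction vector of the straight line, so it is the intersection point of the straight line and this diameter.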
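A sketch of this shortcut: the midpoint is found as the intersection of the straight line with the diameter conjugate to its direction vector, and then compared with the brute-force answer. The second-order line of the problem is not preserved in the text, so a hypothetical ellipse x^2 + 2xy + 3y^2 - 60 = 0 is used with the straight line x + 3y - 12 = 0 from the problem; all function names are illustrative.

```python
import math
from fractions import Fraction


def intersect(l1, l2):
    """Intersection point of A1 x + B1 y + C1 = 0 and A2 x + B2 y + C2 = 0."""
    (A1, B1, C1), (A2, B2, C2) = l1, l2
    det = A1 * B2 - A2 * B1
    if det == 0:
        raise ValueError("the straight lines are parallel")
    return (Fraction(B1 * C2 - B2 * C1, det), Fraction(C1 * A2 - C2 * A1, det))


def midpoint_via_diameter(coeffs, line):
    """Intersection of `line` with the diameter conjugate to its direction vector."""
    a11, a12, a22, a13, a23, a33 = coeffs
    A, B, C = line
    alpha, beta = -B, A  # direction vector of A x + B y + C = 0
    if a11 * alpha**2 + 2 * a12 * alpha * beta + a22 * beta**2 == 0:
        raise ValueError("the straight line has an asymptotic direction")
    d = (a11 * alpha + a12 * beta, a12 * alpha + a22 * beta, a13 * alpha + a23 * beta)
    return intersect(line, d)


def midpoint_brute_force(coeffs, line):
    """Find both intersection points numerically and average them."""
    a11, a12, a22, a13, a23, a33 = coeffs
    A, B, C = line
    x0, y0 = (0.0, -C / B) if B != 0 else (-C / A, 0.0)  # a point on the line
    alpha, beta = -B, A
    qa = a11 * alpha**2 + 2 * a12 * alpha * beta + a22 * beta**2
    qb = 2 * ((a11 * x0 + a12 * y0 + a13) * alpha + (a12 * x0 + a22 * y0 + a23) * beta)
    qc = a11 * x0**2 + 2 * a12 * x0 * y0 + a22 * y0**2 + 2 * a13 * x0 + 2 * a23 * y0 + a33
    disc = qb**2 - 4 * qa * qc
    if disc <= 0:
        raise ValueError("the straight line does not cut off a chord")
    t1 = (-qb + math.sqrt(disc)) / (2 * qa)
    t2 = (-qb - math.sqrt(disc)) / (2 * qa)
    p1 = (x0 + t1 * alpha, y0 + t1 * beta)
    p2 = (x0 + t2 * alpha, y0 + t2 * beta)
    return ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)


conic = (1, 1, 3, 0, 0, -60)   # hypothetical ellipse, not from the source
line = (1, 3, -12)             # x + 3y - 12 = 0
M = midpoint_via_diameter(conic, line)
M_check = midpoint_brute_force(conic, line)
print(M, M_check)
```

The shortcut only needs one linear system, whereas the brute-force route has to solve a quadratic with irrational roots. Note that the shortcut returns a point even when the straight line misses the second-order line, so the existence of the chord has to be checked separately, as the brute-force function does.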
