Statistical tests for text homogeneity: using forward and backward processes of numbers of different words
Informational report
Journal: | Glottometrics, ISSN: 1617-8351, E-ISSN: 2625-8226 |
Output data: | Year: 2022, Volume: 53, Pages: 42-58, Page count: 17, DOI: 10.53482/2022_53_401 |
Keywords: | Zipf’s law, weak convergence, Gaussian process, statistical test, text homogeneity, urn model. |
Abstract:
This article studies how the number of different words in a text grows when the text is read in the forward and in the backward direction. We construct a statistical test based on a statistic derived from the difference between these two processes and use it to check text homogeneity. Under the elementary probabilistic model, the words of a text are drawn from some dictionary independently of each other according to the Zipf–Mandelbrot law. The p-values of the test are calculated under this model using the asymptotic normality of the corresponding statistics. Finally, the test is applied to analyze the homogeneity of sequences of sonnets.
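The two counting processes named in the abstract can be illustrated with a minimal Python sketch. It only computes the number of different words seen after each position, reading forward and backward, and their pointwise difference. The function names are hypothetical, and the normalization and p-value computation used in the article are not given in this record, so they are deliberately left out.

```python
from typing import List


def distinct_word_counts(words: List[str]) -> List[int]:
    """Return the number of different words among the first k words, k = 1..n."""
    seen = set()
    counts = []
    for word in words:
        seen.add(word)
        counts.append(len(seen))
    return counts


def forward_backward_difference(words: List[str]) -> List[int]:
    """Pointwise difference between the forward and backward counting processes.

    Illustration only: the article's test statistic is built from this kind of
    difference, but its exact form is an assumption not stated in the abstract.
    """
    forward = distinct_word_counts(words)
    backward = distinct_word_counts(words[::-1])  # same text read from the end
    return [f - b for f, b in zip(forward, backward)]


text = "a b a c c d".split()
print(distinct_word_counts(text))         # [1, 2, 2, 3, 3, 4]
print(distinct_word_counts(text[::-1]))   # [1, 2, 2, 3, 4, 4]
print(forward_backward_difference(text))  # [0, 0, 0, 0, -1, 0]
```

The last value of the difference is always zero, because both directions end with the total number of different words in the text. Intuitively, a text whose vocabulary is introduced at a similar rate from either end keeps the difference close to zero throughout.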
Bibliographic reference:
Abebe B., Chebunin M., Kovalevskii A., Zakrevskaya N.
Statistical tests for text homogeneity: using forward and backward processes of numbers of different words
Glottometrics. 2022. V.53. P.42-58. DOI: 10.53482/2022_53_401 WOS Scopus RSCI (РИНЦ) OpenAlex
Dates:
Published in print: | Jan 1, 2024 |
Database identifiers:
Web of science: | WOS:000975069100003 |
Scopus: | 2-s2.0-85146477404 |
RSCI (РИНЦ): | 59204255 |
OpenAlex: | W4316923420 |