{"id":3213,"date":"2025-11-17T13:29:47","date_gmt":"2025-11-17T11:29:47","guid":{"rendered":"http:\/\/158.129.51.247:8888\/?p=3213"},"modified":"2025-11-17T13:38:59","modified_gmt":"2025-11-17T11:38:59","slug":"publikacijos-pasirode-clarin-lt-mokslininkiu-straipsnis","status":"publish","type":"post","link":"https:\/\/clarin-lt.lt\/?p=3213","title":{"rendered":"Publications. An article by CLARIN-LT researchers has been published"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"19\" src=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-1024x19.png\" alt=\"\" class=\"wp-image-3215\" srcset=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-1024x19.png 1024w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-300x5.png 300w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-768x14.png 768w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-1536x28.png 1536w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-100x2.png 100w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-150x3.png 150w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-200x4.png 200w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-450x8.png 450w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-600x11.png 600w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12-900x16.png 900w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-12.png 1650w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>On 23 July 2025, the scholarly journal \u201c<a href=\"https:\/\/kalbos.ktu.lt\/index.php\/KStud\/index\">Kalb\u0173 studijos<\/a>\u201d (<em>Studies about Languages<\/em>) published an article by CLARIN-LT researchers <a href=\"https:\/\/www.vdu.lt\/cris\/entities\/person\/jolanta-kovalevskaite\"><em>Jolanta Kovalevskait\u0117<\/em><\/a>, <a href=\"https:\/\/www.vdu.lt\/cris\/entities\/person\/erika-rimkute\"><em>Erika Rimkut\u0117<\/em><\/a>, and <a href=\"https:\/\/www.vdu.lt\/cris\/entities\/person\/jurgita-vaicenoniene\"><em>Jurgita Vai\u010denonien\u0117<\/em><\/a>: \u201c<a href=\"https:\/\/kalbos.ktu.lt\/index.php\/KStud\/article\/view\/40544\">Nauj\u0173 lietuvi\u0173 kalbos anotuot\u0173 tekstyn\u0173 rengimas: sandaros aspektai<\/a>\u201d (roughly, \u201cCompiling new annotated corpora of Lithuanian: aspects of composition\u201d). 
The article presents the origins of corpus linguistics in Lithuania, reviews the grammatically annotated corpora of Lithuanian, analyses the state of annotated corpora for other languages, examines the composition of the new grammatically annotated corpora, and offers a detailed analysis of the concept of a corpus unit.<\/p>\n\n\n\n<p>The first corpus linguistics research in Lithuania began at the <em>Centre of Computational Linguistics<\/em> of <a href=\"https:\/\/www.vdu.lt\/lt\/\">VDU<\/a> (Vytautas Magnus University). 
Today, a wide range of publicly available corpora and other language resources (databases, dictionaries, language analysis tools, etc.) can be found on the website of the <a href=\"https:\/\/sitti.vdu.lt\/\">Institute of Digital Resources and Interdisciplinary Research<\/a> and in the <a href=\"https:\/\/clarin.vdu.lt\/xmlui\/?locale-attribute=lt\">CLARIN-LT repository<\/a>.<\/p>\n\n\n\n<p>In the article, the researchers explain why annotated corpora matter and present the morphologically annotated corpus \u201c<a href=\"https:\/\/sitti.vdu.lt\/matas-morfologiskai-anotuotas-tekstynas\/\">Matas<\/a>\u201d, the automatically morphologically annotated \u201c<a href=\"http:\/\/tekstynas.vdu.lt\/tekstynas\/\">Dabartin\u0117s lietuvi\u0173 kalbos tekstynas<\/a>\u201d (Corpus of the Contemporary Lithuanian Language), the syntactically annotated corpus of Lithuanian \u201c<a href=\"https:\/\/sitti.vdu.lt\/alksnis-sintaksiskai-anotuotas-tekstynas\/\">Alksnis<\/a>\u201d, and \u201c<a href=\"https:\/\/sitti.vdu.lt\/morfuoklis\/lt\">Morfuoklis<\/a>\u201d, a tool for the morphological analysis and synthesis of Lithuanian (for more on its analysis and synthesis functions, see the interview with <a href=\"https:\/\/www.vdu.lt\/cris\/entities\/person\/erika-rimkute\">Erika Rimkut\u0117<\/a> and <a href=\"https:\/\/www.vdu.lt\/cris\/entities\/person\/virginijus-dadurkevicius\/datasets\">Virginijus Dadurkevi\u010dius<\/a> in <a href=\"https:\/\/www.clarin.eu\/blog\/tour-de-clarin-interview-erika-rimkute-and-virginijus-dadurkevicius\">Tour de CLARIN<\/a>). 
They also describe the European Union <a href=\"https:\/\/next-generation-eu.europa.eu\/index_lt\">NextGenerationEU<\/a> project \u201c<a href=\"https:\/\/sitti.vdu.lt\/morfologiskai-ir-sintaksiskai-anotuotu-tekstynu-modeliai-dirbtinio-intelekto-apmokymui\/\">Morfologi\u0161kai ir sintaksi\u0161kai anotuot\u0173 tekstyn\u0173 modeliai apmokymui (auksiniai standartai)<\/a>\u201d (roughly, \u201cModels of morphologically and syntactically annotated corpora for training (gold standards)\u201d), which runs in 2024\u20132026.<\/p>\n\n\n\n<p>The article traces in detail the development of grammatically annotated corpora in Lithuania and describes how the morphologically annotated Lithuanian corpus \u201c<a href=\"https:\/\/sitti.vdu.lt\/matas-morfologiskai-anotuotas-tekstynas\/\">Matas<\/a>\u201d and the syntactically annotated Lithuanian corpus \u201c<a href=\"https:\/\/sitti.vdu.lt\/alksnis-sintaksiskai-anotuotas-tekstynas\/\">Alksnis<\/a>\u201d were compiled, along with their key features. It mentions international standards (CoNLL-U, MULTEXT-East, PDT (Prague Dependency Treebank), and UD (<a href=\"https:\/\/universaldependencies.org\/\">Universal Dependencies<\/a>)) and the Lithuanian standard \u201c<a href=\"https:\/\/sitti.vdu.lt\/jablonskis-lt\/\">Jablonskis<\/a>\u201d. It also presents annotated corpora of other languages, describes their sizes and composition, and compares them. The authors highlight the factors that make corpora hard to compare and propose a way to avoid them. They also list the countries with the most annotated corpora, introduce the largest annotated corpora, and compare them with English corpora by size and composition.<\/p>\n\n\n\n<p>Readers are introduced to the composition and proportions of the new grammatically annotated corpora and to the characteristics of their text types, styles, and genres. 
Administrative, academic, and fiction texts are discussed, and restrictions on text use are noted. The article explains the advantages and drawbacks of two compilation strategies: building a corpus from full texts and building it from fragments. It also covers the complexities of defining a corpus unit. The following terms are explained: <em>tokenization<\/em> (splitting a corpus into corpus units), <em>token<\/em> (corpus unit), <em>word<\/em>, and <em>non-word<\/em>. The article discusses cases where one meaningful unit consists of several words and, conversely, where a single word contains two meaningful units. It reviews challenging text elements such as symbols, digits, abbreviations, and punctuation marks, e.g. <em>3M, i600, FB, 25-hour, !mportant<\/em>. A detailed list of problem cases relevant to Lithuanian is provided, with explanations. 
The authors also stress that splitting text into corpus units is further complicated by the choice of software (<em>AntConc<\/em>, <em>LancsBox<\/em>, <em>SketchEngine<\/em>): the programs treat corpus units differently, so their results differ.<\/p>\n\n\n\n<p>Follow <a href=\"https:\/\/clarin-lt.lt\/\">CLARIN-LT<\/a> news on our <a href=\"https:\/\/www.facebook.com\/profile.php?id=100087289837974\">Facebook page<\/a> and <a href=\"https:\/\/clarin-lt.lt\/?page_id=104\">website<\/a>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13.png\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"18\" src=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13.png\" alt=\"\" class=\"wp-image-3216\" srcset=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13.png 975w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13-300x6.png 300w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13-768x14.png 768w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13-100x2.png 100w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13-150x3.png 150w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13-200x4.png 200w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13-450x8.png 450w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13-600x11.png 600w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2025\/11\/image-13-900x17.png 900w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/a><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>On 23 July 2025, the scholarly journal \u201cKalb\u0173 studijos\u201d (Studies about Languages) published an article by CLARIN-LT researchers Jolanta Kovalevskait\u0117, Erika Rimkut\u0117, and Jurgita Vai\u010denonien\u0117: \u201cNauj\u0173 lietuvi\u0173 kalbos anotuot\u0173 tekstyn\u0173 rengimas: sandaros aspektai\u201d. 
It presents the origins of corpus linguistics in Lithuania and reviews grammatically<span class=\"ellipsis\">&hellip;<\/span><\/p>\n<div class=\"read-more\"><a href=\"https:\/\/clarin-lt.lt\/?p=3213\">Read more &#8250;<\/a><\/div>\n<p><!-- end of .read-more --><\/p>\n","protected":false},"author":7,"featured_media":3214,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-3213","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/3213","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3213"}],"version-history":[{"count":8,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/3213\/revisions"}],"predecessor-version":[{"id":3224,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/3213\/revisions\/3224"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/media\/3214"}],"wp:attachment":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3213"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv
2%2Fcategories&post=3213"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3213"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}