{"id":3986,"date":"2026-06-15T07:51:07","date_gmt":"2026-06-15T05:51:07","guid":{"rendered":"https:\/\/clarin-lt.lt\/?p=3986"},"modified":"2026-06-15T10:12:58","modified_gmt":"2026-06-15T08:12:58","slug":"naujas-clarin-lt-isteklius-tekstynas-simas","status":"publish","type":"post","link":"https:\/\/clarin-lt.lt\/?p=3986","title":{"rendered":"New CLARIN-LT resource: the SIMAS corpus"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7.png\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"18\" src=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7.png\" alt=\"\" class=\"wp-image-3988\" srcset=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7.png 975w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7-300x6.png 300w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7-768x14.png 768w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7-100x2.png 100w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7-150x3.png 150w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7-200x4.png 200w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7-450x8.png 450w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7-600x11.png 600w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-7-900x17.png 900w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/a><\/figure>\n\n\n\n<p>In May 2025, 
the <a href=\"https:\/\/hdl.handle.net\/20.500.11821\/105\">morphologically and syntactically annotated corpus of Lithuanian SIMAS<\/a> was deposited in the <a href=\"https:\/\/clarin-repo.lt\/home\">CLARIN-LT repository<\/a> and made publicly available. It is the result of the <a href=\"https:\/\/commission.europa.eu\/strategy-and-policy\/recovery-plan-europe_lt\">European Union NextGenerationEU<\/a> project \u201cModels of Morphologically and Syntactically Annotated Corpora for Training (Gold Standards)\u201d (No. 02-098-K-0001), carried out by the <a href=\"https:\/\/sitti.vdu.lt\/\">Institute of Digital Resources and Interdisciplinary Research (SITTI) at Vytautas Magnus University (VDU)<\/a>. <strong>The project is led by Assoc. Prof. Dr. <a href=\"https:\/\/hdl.handle.net\/20.500.12259\/154977\">Erika Rimkut\u0117<\/a><\/strong>. The project promotes the development of technological innovation for the Lithuanian language and directly benefits society, state institutions, and the business sector (more information is available <a href=\"https:\/\/sitti.vdu.lt\/morfologiskai-ir-sintaksiskai-anotuotu-tekstynu-modeliai-dirbtinio-intelekto-apmokymui\/\">here<\/a>).<\/p>\n\n\n\n<p>The SIMAS corpus consists of original fiction, academic texts, administrative texts, and periodicals written by Lithuanian authors in 2005\u20132025. It is compiled from full texts rather than text fragments. The corpus was annotated morphologically and syntactically by automatic tools, and linguists then checked the annotation manually. 
Automatic morphological annotation was performed with the tool <a href=\"https:\/\/sitti.vdu.lt\/morfuoklis\/lt\">Morfuoklis<\/a>, and automatic syntactic analysis with the international tool <a href=\"https:\/\/lindat.mff.cuni.cz\/services\/udpipe\/\">UDPipe<\/a>, adapted for Lithuanian. The corpus is annotated according to the international <a href=\"https:\/\/universaldependencies.org\/lt\/\">Universal Dependencies standard<\/a>, the morphological annotation standard <a href=\"https:\/\/sitti.vdu.lt\/jablonskis-lt\/\">Jablonskis<\/a>, and the syntactic analysis guidelines <a href=\"https:\/\/sitti.vdu.lt\/wp-content\/uploads\/2026\/05\/UD_standarto_gaires.pdf\">Universal Dependencies Standard: Guidelines for the Syntactic Annotation of Lithuanian<\/a>. The corpus contains more than 10 million words (more precisely, 10,010,420 words, or 12,221,575 tokens).<\/p>\n\n\n\n<p>Learn more about the <a href=\"https:\/\/sitti.vdu.lt\/morfologiskai-ir-sintaksiskai-anotuotu-tekstynu-modeliai-dirbtinio-intelekto-apmokymui\/\">models of morphologically and syntactically annotated corpora for training<\/a>.<\/p>\n\n\n\n<p>See also the article \u201c<a href=\"https:\/\/hmf.vdu.lt\/morfologiskai-ir-sintaksiskai-anotuotas-lietuviu-kalbos-tekstynas-simas\/\">The Morphologically and Syntactically Annotated Lithuanian Corpus SIMAS<\/a>\u201d by <a href=\"https:\/\/hdl.handle.net\/20.500.12259\/283643\">Migl\u0117 \u017demriet\u0117<\/a>, published on the website of the <a href=\"https:\/\/hmf.vdu.lt\/\">VDU Faculty of Humanities<\/a>. It covers the state of Lithuanian language technology resources, the need for technology projects, and their importance for Lithuania. 
The article also introduces the team of researchers and students who worked on the SIMAS corpus and describes the behind-the-scenes work, their impressions, and the challenges they faced.<\/p>\n\n\n\n<p>Follow CLARIN-LT news on our <a href=\"https:\/\/www.facebook.com\/profile.php?id=100087289837974\">Facebook page<\/a> and <a href=\"https:\/\/clarin-lt.lt\/?page_id=104\">website<\/a>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8.png\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"18\" src=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8.png\" alt=\"\" class=\"wp-image-3989\" srcset=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8.png 975w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8-300x6.png 300w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8-768x14.png 768w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8-100x2.png 100w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8-150x3.png 150w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8-200x4.png 200w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8-450x8.png 450w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8-600x11.png 600w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2026\/06\/image-8-900x17.png 900w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/a><\/figure>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In May 2025, 
the morphologically and syntactically annotated corpus of Lithuanian SIMAS was deposited in the CLARIN-LT repository and made publicly available. It is the result of a European Union NextGenerationEU project carried out by the VDU Institute of Digital Resources and Interdisciplinary Research (SITTI): \u201cModels of Morphologically and Syntactically Annotated Corpora for Training<span class=\"ellipsis\">&hellip;<\/span><\/p>\n<div class=\"read-more\"><a href=\"https:\/\/clarin-lt.lt\/?p=3986\">Read more &#8250;<\/a><\/div>\n<p><!-- end of .read-more --><\/p>\n","protected":false},"author":7,"featured_media":3987,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-3986","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/3986","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3986"}],"version-history":[{"count":5,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/3986\/revisions"}],"predecessor-version":[{"id":4000,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/3986\/revisions\/4000"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/media\/3987"}],"wp:attachment":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3986"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c
larin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3986"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3986"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}