{"id":205,"date":"2016-01-18T09:47:58","date_gmt":"2016-01-18T07:47:58","guid":{"rendered":"http:\/\/158.129.51.247:8888\/?p=205"},"modified":"2025-02-27T22:34:57","modified_gmt":"2025-02-27T20:34:57","slug":"rengiamas-lietuviu-kalbos-sintaksiskai-anotuotas-tekstynas-alksnis","status":"publish","type":"post","link":"https:\/\/clarin-lt.lt\/?p=205","title":{"rendered":"ALKSNIS, a syntactically annotated corpus of Lithuanian, is under development"},"content":{"rendered":"<p><a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/MEDIS1.jpg\" rel=\"attachment wp-att-215\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-215 alignleft\" src=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/MEDIS1.jpg\" alt=\"MEDIS1\" width=\"262\" height=\"660\" srcset=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/MEDIS1.jpg 262w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/MEDIS1-119x300.jpg 119w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/MEDIS1-100x252.jpg 100w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/MEDIS1-150x378.jpg 150w, https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/MEDIS1-200x504.jpg 200w\" sizes=\"auto, (max-width: 262px) 100vw, 262px\" \/><\/a>In 2015, researchers at the Centre of Computational Linguistics of Vytautas Magnus University began building a syntactically annotated corpus of Lithuanian (a <em>treebank<\/em>), referred to below by the acronym ALKSNIS (from Lithuanian <em>anotuotas lietuvi\u0173 kalbos sintaksinis tekstynas<\/em>, i.e. an annotated syntactic corpus of Lithuanian). 
This is one of the activities of the project <em>Lithuania's Membership in the International Research Infrastructure: Common Language Resources and Technology Infrastructure European Research Infrastructure Consortium<\/em>.<\/p>\n<p>ALKSNIS will comprise about 2,300 syntactically annotated sentences (drawn from general and specialised periodicals, fiction, and administrative texts). Work on the corpus will be completed at the end of 2016. ALKSNIS is built on syntactic dependency trees generated by a syntactic parser for Lithuanian and stored in PML (<em>Prague Markup Language<\/em>) format. This format allows the syntactic trees to be visualised and edited in the TrED<a href=\"#_ftn1\" name=\"_ftnref1\">[1]<\/a> editor.<\/p>\n<p>Each tree node corresponds to a word, a punctuation mark, or another unit of the sentence (a symbol, a digit, etc.). For every word, the following are given in this order: 1) the specific word form used in the sentence, 2) the headword (dictionary) form, also called the lemma, 3) the morphological tags (part of speech and grammatical features), and 4) <a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/sintaksines_pazymos.pdf\">the syntactic function (subject, object, etc.)<\/a>. Dependency relations between words are shown as edges.<\/p>\n<p>The morphological tags used in ALKSNIS are modelled on the MULTEXT-East format<a href=\"#_ftn2\" name=\"_ftnref2\">[2]<\/a>. The syntactically annotated sentences are processed according to guidelines being prepared by the VMU Centre of Computational Linguistics (VDU KLC), which are based on the annotation rules of the 
<em>Prague Dependency Treebank<\/em>. All automatically annotated sentences are checked and manually corrected by a group of linguists.<\/p>\n<p>We provide some of the sentences annotated so far (the sentences are continually being revised and their tags refined, so updated data will be published in the future). To open the <a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/pml_failai.7z\">files with the .pml extension<\/a>, install the TrED editor and place the editor's style file \u201c<a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/antisDplus_schema.7z\">antisDplus_schema<\/a>\u201d alongside the annotated files. After installing TrED and opening it for the first time, you need to specify which information to display at each node of the syntactic tree. Click the magic-wand icon (in the top right corner, next to \u201cStyle:\u201d) and enter the following code:<\/p>\n<pre style=\"padding-left: 180px;\">context: .*\nhint:\nnode:${lemma}\nnode:${form}\nnode:${ana}\nnode:${syfun}\ntext:${form}<\/pre>\n<p>Save this information so that you do not have to retype the code each time.<\/p>\n<p>If you do not have the editor, we recommend viewing the <a href=\"https:\/\/clarin-lt.lt\/wp-content\/uploads\/2016\/01\/pdf_failai.7z\">PDF files<\/a>.<\/p>\n<p><a href=\"#_ftnref1\" name=\"_ftn1\">[1]<\/a> See <a href=\"https:\/\/ufal.mff.cuni.cz\/tred\/\">https:\/\/ufal.mff.cuni.cz\/tred\/<\/a> (we recommend downloading and installing the version bundled with Strawberry Perl).<\/p>\n<p><a href=\"#_ftnref2\" name=\"_ftn2\">[2]<\/a> See 
<a href=\"http:\/\/nl.ijs.si\/ME\/V4\/msd\/html\/index.html\">http:\/\/nl.ijs.si\/ME\/V4\/msd\/html\/index.html<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In 2015, researchers at the Centre of Computational Linguistics of Vytautas Magnus University began building a syntactically annotated corpus of Lithuanian (a treebank), referred to below by the acronym ALKSNIS (from Lithuanian anotuotas lietuvi\u0173 kalbos sintaksinis tekstynas, i.e. an annotated syntactic corpus of Lithuanian). This is one of the activities of the project Lithuania's Membership in the International Research Infrastructure:<span class=\"ellipsis\">&hellip;<\/span><\/p>\n<div class=\"read-more\"><a href=\"https:\/\/clarin-lt.lt\/?p=205\">Read more &#8250;<\/a><\/div>\n<p><!-- end of .read-more --><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-205","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/205","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=205"}],"version-history":[{"count":14,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/205\/revisions"}],"predecessor-version":[{"id":1674,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=\/wp\/v2\/posts\/205\/revisions\/1674"}],"wp:attachment":[{"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=205"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=205"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/clarin-lt.lt\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=205"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}