LEADER |
02475nmi a2200517 4500 |
001 |
0-1748601830 |
003 |
DE-627 |
005 |
20210218092806.0 |
006 |
su| d|o |0 |0 |
007 |
cr uuu---uuuuu |
008 |
210218s2021 xx |o | eng c |
024 |
7 |
|
|a 10.11588/data/HVXXIJ
|2 doi
|
035 |
|
|
|a (DE-627)1748601830
|
035 |
|
|
|a (DE-599)KXP1748601830
|
040 |
|
|
|a DE-627
|b ger
|c DE-627
|e rda
|
041 |
|
|
|a eng
|
100 |
1 |
|
|a Daza, Angel
|d 1989-
|e VerfasserIn
|0 (DE-588)1203323360
|0 (DE-627)1688152938
|4 aut
|
245 |
1 |
0 |
|a X-SRL dataset and mBERT word aligner
|c Angel Daza
|
264 |
|
1 |
|a Heidelberg
|b Universität
|c 2021-02-17
|
300 |
|
|
|a 1 Online-Ressource (2 Files)
|
336 |
|
|
|a Text
|b txt
|2 rdacontent
|
336 |
|
|
|a Computerdaten
|b cod
|2 rdacontent
|
337 |
|
|
|a Computermedien
|b c
|2 rdamedia
|
338 |
|
|
|a Online-Ressource
|b cr
|2 rdacarrier
|
500 |
|
|
|a Kind of data: Program source code
|
500 |
|
|
|a Gesehen am 18.02.2021
|
520 |
|
|
|a This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of the sentence) by transferring the label into the best-aligned target word. This newly labeled data can be used to train different multilingual SOTA models to improve performance, especially for the lower-resource languages.
|
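The 520 abstract above describes annotation projection: each labeled source token is aligned to its most similar target token using multilingual BERT embeddings, and the label is copied over. A minimal sketch of that alignment step, using toy per-token vectors in place of real mBERT embeddings (the function name `project_labels` and the "O" default label are illustrative assumptions, not part of the released code):

```python
# Sketch of label projection via best-alignment, assuming per-token
# embeddings are already available (in practice from multilingual BERT;
# here plain Python lists stand in for them).
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def project_labels(src_vecs, src_labels, tgt_vecs):
    # For each labeled source token, find the most similar target token
    # and transfer the label onto it; unaligned targets stay "O".
    tgt_labels = ["O"] * len(tgt_vecs)
    for vec, label in zip(src_vecs, src_labels):
        best = max(range(len(tgt_vecs)), key=lambda j: cosine(vec, tgt_vecs[j]))
        tgt_labels[best] = label
    return tgt_labels
```

With real mBERT embeddings, source and target vectors live in a shared multilingual space, which is what makes this cross-lingual argmax alignment plausible; a greedy one-to-one matching or similarity threshold would be a natural refinement.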
655 |
|
7 |
|a Forschungsdaten
|0 (DE-588)1098579690
|0 (DE-627)857755366
|0 (DE-576)469182156
|2 gnd-content
|
655 |
|
7 |
|a Datenbank
|0 (DE-588)4011119-2
|0 (DE-627)106354256
|0 (DE-576)208891943
|2 gnd-content
|
787 |
0 |
8 |
|i Forschungsdaten zu
|a Daza, Angel, 1989 -
|t X-SRL
|d 2020
|w (DE-627)1748602551
|
856 |
4 |
0 |
|u https://doi.org/10.11588/data/HVXXIJ
|x Verlag
|x Resolving-System
|z lizenzpflichtig
|3 Volltext
|
856 |
4 |
0 |
|u https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ
|x Verlag
|z kostenfrei
|3 Volltext
|
951 |
|
|
|a BO
|
856 |
4 |
0 |
|u https://doi.org/10.11588/data/HVXXIJ
|9 LFER
|
856 |
4 |
0 |
|u https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ
|9 LFER
|
852 |
|
|
|a LFER
|z 2021-03-09T06:32:08Z
|
970 |
|
|
|c OD
|
971 |
|
|
|c EBOOK
|
972 |
|
|
|c EBOOK
|
973 |
|
|
|c EB
|
935 |
|
|
|a lfer
|
900 |
|
|
|a Daza Arévalo, José Angel
|
900 |
|
|
|a Arévalo, José Angel Daza
|
900 |
|
|
|a Daza, José Angel
|
951 |
|
|
|b XA-DE
|
980 |
|
|
|a 1748601830
|b 0
|k 1748601830
|c lfer
|
SOLR
_version_ |
1750407520198852608 |
access_facet |
Electronic Resources |
author |
Daza, Angel |
author_facet |
Daza, Angel |
author_role |
aut |
author_sort |
Daza, Angel 1989- |
author_variant |
a d ad |
callnumber-sort |
|
collection |
lfer |
contents |
This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of the sentence) by transferring the label into the best-aligned target word. This newly labeled data can be used to train different multilingual SOTA models to improve performance, especially for the lower-resource languages. |
ctrlnum |
(DE-627)1748601830, (DE-599)KXP1748601830 |
doi_str_mv |
10.11588/data/HVXXIJ |
facet_avail |
Online, Free |
finc_class_facet |
not assigned |
footnote |
Kind of data: Program source code, Gesehen am 18.02.2021 |
format |
OnlineResource, ComputerDataset, Database |
format_de105 |
Ebook |
format_de14 |
Website |
format_de15 |
Website |
format_del152 |
Buch |
format_detail_txtF_mv |
unspecified-online-integrating-independent |
format_finc |
Book, E-Book, Software, Database |
format_legacy |
ElectronicIntegratingResource |
format_legacy_nrw |
Website |
format_nrw |
Website |
genre |
Forschungsdaten (DE-588)1098579690 (DE-627)857755366 (DE-576)469182156 gnd-content, Datenbank (DE-588)4011119-2 (DE-627)106354256 (DE-576)208891943 gnd-content |
genre_facet |
Forschungsdaten, Datenbank |
geogr_code |
not assigned |
geogr_code_person |
Germany |
id |
0-1748601830 |
illustrated |
Not Illustrated |
imprint |
Heidelberg, Universität, 2021-02-17 |
imprint_str_mv |
Heidelberg: Universität, 2021-02-17 |
institution |
DE-D117, DE-105, LFER, DE-Ch1, DE-15, DE-14, DE-Zwi2 |
is_hierarchy_id |
|
is_hierarchy_title |
|
isil_str_mv |
LFER |
kxp_id_str |
1748601830 |
language |
English |
last_indexed |
2022-11-24T19:38:14.667Z |
marc024a_ct_mv |
10.11588/data/HVXXIJ |
match_str |
daza2021xsrldatasetandmbertwordaligner |
mega_collection |
Verbunddaten SWB, Lizenzfreie Online-Ressourcen |
misc_de105 |
EBOOK |
names_id_str_mv |
(DE-588)1203323360, (DE-627)1688152938 |
physical |
1 Online-Ressource (2 Files) |
publishDate |
2021-02-17 |
publishDateSort |
2021 |
publishPlace |
Heidelberg |
publisher |
Universität |
record_format |
marcfinc |
record_id |
1748601830 |
recordtype |
marcfinc |
rvk_facet |
No subject assigned |
source_id |
0 |
spelling |
Daza, Angel 1989- VerfasserIn (DE-588)1203323360 (DE-627)1688152938 aut, X-SRL dataset and mBERT word aligner Angel Daza, Heidelberg Universität 2021-02-17, 1 Online-Ressource (2 Files), Text txt rdacontent, Computerdaten cod rdacontent, Computermedien c rdamedia, Online-Ressource cr rdacarrier, Kind of data: Program source code, Gesehen am 18.02.2021, This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of the sentence) by transferring the label into the best-aligned target word. This newly labeled data can be used to train different multilingual SOTA models to improve performance, especially for the lower-resource languages., Forschungsdaten (DE-588)1098579690 (DE-627)857755366 (DE-576)469182156 gnd-content, Datenbank (DE-588)4011119-2 (DE-627)106354256 (DE-576)208891943 gnd-content, Forschungsdaten zu Daza, Angel, 1989 - X-SRL 2020 (DE-627)1748602551, https://doi.org/10.11588/data/HVXXIJ Verlag Resolving-System lizenzpflichtig Volltext, https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ Verlag kostenfrei Volltext, https://doi.org/10.11588/data/HVXXIJ LFER, https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ LFER, LFER 2021-03-09T06:32:08Z |
spellingShingle |
Daza, Angel, X-SRL dataset and mBERT word aligner, This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of the sentence) by transferring the label into the best-aligned target word. This newly labeled data can be used to train different multilingual SOTA models to improve performance, especially for the lower-resource languages., Forschungsdaten, Datenbank |
title |
X-SRL dataset and mBERT word aligner |
title_auth |
X-SRL dataset and mBERT word aligner |
title_full |
X-SRL dataset and mBERT word aligner Angel Daza |
title_fullStr |
X-SRL dataset and mBERT word aligner Angel Daza |
title_full_unstemmed |
X-SRL dataset and mBERT word aligner Angel Daza |
title_short |
X-SRL dataset and mBERT word aligner |
title_sort |
x srl dataset and mbert word aligner |
topic |
Forschungsdaten, Datenbank |
topic_facet |
Forschungsdaten, Datenbank |
url |
https://doi.org/10.11588/data/HVXXIJ, https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ |