LEADER |
02475nmi a2200517 4500 |
001 |
0-1748601830 |
003 |
DE-627 |
005 |
20210218092806.0 |
006 |
su| d|o |0 |0 |
007 |
cr uuu---uuuuu |
008 |
210218s2021 xx |o | eng c |
024 |
7 |
|
|a 10.11588/data/HVXXIJ
|2 doi
|
035 |
|
|
|a (DE-627)1748601830
|
035 |
|
|
|a (DE-599)KXP1748601830
|
040 |
|
|
|a DE-627
|b ger
|c DE-627
|e rda
|
041 |
|
|
|a eng
|
100 |
1 |
|
|a Daza, Angel
|d 1989-
|e VerfasserIn
|0 (DE-588)1203323360
|0 (DE-627)1688152938
|4 aut
|
245 |
1 |
0 |
|a X-SRL dataset and mBERT word aligner
|c Angel Daza
|
264 |
|
1 |
|a Heidelberg
|b Universität
|c 2021-02-17
|
300 |
|
|
|a 1 Online-Ressource (2 Files)
|
336 |
|
|
|a Text
|b txt
|2 rdacontent
|
336 |
|
|
|a Computerdaten
|b cod
|2 rdacontent
|
337 |
|
|
|a Computermedien
|b c
|2 rdamedia
|
338 |
|
|
|a Online-Ressource
|b cr
|2 rdacarrier
|
500 |
|
|
|a Kind of data: Program source code
|
500 |
|
|
|a Gesehen am 18.02.2021
|
520 |
|
|
|a This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of the sentence) by transferring the label into the best-aligned target word. This newly labeled data can be used to train different multilingual SOTA models to improve performance, especially for the lower-resource languages.
|
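The 520 abstract above describes annotation projection: each labeled source token is aligned to its most similar target token using multilingual BERT embeddings, and the label is copied over. A minimal sketch of that alignment step, using toy per-token vectors in place of real mBERT embeddings (the function name `project_labels` and the "O" default label are illustrative assumptions, not part of the released code):

```python
# Sketch of label projection via best-alignment, assuming per-token
# embeddings are already available (in practice from multilingual BERT;
# here plain Python lists stand in for them).
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def project_labels(src_vecs, src_labels, tgt_vecs):
    # For each labeled source token, find the most similar target token
    # and transfer the label onto it; unaligned targets stay "O".
    tgt_labels = ["O"] * len(tgt_vecs)
    for vec, label in zip(src_vecs, src_labels):
        best = max(range(len(tgt_vecs)), key=lambda j: cosine(vec, tgt_vecs[j]))
        tgt_labels[best] = label
    return tgt_labels
```

With real mBERT embeddings, source and target vectors live in a shared multilingual space, which is what makes this cross-lingual argmax alignment plausible; a greedy one-to-one matching or similarity threshold would be a natural refinement.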
655 |
|
7 |
|a Forschungsdaten
|0 (DE-588)1098579690
|0 (DE-627)857755366
|0 (DE-576)469182156
|2 gnd-content
|
655 |
|
7 |
|a Datenbank
|0 (DE-588)4011119-2
|0 (DE-627)106354256
|0 (DE-576)208891943
|2 gnd-content
|
787 |
0 |
8 |
|i Forschungsdaten zu
|a Daza, Angel, 1989 -
|t X-SRL
|d 2020
|w (DE-627)1748602551
|
856 |
4 |
0 |
|u https://doi.org/10.11588/data/HVXXIJ
|x Verlag
|x Resolving-System
|z lizenzpflichtig
|3 Volltext
|
856 |
4 |
0 |
|u https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ
|x Verlag
|z kostenfrei
|3 Volltext
|
951 |
|
|
|a BO
|
856 |
4 |
0 |
|u https://doi.org/10.11588/data/HVXXIJ
|9 LFER
|
856 |
4 |
0 |
|u https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ
|9 LFER
|
852 |
|
|
|a LFER
|z 2021-03-09T06:32:08Z
|
970 |
|
|
|c OD
|
971 |
|
|
|c EBOOK
|
972 |
|
|
|c EBOOK
|
973 |
|
|
|c EB
|
935 |
|
|
|a lfer
|
900 |
|
|
|a Daza Arévalo, José Angel
|
900 |
|
|
|a Arévalo, José Angel Daza
|
900 |
|
|
|a Daza, José Angel
|
951 |
|
|
|b XA-DE
|
980 |
|
|
|a 1748601830
|b 0
|k 1748601830
|c lfer
|
SOLR
_version_ |
1750407520198852608 |
access_facet |
Electronic Resources |
author |
Daza, Angel |
author_facet |
Daza, Angel |
author_role |
aut |
author_sort |
Daza, Angel 1989- |
author_variant |
a d ad |
callnumber-sort |
|
collection |
lfer |
contents |
This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of the sentence) by transferring the label into the best-aligned target word. This newly labeled data can be used to train different multilingual SOTA models to improve performance, especially for the lower-resource languages. |
ctrlnum |
(DE-627)1748601830, (DE-599)KXP1748601830 |
doi_str_mv |
10.11588/data/HVXXIJ |
facet_avail |
Online, Free |
finc_class_facet |
not assigned |
footnote |
Kind of data: Program source code, Gesehen am 18.02.2021 |
format |
OnlineResource, ComputerDataset, Database |
format_de105 |
Ebook |
format_de14 |
Website |
format_de15 |
Website |
format_del152 |
Buch |
format_detail_txtF_mv |
unspecified-online-integrating-independent |
format_finc |
Book, E-Book, Software, Database |
format_legacy |
ElectronicIntegratingResource |
format_legacy_nrw |
Website |
format_nrw |
Website |
genre |
Forschungsdaten (DE-588)1098579690 (DE-627)857755366 (DE-576)469182156 gnd-content, Datenbank (DE-588)4011119-2 (DE-627)106354256 (DE-576)208891943 gnd-content |
genre_facet |
Forschungsdaten, Datenbank |
geogr_code |
not assigned |
geogr_code_person |
Germany |
id |
0-1748601830 |
illustrated |
Not Illustrated |
imprint |
Heidelberg, Universität, 2021-02-17 |
imprint_str_mv |
Heidelberg: Universität, 2021-02-17 |
institution |
DE-D117, DE-105, LFER, DE-Ch1, DE-15, DE-14, DE-Zwi2 |
is_hierarchy_id |
|
is_hierarchy_title |
|
isil_str_mv |
LFER |
kxp_id_str |
1748601830 |
language |
English |
last_indexed |
2022-11-24T19:38:14.667Z |
marc024a_ct_mv |
10.11588/data/HVXXIJ |
match_str |
daza2021xsrldatasetandmbertwordaligner |
mega_collection |
Verbunddaten SWB, Lizenzfreie Online-Ressourcen |
misc_de105 |
EBOOK |
names_id_str_mv |
(DE-588)1203323360, (DE-627)1688152938 |
physical |
1 Online-Ressource (2 Files) |
publishDate |
2021-02-17 |
publishDateSort |
2021 |
publishPlace |
Heidelberg |
publisher |
Universität |
record_format |
marcfinc |
record_id |
1748601830 |
recordtype |
marcfinc |
rvk_facet |
No subject assigned |
source_id |
0 |
spelling |
Daza, Angel 1989- VerfasserIn (DE-588)1203323360 (DE-627)1688152938 aut, X-SRL dataset and mBERT word aligner Angel Daza, Heidelberg Universität 2021-02-17, 1 Online-Ressource (2 Files), Text txt rdacontent, Computerdaten cod rdacontent, Computermedien c rdamedia, Online-Ressource cr rdacarrier, Kind of data: Program source code, Gesehen am 18.02.2021, This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of the sentence) by transferring the label into the best-aligned target word. This newly labeled data can be used to train different multilingual SOTA models to improve performance, especially for the lower-resource languages., Forschungsdaten (DE-588)1098579690 (DE-627)857755366 (DE-576)469182156 gnd-content, Datenbank (DE-588)4011119-2 (DE-627)106354256 (DE-576)208891943 gnd-content, Forschungsdaten zu Daza, Angel, 1989 - X-SRL 2020 (DE-627)1748602551, https://doi.org/10.11588/data/HVXXIJ Verlag Resolving-System lizenzpflichtig Volltext, https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ Verlag kostenfrei Volltext, https://doi.org/10.11588/data/HVXXIJ LFER, https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ LFER, LFER 2021-03-09T06:32:08Z |
spellingShingle |
Daza, Angel, X-SRL dataset and mBERT word aligner, This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of the sentence) by transferring the label into the best-aligned target word. This newly labeled data can be used to train different multilingual SOTA models to improve performance, especially for the lower-resource languages., Forschungsdaten, Datenbank |
title |
X-SRL dataset and mBERT word aligner |
title_auth |
X-SRL dataset and mBERT word aligner |
title_full |
X-SRL dataset and mBERT word aligner Angel Daza |
title_fullStr |
X-SRL dataset and mBERT word aligner Angel Daza |
title_full_unstemmed |
X-SRL dataset and mBERT word aligner Angel Daza |
title_short |
X-SRL dataset and mBERT word aligner |
title_sort |
x srl dataset and mbert word aligner |
topic |
Forschungsdaten, Datenbank |
topic_facet |
Forschungsdaten, Datenbank |
url |
https://doi.org/10.11588/data/HVXXIJ, https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/HVXXIJ |