Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation

Video Lectures

ACCURAT one of the organisers of the Fifth Workshop on Building and Using Comparable Corpora

The 5th Workshop on Building and Using Comparable Corpora was held on 2012-05-26 as a full-day post-conference workshop of LREC2012 in Istanbul, Turkey. The ACCURAT project was one of the organising projects, and ACCURAT partners Marko Tadić and Andrejs Vasiļjevs were members of the workshop Organising Committee; Marko Tadić was also one of the editors of the Proceedings. The whole workshop was video recorded, so the presentations can be watched on-line:

Philipp Petrenz, Bonnie Webber: Robust Cross-Lingual Genre Classification through Comparable Corpora

Qian Yu, François Yvon, Aurélien Max: Revisiting sentence alignment algorithms for alignment visualization and evaluation

Inguna Skadiņa: Analysis and Evaluation of Comparable Corpora for Under-Resourced Areas of Machine Translation

Andrejs Vasiļjevs: LetsMT! – Platform to Drive Development and Application of Statistical Machine Translation

Núria Bel, Vassilis Papavasiliou, Prokopis Prokopidis, Antonio Toral, Victoria Arranz: Mining and Exploiting Domain-Specific Corpora in the PANACEA Platform

Adam Kilgarriff, George Tambouratzis: The PRESEMT Project

Béatrice Daille: Building Bilingual Terminologies from Comparable Corpora: The TTC TermSuite

Aimée Lahaussois, Séverine Guillaume: A viewing and processing tool for the analysis of a comparable corpus of Kiranti mythology

Nancy Ide: MultiMASC: An Open Linguistic Infrastructure for Language Research

Poster Booster Session and Poster Booster Session 2 including:

Elena Irimia: Experimenting with Extracting Lexical Dictionaries from Comparable Corpora for: English-Romanian language pair

Iustina Ilisei, Diana Inkpen, Gloria Corpas, Ruslan Mitkov: Romanian Translational Corpora: Building Comparable Corpora for Translation Studies

Angelina Ivanova: Evaluation of a Bilingual Dictionary Extracted from Wikipedia

Quoc Hung-Ngo, Werner Winiwarter: A Visualizing Annotation Tool for Semi-Automatical Building a Bilingual Corpus

Lene Offersgaard, Dorte Haltrup Hansen: SMT systems for less-resourced languages based on domain-specific data

Magdalena Plamada, Martin Volk: Towards a Wikipedia-extracted Alpine Corpus

Sanja Štajner, Ruslan Mitkov: Using Comparable Corpora to Track Diachronic and Synchronic Changes in Lexical Density and Lexical Richness

Dan Ştefănescu: Mining for Term Translations in Comparable Corpora

George Tambouratzis, Michalis Troullinos, Sokratis Sofianopoulos, Marina Vassiliou: Accurate phrase alignment in a bilingual corpus for EBMT systems

Kateřina Veselovská, Ngãy Giang Linh, Michal Novák: Using Czech-English Parallel Corpora in Automatic Identification of It

Manuela Yapomo, Gloria Corpas, Ruslan Mitkov: CLIR- and Ontology-Based Approach for Bilingual Extraction of Comparable Documents

Amir Hazem, Emmanuel Morin: ICA for Bilingual Lexicon Extraction from Comparable Corpora

Hiroyuki Kaji, Takashi Tsunakawa, Yoshihiro Komatsubara: Improving Compositional Translation with Comparable Corpora

Nikola Ljubešić, Špela Vintar, Darja Fišer: Multi-word term extraction from comparable corpora by combining contextual and constituent clues

Robert Remus, Mathias Bank: Textual Characteristics of Different-sized Corpora

Closing of the workshop

| 2012-06-28 |

ACCURAT and LetsMT! projects organized a joint workshop Customized Machine Translation: Platform, Tools and Application: LetsMT! cloud platform and ACCURAT tools

The ACCURAT project, jointly with the LetsMT! project, organized the workshop Customized Machine Translation: Platform, Tools and Application: LetsMT! cloud platform and ACCURAT tools as an accompanying event to the GALA2012 conference. It was held in Monte Carlo on 2012-03-25. The workshop was aimed at professional users in localisation and language technology, and presented the latest developments of the ACCURAT tools and their application in the localisation scenario. The presenters were partners from both projects, and the invited speaker was Achim Ruopp from Digital Silk Road. You can read more on the workshop web page. The whole workshop was video recorded, so the presentations can be watched on-line:

Andrejs Vasiļjevs (Tilde): The Quest for Better MT

Achim Ruopp (Digital Silk Road): Modern Ubiquitous Machine Translation – Threat or Opportunity?

Indra Sāmīte (Tilde): LetsMT! and ACCURAT at Your Service

Marko Tadić (Univ. of Zagreb, FFZG): ACCURAT Toolkit = More Data

Raivis Skadiņš (Tilde): LetsMT! Do-it-Yourself Demo

Gregor Thurmair (Linguatec): Creating Lexicon Entries for Narrow Domains from Comparable Corpora

Mateja Verlič (Zemanta): Using SMT in the Blogging Environment

Andrejs Vasiļjevs (Tilde): Real World Evaluation of SMT in Localization

Panel discussion: Customized MT: Truth or Myth?

| 2012-04-12 |

ACCURAT project presented at the 2nd Slavic Corpora Conference, SlaviCorp2011

The ACCURAT project was presented in two papers at the Second Conference on Slavic Corpora (SlaviCorp2011), which was held from 2011-09-12 to 2011-09-14 in Dubrovnik, Croatia.

The first presentation, by Željko Agić, Daša Berović, Danijela Merkler and Marko Tadić, Development and Applications of the Croatian 1984 Corpus for the MULTEXT-East Resources, was given by Željko Agić.

The second presentation, by Nikola Ljubešić and Tomaž Erjavec, hrWaC and slWaC: Web Corpora for Croatian and Slovene, was given by Nikola Ljubešić.

| 2010-10-12 |

ACCURAT project presented at the FLaReNet Forum 2011

In Venice, Italy, on 2011-05-26 and 2011-05-27 the FLaReNet Forum 2011 was organized by the FLaReNet project. The Forum was organised in six main sessions with up to five panelists and a couple of discussants from the audience.

The ACCURAT project was presented by Andrejs Vasiļjevs, a panelist in Session S4 "Innovation needs data", with the presentation How to Get More Data for Under-resourced Languages and Domains.

| 2011-06-08 |

ACCURAT at the W3C Workshop: Content on the Multilingual Web

The W3C Workshop: Content on the Multilingual Web was held in Pisa, Italy, on 2011-04-04 and 2011-04-05. The ACCURAT project was presented on 2011-04-05 by Andrejs Vasiļjevs with the presentation Bridging technological gap between smaller and larger languages. The whole presentation can be watched on the VideoLectures.NET pages.

| 2011-06-08 |

ACCURAT project presented at the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages (FASSBL7)

The ACCURAT project was presented in three papers at the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages (FASSBL7), which was held from 2010-10-04 to 2010-10-06 in Dubrovnik, Croatia.

The first presentation, by Radu Ion, Dan Tufiş, Tiberiu Boroş, Alexandru Ceauşu and Dan Ştefănescu, On-line Compilation of Comparable Corpora and their Evaluation, was given by Radu Ion.

The second presentation, by Kristina Vučković, Željko Agić and Marko Tadić, Sentence Classification and Clause Detection for Croatian, was given by Željko Agić.

The third presentation, by Krešimir Šojat, Željko Agić and Marko Tadić, Verb Valency Frame Extraction Using Morphological and Syntactic Features of Croatian, was given by Krešimir Šojat.

| 2010-10-12 |

ACCURAT project presented at the Workshop on Methods for the automatic acquisition of Language Resources and their evaluation methods

The ACCURAT project was presented in a lecture at the Workshop on Methods for the automatic acquisition of Language Resources and their evaluation methods, which was held on 2010-05-23 as one of the satellite workshops of the Language Resources and Evaluation Conference (LREC2010) in Malta.

The lecture, ACCURAT: Metrics for the evaluation of comparability of multilingual corpora, was given by Andrejs Vasiļjevs.

| 2010-06-30 |

ACCURAT project presented at the 3rd Workshop on Building and Using Comparable Corpora (BUCC)

The ACCURAT project was presented in two lectures at the 3rd Workshop on Building and Using Comparable Corpora (BUCC), which was held on 2010-05-22 as one of the satellite workshops of the Language Resources and Evaluation Conference (LREC2010) in Malta.

The first lecture, Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation, was given by Inguna Skadiņa.

The second lecture, Improving Machine Translation Performance Using Comparable Corpora, was given by Andreas Eisele.

| 2010-06-30 |

VideoLectures.NET is a free and open-access repository of educational video lectures. The lectures are given by distinguished scholars and scientists at prominent events such as conferences, summer schools, workshops and science promotional events from many fields of science. The portal aims to promote science, exchange ideas and foster knowledge sharing by providing high-quality didactic content not only to the scientific community but also to the general public. All lectures, accompanying documents, information and links are systematically selected and classified through an editorial process that also takes users' comments into account.

The training materials are being developed within the FP5, FP6 and FP7 European Framework Programmes, where the web-based portal VideoLectures.NET is used as an educational platform by several EU-funded research projects, such as PASCAL NoE, ECOLEAD NoE and SEKT IP, and by other organizations, among them MIT OpenCourseWare and CERN. The countries and regions involved, and the languages used, range across Europe, the USA, Taiwan, Australia, Ukraine, Russia and Brazil.

The portal is becoming a major reference repository of video training material and a dissemination channel for academic researchers around the world. To network with similar initiatives, new frameworks and plans are being prepared with the aim of building an e-science video reference network of universities and research institutes that provides a high-quality stream of scientific and training programmes.

VideoLectures.NET was founded in 2001 as an internally funded project and is now run by the dedicated Center for Transfer in Information Technologies at the Jožef Stefan Institute, Ljubljana, Slovenia.

The ACCURAT project will try to record every presentation and demonstration of the project given by its members at events such as conferences, workshops and presence days. Our repository of video lectures will also be available on the project web pages as well as on VideoLectures.NET.

| 2010-03-04 |