About: In this paper, we propose an approach to the automatic restoration of Arabic diacritics that comprises three components stacked in a pipeline: a deep learning model, which is a multi-layer recurrent neural network with LSTM and Dense layers; a character-level rule-based corrector, which applies deterministic operations to prevent some errors; and a word-level statistical corrector, which uses context and distance information to fix some diacritization issues. The approach is novel in that it combines methods of different types and adds edit-distance-based corrections. We used a large public dataset of raw diacritized Arabic text (Tashkeela) for training and testing our system, after cleaning and normalizing it. On a newly released benchmark test set, our system outperformed all the tested systems, achieving a diacritic error rate (DER) of 3.39% and a word error rate (WER) of 9.94% when taking all Arabic letters into account, and a DER of 2.61% and a WER of 5.83% when ignoring the diacritization of the last letter of every word.
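
The two metrics reported in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation script: the function names `split_word` and `der_wer`, the `ignore_last` flag, and the choice to treat every non-diacritic character as a letter are assumptions made for illustration. DER is taken here as the share of letters whose diacritics differ from the reference, and WER as the share of words containing at least one such letter.

```python
# Illustrative sketch only; names and simplifications are assumptions,
# not the evaluation code used in the paper.

# Arabic diacritic marks from fathatan (U+064B) to sukun (U+0652).
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}


def split_word(word):
    """Return a list of (letter, diacritics) pairs for one word."""
    pairs = []
    for ch in word:
        if ch in DIACRITICS and pairs:
            letter, marks = pairs[-1]
            pairs[-1] = (letter, marks + ch)  # attach mark to previous letter
        elif ch not in DIACRITICS:
            pairs.append((ch, ""))
    return pairs


def der_wer(reference, hypothesis, ignore_last=False):
    """Compute (DER, WER) for two diacritized strings with identical letters.

    With ignore_last=True, the last letter of every word is excluded,
    mirroring the second setting mentioned in the abstract.
    """
    ref_words, hyp_words = reference.split(), hypothesis.split()
    if len(ref_words) != len(hyp_words):
        raise ValueError("reference and hypothesis must have the same word count")

    letters = letter_errors = word_errors = 0
    for ref_word, hyp_word in zip(ref_words, hyp_words):
        ref_pairs, hyp_pairs = split_word(ref_word), split_word(hyp_word)
        if [l for l, _ in ref_pairs] != [l for l, _ in hyp_pairs]:
            raise ValueError("words differ in their base letters")
        if ignore_last:
            ref_pairs, hyp_pairs = ref_pairs[:-1], hyp_pairs[:-1]
        # Compare marks as sorted lists so that mark order (e.g. shadda
        # before or after a vowel) does not count as an error.
        wrong = sum(sorted(r[1]) != sorted(h[1])
                    for r, h in zip(ref_pairs, hyp_pairs))
        letters += len(ref_pairs)
        letter_errors += wrong
        word_errors += wrong > 0

    der = letter_errors / letters if letters else 0.0
    wer = word_errors / len(ref_words) if ref_words else 0.0
    return der, wer


# Two words; the hypothesis gets only the final mark of the first word wrong.
ref = "\u0643\u064E\u062A\u064E\u0628\u064E \u0639\u0650\u0644\u0652\u0645"
hyp = "\u0643\u064E\u062A\u064E\u0628\u064F \u0639\u0650\u0644\u0652\u0645"
print(der_wer(ref, hyp))
print(der_wer(ref, hyp, ignore_last=True))
```

Because the only error in this example falls on a word-final letter, it disappears when `ignore_last=True`, which is the same effect that makes the second pair of figures in the abstract lower than the first.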


An Entity of Type : fabio:Abstract, within Data Space : covidontheweb.inria.fr associated with source document(s)

Attributes	Values
type	abstract
subject	Deep learning; Emerging technologies; Artificial intelligence; Arabic words and phrases; Phonetic guides; Punctuation; Similarity and distance measures; Artificial neural networks; Typography; String similarity measures; Diacritics; Arabic diacritics; Quranic orthography
part of	Multi-components System for Automatic Arabic Diacritization
is abstract of	Multi-components System for Automatic Arabic Diacritization
is hasSource of	covid:ann/target/1d0258d6bc0cc08ce7f018e3981c310bbdd7c23d covid:ann/target/736cd77c302784f10e3d6ca8a22a65970da40933 covid:ann/target/b1b5362d3674d64a68dc55c37d7978a5a9d10a93 covid:ann/target/a49b70ecba5d81ae9514e285c01ae8eea2c171fb covid:ann/target/32cb4a22b93bee4ceb17f38f9e696439abb0f748 covid:ann/target/7cf05d65ec4b8fa957c537abcaf60c8534956d9b covid:ann/target/14a41662f73d96d519b1e6fbe4305b614ff3781e covid:ann/target/4c4f6c4ad74f90abfc2b84971d91e54ac2ea5c74 covid:ann/target/a182dbd7b40a53a730cc76f3019d78d3bf1898db covid:ann/target/4f922c40509ff56256ddb8dc8b2761dc84453b94 covid:ann/target/2d662e52f742e58a9480c49649b382d9c0e8b94c covid:ann/target/7004177dbda2969d057523e490f307bea42e6447 covid:ann/target/bd2f18213f425c042b08b9a0482829f2b5f4f49a covid:ann/target/746e0bd52d0fd30ff55c082440ae6e0c7efb8a52 covid:ann/target/a81d6e0d5634ad656a6fe794ff09f9ddacc0a2eb covid:ann/target/4aff571025724ffebf1494f034a27b3081a1fe5e covid:ann/target/487d3d3782da3de003cbf72c2ec424c875bd5270 covid:ann/target/94de546169cd4fe47bfeffb0948c377e57d17943 covid:ann/target/06e58e38a539585627cbd253cea7b351ac9607d2 covid:ann/target/2fa5a74533937c99a15fc38e8853cce698f325ad covid:ann/target/6cb58945465f0fae5216233c003db3e5addca3e5 covid:ann/target/79f0da5e0b4445eed6a137045193f4bb6d98172c covid:ann/target/bbb7ed49b7a52aa59e032094c88e9bae89babe1d covid:ann/target/ca41ebedb96726d0c0a76342ab85d4eed8b84d3a covid:ann/target/159525ff226fb44523753dda3febb9cfb6cb9cdf covid:ann/target/ccfd7449d72bb5460b9ce033e4e76e851d1937b3 covid:ann/target/126ee2a0a085d3be79264a4fce82993412f99fa7 covid:ann/target/a0b28249ee878bd9407bef8df19288f0dc32837b covid:ann/target/e31e846d435ae18d7ad75b0cbfea4e660a4eacb0 covid:ann/target/d3986652f4c5b7fe119b37170fb7d9a533154a7d covid:ann/target/47fd768592aba19ab7d39a59c15f2c408f071d25 covid:ann/target/c52e27f2427228b92b7e63a9d27c6c112fc27e31 covid:ann/target/ebf56721d553c949e4bde21d2995b4653c2fb691 covid:ann/target/972b1d72bef70e93b647a9d312babe7b69c36140 
covid:ann/target/affc5a3b9008cc334fc7dc7921eb2c739304964e covid:ann/target/6fd7d2d43ab147cda265efa76377aa18f4a7e887 covid:ann/target/73f43ac07d70cde9b3b4d95264b4ada7e224bc3b covid:ann/target/4e913312ff2783d150b58b5459b27ca0f3cf7a73 covid:ann/target/775c6592c9d04bb92785f194f94bd48e53baf223 covid:ann/target/9277c7efed0066e4665bb05f183ce9d26a4a3140 covid:ann/target/b205edb9e41de75897c7d7368bd0af7c76472764 covid:ann/target/c73fc23956ebe50147eca84d23f85e11de78d009 covid:ann/target/4821b62c964bfcadf3473953179653b89bd057d0 covid:ann/target/1a27d00621fa79ce378db82744b8fcad022b1690 covid:ann/target/1dde03e6e5ca3967548a7e068dd4c4f7c97fa2b9 covid:ann/target/c28944bc5bdfb7da47d45488d2693062a1b36d80 covid:ann/target/3a7cdc114658091fbdf3d10a2a6160e3af1949ca covid:ann/target/73c08572acf84a5688bf55ff2fe8deabec211913 covid:ann/target/73cee1f41ab0eeee21a8ba3f6ad61165c668457c covid:ann/target/7e7e5b318404ab75d7d7eacf500e80ef6b2952b9 covid:ann/target/0eb1c5a6ba716731211d7619fac4ed4a5f958ade covid:ann/target/270bc41febb84816257398d00c5ba8726e201bf8 covid:ann/target/7dd199e9c29dc3d4cafb037069607113db70cbbd covid:ann/target/d35b29672b25dc5b5e35a1093cc3ffb47a484bc7 covid:ann/target/968360bd4a90e7446d15ee05234e0e73bb0c5b25 covid:ann/target/f5a56c2364783ee4c13c3f378747e587503af9ef covid:ann/target/145acca192729a9ad874d99cce62f5b4f276ab55 covid:ann/target/5a6fa51d819f01a373894e6e81c7caa4dd0baa15 covid:ann/target/5b14a94e1ecbed2e3c176f0c80920b43a1658d46 covid:ann/target/a33b9bf0bcc8f74103a83edc34174dafeeebb993 covid:ann/target/c5ea4f7759e97d7794babea907eeac0607dc2d47 covid:ann/target/cf0d5815011319c07d6234b0083cd3e559fbfcaf covid:ann/target/7f35604fbfc5eef67842d9a2b8a6c1e47da311d6 covid:ann/target/2cf83c84624640b0260176c663a5fd33e5f3f22b covid:ann/target/462a1e741279c91dd1974571c2b53e570da21972 covid:ann/target/92bed61b143c14df1994294d207f63dbe3926b12 covid:ann/target/140a5ecf38fdfdbc24cee1dc4185d315794be999 covid:ann/target/329a9ec3e42939f6c7f73ff12acc5f29a53dcf83 
covid:ann/target/57172cb6545bb6bdbb1324aeee70f70f69dfdf4e covid:ann/target/8c9408e89baca270b606d297ff5685f87f9a0a0b covid:ann/target/ee7c6d925750859ea036598471d19794bed638dc covid:ann/target/a3247bfde6d68e1370a864315e26a66e4957ec04 covid:ann/target/90059207256c7a4de297836c9d5986c2c70bbe62
