About: The online recruitment industry holds a large amount of user-generated content in the form of job postings, resumes, etc. This content finds its way into knowledge bases (KBs), causing duplicate and non-standard representations of entities (such as company names, institute names, designations, and skills). These non-standard entity representations impact various applications such as search, recommendations, and information retrieval. Therefore, KB canonicalization, i.e., mapping multiple references to the same entity into unique clusters, is imperative for online recruitment platforms. Research suggests various approaches that use enriched semantic context or external context (from sources like Freebase) to perform KB canonicalization. In fields where such external sources of context do not exist, the problem remains challenging. To address these challenges, we propose a novel deep Siamese architecture with character-based attention and word embeddings that (a) estimates pairwise similarity between all entity mentions, and (b) then uses these similarity scores to create canonical clusters, each representing a unique entity in the KB. Our experiments on a recruitment-domain dataset comprising 62,288 unique entities of various types, such as companies, institutes, skills, and designations, demonstrate the effectiveness of our approach. We also provide insights into different network architectures, each of which encapsulates a different set of variations while performing canonicalization.
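
The two-step pipeline in the abstract, (a) pairwise similarity and (b) clustering on the scores, can be illustrated with a minimal Python sketch. This is not the authors' method. The `similarity` function is a stand-in that scores character-trigram overlap in place of the Siamese network, and the abstract does not say how scores are turned into clusters, so the threshold-plus-union-find grouping, the names `char_ngrams`, `similarity`, and `canonical_clusters`, the threshold of 0.4, and the sample mentions are all assumptions made for illustration.

```python
from itertools import combinations


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower().strip()} "
    return {padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))}


def similarity(a, b):
    """Stand-in scorer (assumption): Jaccard overlap of character trigrams.

    The paper uses a learned Siamese network here; this only mimics the
    interface of step (a): two mentions in, one score in [0, 1] out.
    """
    grams_a, grams_b = char_ngrams(a), char_ngrams(b)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def canonical_clusters(mentions, threshold=0.4):
    """Step (b), assumed form: merge mentions whose score reaches the threshold."""
    parent = list(range(len(mentions)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Score every unordered pair of mentions and link the similar ones.
    for i, j in combinations(range(len(mentions)), 2):
        if similarity(mentions[i], mentions[j]) >= threshold:
            parent[find(i)] = find(j)

    # Collect mentions by cluster root, keeping the input order.
    clusters = {}
    for index, mention in enumerate(mentions):
        clusters.setdefault(find(index), []).append(mention)
    return list(clusters.values())


# Hypothetical company-name mentions, not taken from the paper's dataset.
mentions = ["Infosys Ltd", "Infosys Limited", "infosys ltd.", "Oracle"]
for cluster in canonical_clusters(mentions):
    print(cluster)
```

With these inputs the three Infosys variants fall into one cluster and "Oracle" stays alone. Linking through union-find makes the grouping transitive, so one spurious high score can merge two otherwise distinct clusters. Scoring all pairs is also quadratic in the number of mentions, which would likely need blocking or candidate pruning at the scale of 62,288 entities.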


An Entity of Type : fabio:Abstract, within Data Space : covidontheweb.inria.fr associated with source document(s)

Attributes	Values
type	abstract
value	The online recruitment industry holds a large amount of user-generated content in the form of job postings, resumes, etc. This content finds its way into knowledge bases (KBs), causing duplicate and non-standard representations of entities (such as company names, institute names, designations, and skills). These non-standard entity representations impact various applications such as search, recommendations, and information retrieval. Therefore, KB canonicalization, i.e., mapping multiple references to the same entity into unique clusters, is imperative for online recruitment platforms. Research suggests various approaches that use enriched semantic context or external context (from sources like Freebase) to perform KB canonicalization. In fields where such external sources of context do not exist, the problem remains challenging. To address these challenges, we propose a novel deep Siamese architecture with character-based attention and word embeddings that (a) estimates pairwise similarity between all entity mentions, and (b) then uses these similarity scores to create canonical clusters, each representing a unique entity in the KB. Our experiments on a recruitment-domain dataset comprising 62,288 unique entities of various types, such as companies, institutes, skills, and designations, demonstrate the effectiveness of our approach. We also provide insights into different network architectures, each of which encapsulates a different set of variations while performing canonicalization.
subject	Concepts in logic; Information retrieval; Natural language processing; User interfaces; Knowledge bases; Computing terminology; Language modeling; Text user interface
part of	Canonicalizing Knowledge Bases for Recruitment Domain
is abstract of	Canonicalizing Knowledge Bases for Recruitment Domain
is hasSource of	covid:ann/target/6b230d8e9ba8118f492ea7ad4542c65f3cf89bd1 covid:ann/target/cacf7a8eb7d367485bd3cdff309cfe3c00fa055b covid:ann/target/abf26ee49bf6cc55f1fb010699bff00f9e6d7306 covid:ann/target/095b30dbaee6350cf9dd2dc11267daf5709c472f covid:ann/target/1a4c7328a72cb8c3b1de134bcdc6ca8125cee0a0 covid:ann/target/5b26a540010890de0734922cea8777eb52654de5 covid:ann/target/bfa3ed8a9c31460d318119ed16bd01e7b17e125c covid:ann/target/c392f3cb8049d20007771c039525c805f497323f covid:ann/target/b1146c9539d40c6a74d4c1844577e6d466a3b45a covid:ann/target/ccaf2b6e77197c8fb6f56a2014dd4f9fe289df4b covid:ann/target/e9bde0d69c1a5d6295c2065544b4d14c2e69d62d covid:ann/target/2dc0e21df7c20488165d60e948d4285c42aeba71 covid:ann/target/6a2cb975256aadcf58e3e9a18688ef2a964621ed covid:ann/target/8fef74898e51868978285663d66b54da99292efe covid:ann/target/eadb9d571ef8e76a5e49a29fe4c2e2ba903f8351 covid:ann/target/752e73a57d48143c8e9db5c30e022d35f829471d covid:ann/target/b5531182a41d88011ff18c7bf622e30a823de2e1 covid:ann/target/292745d9c3209b93411bab17e60eae4404589f08 covid:ann/target/05acb43adcc267ed95388e14af7f7d1cad457288 covid:ann/target/6887d952be49de334412c08787e3545305e7f9d6 covid:ann/target/6dae2c2c86c7173d69b8c65840c280276fb58c12 covid:ann/target/999d5b8cd632f43128be32a39b777c6d536d0786 covid:ann/target/2877c010f72e34ddbff7f580f1cb29ac2a2f4597 covid:ann/target/472f9446fb07efc529c95390d655da0ccbf4f4a1 covid:ann/target/67ac40c95cd1eac3d743c974e24b2fda0c7eecda covid:ann/target/613674b7d30c409a15361c0c8055350c43621c51 covid:ann/target/5bbcea9c1cb204105950e0f0c8f55b397ec47d9f covid:ann/target/93b0cc581a75ecee8bf9d5ad3333ae2706f34039 covid:ann/target/61e2c04f592972a60cdc4beb0ccbee58d060faf9 covid:ann/target/220cc9dd7eee0d905c1fdc0b4b5821bfe1f66db8 covid:ann/target/8250f2042baef1577f235ac2e061b6f94ef5f4ca covid:ann/target/33fdb5f95603fbd23839ac4fa86186c366890140 covid:ann/target/b9692b56f96b382403b6304e01d6b6729b31563d covid:ann/target/94d0f45f353e0e1c330a126bac313f08181e882b 
covid:ann/target/145d3b7d3cf9bc01d6e922941a2e0d5eac3995e9 covid:ann/target/2d6fe4a6488dd86651b53f23a87de709c0a470e6 covid:ann/target/42993995c2b77365b9e5f3cd53fe80482eeab18c covid:ann/target/54ba3df2a6fb65d15bb0960664672c6a42a17aee covid:ann/target/0c84ff48a72a274e14059d9152eed179c4a60e72 covid:ann/target/9b39e33e7750101b60414efc8a380e9af2c8b35f covid:ann/target/b36851ffe22eee3ef1d1160d501dcdcc74d6c33d covid:ann/target/b8e10954c8f415953c56837b8c0803a87e4e52f7 covid:ann/target/de04fc2e08d1a37595a650bdf7a0eba190215815 covid:ann/target/04ee3bdf6b1a39f3ebed68bdc65229ea48abdc94 covid:ann/target/0d39cf7cd2963c9e04834472add2e565e127b5ca covid:ann/target/36fb2d41e2eb879b1b9d303c8a46dc12d12057f5 covid:ann/target/886ae942ce5cc9ab00bf2f3a141f083c527e5e2c covid:ann/target/c1c4483e07655644e55c08dfb6b7aa9135d95d3a covid:ann/target/e2678aab54d90f6e69ded8511546be17ba907b72 covid:ann/target/a8cfaa61cea4fa9ec0626d920dd905305c0300d6 covid:ann/target/bd5eb626458c452f422a9eb5f3752c590beae4b9 covid:ann/target/8b536dbb7c9c1c5086754d146e09a91ec0aeb9f8 covid:ann/target/00de968146eb20785959c1424da08972fe627042 covid:ann/target/06c228300448dbde546600974468c181eecc18aa covid:ann/target/0da67697fb81320f43b87939ee478db699534035 covid:ann/target/6f6c2a5f928b8363af9077c8fece756ab7887b9a covid:ann/target/f2f34636b9b49d3f1a1b076d91a31e439551e10d covid:ann/target/1e47dd1d8f3b6579bac2885647f4fec082b924b1 covid:ann/target/8a2ebf69c1c30603bc6c9132b41c997dd45d6f16 covid:ann/target/82c1be13da92e5921ab1557832935771a59d9f11 covid:ann/target/09c36fac5bab7e60ff145d01bb8d4c45f836f42c covid:ann/target/1c397706b043ab83d05392b727a237572fc8ef2d covid:ann/target/36271fbaf8d7bb2bc3a9b3df445e7999571cb243 covid:ann/target/4bbb6887f690570fcc974dd7164f1191668125e5 covid:ann/target/92af734c3167f3e4a24310c098b94603349ec7c9 covid:ann/target/95eb65d1358d16cc88c3d7bdd344fa81f7e1fc50 covid:ann/target/b03c74dd1bdaae6e2707d1a73180c56d19bd2f93 covid:ann/target/bfde499cd0b694d9d472b95825c6711e34646409 
covid:ann/target/ce6bd02145b5a3641222c358832dd0128dc21ebf covid:ann/target/17d7a60b8ffaef4d63138c4bd5222266fc4e686a covid:ann/target/fbb33b9e9379129960bdf8006f1d2e00436faf51 covid:ann/target/b044920a6641d4fb409eeb051a17526bd510d4d6 covid:ann/target/dbdeb8b63fa60cca648f8ca11a61609a3ac8c5cd covid:ann/target/27decfd6c70f053b43fdf110aef390e4baf13e8f covid:ann/target/92a8a0d1e7b6b3b539c6d741e36afaa1f8184ef8 covid:ann/target/2d83eb248bde5cf18a7564bb1d0c688382bb9144 covid:ann/target/22d753951041cbf0c3776108a29fb37a425fdd1c covid:ann/target/733bb622db88d6faeed18bd11e64b5a3cc580e18 covid:ann/target/aa44264a7c3f74181fef768cadef01cd965ffd87 covid:ann/target/b3ee929edbaf4f5a36cb2696e03d8ee96b8a004c covid:ann/target/8a3943f67471dd2de421c0fdbe418a4cc97cc6c9 covid:ann/target/ff184cad2699cd878adf5974a1d24e47917a0a06 covid:ann/target/6f8682c8df53d9564a9913c019c944af11813272 covid:ann/target/4f401b3a6e104889957fe4358315d8c6a2b9ee28 covid:ann/target/62a2b5e1274f76241cfe3c6d1781edca302e047f covid:ann/target/f5a82d73c0dc5ff8c94b0061b73971f7b4a90be6 covid:ann/target/b3fbf0ae3229da0787e7af4047bfc3bf6984990c covid:ann/target/a635eb190ea67e4c4bbb9fe135b9f9ee99a5e232

