<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>wikifier Archives - Turbolab Technologies</title>
	<atom:link href="https://turbolab.in/tag/wikifier/feed/" rel="self" type="application/rss+xml" />
	<link>https://turbolab.in/tag/wikifier/</link>
	<description>Big Data and News Analysis Startup in Kochi</description>
	<lastBuildDate>Fri, 05 Aug 2022 14:34:41 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://i0.wp.com/turbolab.in/wp-content/uploads/2018/03/turbo_black_trans-space.png?fit=32%2C32&#038;ssl=1</url>
	<title>wikifier Archives - Turbolab Technologies</title>
	<link>https://turbolab.in/tag/wikifier/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">98237731</site>	<item>
		<title>Entity Linking &#038; Disambiguation using REL</title>
		<link>https://turbolab.in/entity-linking-disambiguation-using-rel/</link>
					<comments>https://turbolab.in/entity-linking-disambiguation-using-rel/#respond</comments>
		
		<dc:creator><![CDATA[Vasista Reddy]]></dc:creator>
		<pubDate>Tue, 12 Jul 2022 07:02:27 +0000</pubDate>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[entity linking]]></category>
		<category><![CDATA[nlp]]></category>
		<category><![CDATA[nltk]]></category>
		<category><![CDATA[rel]]></category>
		<category><![CDATA[spacy]]></category>
		<category><![CDATA[wikifier]]></category>
		<category><![CDATA[wikipedia]]></category>
		<guid isPermaLink="false">https://turbolab.in/?p=907</guid>

					<description><![CDATA[<p>Entity extraction, also known as Named Entity Recognition (NER), is an information extraction process that finds entities in unstructured text and classifies them into predefined categories such as people, organizations, places, products, dates, times, monetary amounts, phone numbers and so on. Terabytes of unstructured text from documents, web pages, and social [&#8230;]</p>
<p>The post <a href="https://turbolab.in/entity-linking-disambiguation-using-rel/">Entity Linking &amp; Disambiguation using REL</a> appeared first on <a href="https://turbolab.in">Turbolab Technologies</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><span style="font-weight: 400">Entity extraction, also known as </span><em><b>Named Entity Recognition (NER)</b></em><span style="font-weight: 400">, is an information extraction process that finds entities in unstructured text and classifies them into predefined categories such as people, organizations, places, products, dates, times, monetary amounts, phone numbers and so on. Terabytes of unstructured text from documents, web pages, and social media can thus be transformed into structured entities that help analysts query the data and generate insightful reports.</span></p>
<p><span style="font-weight: 400">spaCy provides models in various languages for NER and other NLP tasks. We explained how to build a custom NER model with spaCy in an earlier post. You can check it out</span> <strong><a href="https://turbolab.in/build-a-custom-ner-model-using-spacy-3-0/">here</a></strong>.</p>
<p><span style="font-weight: 400">Now, let’s look into the entity extraction from a random news article using spaCy and Flair:</span></p>
<blockquote><p><em>Defending champion Novak Djokovic battled back from two sets to love down to defeat Jannik Sinner and reach his 11th Wimbledon semi-final on Tuesday. Djokovic triumphed 5-7, 2-6, 6-3, 6-2, 6-2 and will face Britain&#8217;s Cameron Norrie of Belgium for a place in Sunday&#8217;s final. It was the seventh time in the Serb&#8217;s career that he had recovered from two sets to love at the Slams. &#8220;Huge congrats to Jannik for a big fight, he&#8217;s so mature for his age, he has plenty of time ahead of him,&#8221; said Djokovic.</em></p></blockquote>
<h5>Entity Extraction using spaCy:</h5>
<blockquote><p><em><strong>import spacy</strong></em></p>
<p><em><strong># content is assumed to hold the article text quoted above</strong></em><br />
<em><strong>nlp = spacy.load('en_core_web_lg') # load the large English model</strong></em></p>
<p><em><strong>ner_ent = {'person': [], 'norp': [], 'fac': [], 'org': [], 'gpe': [], 'loc': [], 'product': [], 'event': [], 'work_of_art': [], 'law': [], 'language': [], 'date': [], 'time': [], 'percent': [], 'money': [], 'quantity': [], 'ordinal': [], 'cardinal': []}</strong></em></p>
<p><em><strong>doc = nlp(content)</strong></em><br />
<em><strong>for entity in doc.ents:</strong></em><br />
<em><strong>    if entity.label_.lower() in ner_ent:</strong></em><br />
<em><strong>        ner_ent[entity.label_.lower()].append(entity.text)</strong></em></p>
<p><em><strong>print(ner_ent)</strong></em></p>
<p><em><strong># output</strong></em></p>
<p><em><strong>{&#8216;person&#8217;: [&#8216;Novak Djokovic&#8217;, &#8216;Jannik Sinner&#8217;, &#8216;Cameron Norrie&#8217;, &#8216;Jannik&#8217;, &#8216;Djokovic&#8217;, &#8216;Novak Djokovic&#8217;, &#8216;Jannik Sinner&#8217;, &#8216;Cameron Norrie&#8217;, &#8216;Jannik&#8217;, &#8216;Djokovic&#8217;], &#8216;norp&#8217;: [&#8216;Serb&#8217;, &#8216;Serb&#8217;], &#8216;fac&#8217;: [], &#8216;org&#8217;: [], &#8216;gpe&#8217;: [&#8216;Britain&#8217;, &#8216;Belgium&#8217;, &#8216;Britain&#8217;, &#8216;Belgium&#8217;], &#8216;loc&#8217;: [], &#8216;product&#8217;: [], &#8216;event&#8217;: [&#8216;Wimbledon&#8217;, &#8216;Wimbledon&#8217;], &#8216;work_of_art&#8217;: [], &#8216;law&#8217;: [], &#8216;language&#8217;: [], &#8216;date&#8217;: [&#8216;Tuesday&#8217;, &#8216;Sunday&#8217;, &#8216;Tuesday&#8217;, &#8216;Sunday&#8217;], &#8216;time&#8217;: [], &#8216;percent&#8217;: [], &#8216;money&#8217;: [], &#8216;quantity&#8217;: [], &#8216;ordinal&#8217;: [&#8217;11th&#8217;, &#8216;seventh&#8217;, &#8217;11th&#8217;, &#8216;seventh&#8217;], &#8216;cardinal&#8217;: [&#8216;two&#8217;, &#8216;5&#8217;, &#8216;2-6&#8217;, &#8216;6-3&#8217;, &#8216;6&#8217;, &#8216;6-2&#8217;, &#8216;two&#8217;, &#8216;two&#8217;, &#8216;5&#8217;, &#8216;2-6&#8217;, &#8216;6-3&#8217;, &#8216;6&#8217;, &#8216;6-2&#8217;, &#8216;two&#8217;]}</strong></em></p></blockquote>
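<p>The lists above contain every mention more than once, which makes them harder to read. As a minimal sketch, the repeats can be dropped while keeping the order in which mentions were first seen. Note that <code>dedupe_entities</code> is a hypothetical helper written for this post, not part of spaCy, and the sample dictionary is a shortened copy of the output above.</p>

```python
def dedupe_entities(ner_ent):
    """Return a copy of ner_ent with repeated mentions removed.

    dict.fromkeys keeps the first occurrence of each mention and
    preserves insertion order, so the original ordering survives.
    """
    return {label: list(dict.fromkeys(mentions)) for label, mentions in ner_ent.items()}


# Shortened sample in the same shape as the spaCy output above.
sample = {
    "person": ["Novak Djokovic", "Jannik Sinner", "Novak Djokovic", "Jannik Sinner"],
    "gpe": ["Britain", "Belgium", "Britain", "Belgium"],
    "org": [],
}

deduped = dedupe_entities(sample)
print(deduped)
```

<p>This collapses every repeated string, so it is only suitable when you need the set of distinct mentions and not their counts.</p>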
<h5>Entity Extraction using Flair:</h5>
<blockquote><p><em><strong>from flair.data import Sentence</strong></em><br />
<em><strong>from flair.models import SequenceTagger</strong></em></p>
<p><em><strong>ner_ent = {'per': [], 'org': [], 'loc': [], 'misc': []}</strong></em></p>
<p><em><strong># make a sentence from the article text held in content</strong></em><br />
<em><strong>sentence = Sentence(content)</strong></em></p>
<p><em><strong># load the NER tagger</strong></em><br />
<em><strong>tagger = SequenceTagger.load('ner')</strong></em></p>
<p><em><strong># run NER over sentence</strong></em><br />
<em><strong>tagger.predict(sentence)</strong></em></p>
<p><em><strong>print('The following NER tags are found:')</strong></em><br />
<em><strong># iterate over each entity and group its text under its label</strong></em><br />
<em><strong>for entity in sentence.get_spans('ner'):</strong></em><br />
<em><strong>    label = entity.labels[0].value.lower()</strong></em><br />
<em><strong>    if label in ner_ent:</strong></em><br />
<em><strong>        ner_ent[label].append(entity.text)</strong></em></p>
<p><em><strong>print(ner_ent)</strong></em></p>
<p><em><strong># output</strong></em></p>
<p><em><strong>The following NER tags are found:</strong></em></p>
<p><em><strong>{&#8216;per&#8217;: [&#8216;George Washington&#8217;, &#8216;Novak Djokovic&#8217;, &#8216;Jannik Sinner&#8217;, &#8216;Djokovic&#8217;, &#8216;Cameron Norrie&#8217;, &#8216;Jannik&#8217;, &#8216;Djokovic&#8217;], &#8216;org&#8217;: [], &#8216;loc&#8217;: [&#8216;Washington&#8217;, &#8216;Britain&#8217;, &#8216;Belgium&#8217;], &#8216;misc&#8217;: [&#8216;Wimbledon&#8217;, &#8216;Serb&#8217;, &#8216;Slams&#8217;]}</strong></em></p></blockquote>
<p>The default Flair <strong>ner</strong> model used here gives us only 4 entity types (PER, ORG, LOC and MISC), whereas the spaCy model gives 18 entity types.</p>
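<p>Because the two taggers use different label sets, their outputs are easier to compare once they share one scheme. The sketch below folds spaCy labels into the 4 Flair labels. The mapping itself is our assumption, chosen to match the outputs above (for example, <em>Serb</em> is NORP in spaCy and MISC in Flair); neither library ships it, and <code>to_flair_labels</code> is a hypothetical helper.</p>

```python
# Assumed mapping from lower-cased spaCy labels to the 4 Flair labels.
# Numeric and temporal labels (date, time, money, ...) are left out on
# purpose because the 4-label scheme has no counterpart for them.
SPACY_TO_FLAIR = {
    "person": "per",
    "org": "org",
    "gpe": "loc",
    "loc": "loc",
    "fac": "loc",
    "norp": "misc",
    "event": "misc",
    "product": "misc",
    "work_of_art": "misc",
    "law": "misc",
    "language": "misc",
}


def to_flair_labels(spacy_ents):
    """Regroup a spaCy-style entity dict under the 4 Flair labels."""
    merged = {"per": [], "org": [], "loc": [], "misc": []}
    for label, mentions in spacy_ents.items():
        target = SPACY_TO_FLAIR.get(label)
        if target is None:
            continue  # label has no counterpart, skip it
        merged[target].extend(mentions)
    return merged


spacy_ents = {
    "person": ["Novak Djokovic"],
    "norp": ["Serb"],
    "gpe": ["Britain", "Belgium"],
    "event": ["Wimbledon"],
    "ordinal": ["11th"],
}
merged = to_flair_labels(spacy_ents)
print(merged)
```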
<h2>Entity Linking &amp; Disambiguation</h2>
<p>Entity Linking is the process of linking entities mentioned in text to entries in a target knowledge base. Here, we map the entities to Wikipedia links or page titles, which is why the process is also called Wikification. Entity linking can also be seen as entity validation: the entities extracted by the spaCy or Flair models are validated against a third-party knowledge base.</p>
<p>However, entity linking is intricate because of entity ambiguity and name variants. For example, the word <strong>Amazon</strong> can refer to an organization or to a rainforest.</p>
<p>Let&#8217;s have a detailed discussion on Entity Linking &amp; Entity Disambiguation.</p>
<h5>News Article Clip:</h5>
<blockquote><p>Deforestation in Brazil&#8217;s Amazon rainforest reached a record high for the first six months of the year, as an area five times the size of New York City was destroyed, preliminary government data showed on Friday.</p></blockquote>
<h5>spaCy Output:</h5>
<blockquote><p>&#8216;org&#8217;: [&#8216;Amazon&#8217;], &#8216;gpe&#8217;: [&#8216;Brazil&#8217;, &#8216;New York City&#8217;]</p></blockquote>
<p>Here, <strong>Amazon</strong> is detected as an organization.</p>
<h5>Flair Output:</h5>
<blockquote><p>&#8216;loc&#8217;: [&#8216;Brazil&#8217;, &#8216;Amazon&#8217;, &#8216;New York City&#8217;]</p></blockquote>
<p><span style="font-weight: 400">Here, </span><b>Amazon</b><span style="font-weight: 400"> is detected as a location. The two models disagree, so the ambiguity problem is clearly visible here; it can be addressed with the Radboud Entity Linker (REL).</span></p>
<h5><strong>REL</strong> <strong>Output</strong>:</h5>
<p><img data-recalc-dims="1" fetchpriority="high" decoding="async" data-attachment-id="908" data-permalink="https://turbolab.in/entity-linking-disambiguation-using-rel/rel/" data-orig-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?fit=1430%2C266&amp;ssl=1" data-orig-size="1430,266" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="rel" data-image-description="" data-image-caption="&lt;p&gt;REL&lt;/p&gt;
" data-large-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?fit=800%2C148&amp;ssl=1" class="size-full wp-image-908" src="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?resize=800%2C149&#038;ssl=1" alt="" width="800" height="149" srcset="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?w=1430&amp;ssl=1 1430w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?resize=300%2C56&amp;ssl=1 300w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?resize=768%2C143&amp;ssl=1 768w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?resize=1024%2C190&amp;ssl=1 1024w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?resize=1080%2C201&amp;ssl=1 1080w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?resize=1280%2C238&amp;ssl=1 1280w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?resize=980%2C182&amp;ssl=1 980w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel.png?resize=480%2C89&amp;ssl=1 480w" sizes="(max-width: 800px) 100vw, 800px" /></p>
<p><a href="https://github.com/informagi/REL"><strong>Radboud Entity Linker (REL)</strong></a> <span style="font-weight: 400">handles both Entity Linking and Entity Disambiguation. You can use the public API provided by REL, or install it with Docker or from source by following the instructions in its documentation. By default, </span><b>REL</b><span style="font-weight: 400"> uses Flair to extract entities; you can replace Flair with spaCy. REL also provides pre-trained case-sensitive and case-insensitive models, with an F1 score of almost 93%.</span></p>
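<p>For the public API route, a request can be sketched with the Python standard library alone. The endpoint URL and the payload keys (<code>text</code> and an empty <code>spans</code> list, which we understand to mean that REL should detect mentions itself) follow the REL README as we read it; treat them as assumptions and check the documentation, since the service may change or be unavailable. The helper names <code>build_rel_request</code> and <code>link_entities</code> are ours.</p>

```python
import json
import urllib.request

# Public endpoint as listed in the REL README (assumed to be still available).
REL_API_URL = "https://rel.cs.ru.nl/api"


def build_rel_request(text, api_url=REL_API_URL):
    """Build a POST request carrying the document text as JSON.

    The empty "spans" list leaves mention detection to REL
    (our reading of the README, not verified here).
    """
    payload = json.dumps({"text": text, "spans": []}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def link_entities(text, timeout=30):
    """Send the text to REL and return the decoded JSON (needs network access)."""
    with urllib.request.urlopen(build_rel_request(text), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Only build the request here; call link_entities(...) to actually query the API.
request = build_rel_request("Deforestation in Brazil's Amazon rainforest reached a record high.")
print(request.get_method(), request.full_url)
```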
<p><a href="https://pypi.org/project/wikimapper/"><strong>Wikimapper</strong></a> <span style="font-weight: 400">is a Python library that maps Wikipedia page titles to Wikidata IDs and vice versa. It can be used to fetch the wikidata_id for a given Wikipedia title.</span></p>
<p><a href="https://github.com/facebookresearch/BLINK"><b>BLINK</b></a><span style="font-weight: 400">, the entity linking Python library from Facebook Research, uses Wikipedia as the target knowledge base, similar to </span><b>REL</b><span style="font-weight: 400">. However, the BLINK documentation does not provide any information about entity disambiguation.</span></p>
<p><a href="https://github.com/wetneb/opentapioca"><b>OpenTapioca</b></a><span style="font-weight: 400"> is a simple and fast Named Entity Linking system for Wikidata. A spaCy wrapper of OpenTapioca called</span><a href="https://spacy.io/universe/project/spacyopentapioca"> <b>spaCyOpenTapioca</b></a><span style="font-weight: 400"> is also available for the entity linking process. However, its results are not as good as REL&#8217;s.</span></p>
<p><span style="font-weight: 400">spaCy itself includes a pipeline component called</span><a href="https://spacy.io/api/entitylinker"> <b>EntityLinker</b></a><span style="font-weight: 400"> for Named Entity Linking and Disambiguation.</span></p>
<h2>Dealing with Disambiguation</h2>
<blockquote><p>Japan began the defence of their title with a lucky 2-1 win against Syria in a championship match on Friday.</p></blockquote>
<p><span style="font-weight: 400">Using the above statement, we will discuss the different approaches to choosing the appropriate entity in the case of Entity Disambiguation.</span></p>
<h5>Let&#8217;s see how <a href="https://wikifier.org/"><strong>wikifier</strong></a> deals with the disambiguation:</h5>
<p><a href="https://wikifier.org/"><strong>Wikifier</strong></a> <span style="font-weight: 400">doesn&#8217;t use an entity extraction model to find mentions; it relies on Parts of Speech (POS) instead.</span></p>
<p><img data-recalc-dims="1" decoding="async" data-attachment-id="911" data-permalink="https://turbolab.in/entity-linking-disambiguation-using-rel/wikifier1/" data-orig-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?fit=1891%2C381&amp;ssl=1" data-orig-size="1891,381" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="wikifier1" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?fit=800%2C161&amp;ssl=1" class="size-full wp-image-911 aligncenter" src="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?resize=800%2C161&#038;ssl=1" alt="" width="800" height="161" srcset="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?w=1891&amp;ssl=1 1891w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?resize=300%2C60&amp;ssl=1 300w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?resize=768%2C155&amp;ssl=1 768w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?resize=1024%2C206&amp;ssl=1 1024w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?resize=1080%2C218&amp;ssl=1 1080w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?resize=1280%2C258&amp;ssl=1 1280w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?resize=980%2C197&amp;ssl=1 980w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?resize=480%2C97&amp;ssl=1 480w, 
https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier1.png?w=1600&amp;ssl=1 1600w" sizes="(max-width: 800px) 100vw, 800px" /></p>
<p><span style="font-weight: 400">The entities Syria and Japan are linked to their respective countries’ Wikipedia pages,</span><a href="https://en.wikipedia.org/wiki/Syria"> <b>Syria</b></a><span style="font-weight: 400"> and</span><a href="https://en.wikipedia.org/wiki/Japan"> <b>Japan</b></a><span style="font-weight: 400">. In the context of the above statement, however, Japan and Syria actually refer to their football teams. Wikifier fetches all the candidate Wikipedia pages for a mention and maps the mention to the candidate with the most link targets.</span></p>
<p><img data-recalc-dims="1" decoding="async" data-attachment-id="912" data-permalink="https://turbolab.in/entity-linking-disambiguation-using-rel/wikifier2/" data-orig-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier2.png?fit=483%2C671&amp;ssl=1" data-orig-size="483,671" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="wikifier2" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier2.png?fit=483%2C671&amp;ssl=1" class="size-full wp-image-912 aligncenter" src="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier2.png?resize=483%2C671&#038;ssl=1" alt="" width="483" height="671" srcset="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier2.png?w=483&amp;ssl=1 483w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier2.png?resize=216%2C300&amp;ssl=1 216w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/wikifier2.png?resize=480%2C667&amp;ssl=1 480w" sizes="(max-width: 483px) 100vw, 483px" /></p>
<p>Wikifier takes the minLinkFrequency parameter into account when computing the score.</p>
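<p>To see why this kind of scoring picks the country pages, here is a toy sketch of linking by link counts alone. It is a simplified reading of the description above and not Wikifier&#8217;s actual algorithm, which is more involved. The counts in <code>LINK_COUNTS</code> are made up for illustration, and <code>min_link_frequency</code> only mimics the idea of ignoring rarely used links.</p>

```python
# Hypothetical numbers: how often each anchor text links to each page.
LINK_COUNTS = {
    "Japan": {
        "Japan": 9500,
        "Japan_national_football_team": 420,
        "Japan_(band)": 80,
    },
    "Syria": {
        "Syria": 7200,
        "Syria_national_football_team": 150,
    },
}


def link_by_prior(mention, link_counts, min_link_frequency=1):
    """Return the page the mention links to most often, or None.

    Targets seen fewer than min_link_frequency times are ignored,
    and the sentence around the mention plays no role at all.
    """
    candidates = {
        page: count
        for page, count in link_counts.get(mention, {}).items()
        if count >= min_link_frequency
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


japan_page = link_by_prior("Japan", LINK_COUNTS)
syria_page = link_by_prior("Syria", LINK_COUNTS, min_link_frequency=100)
unknown_page = link_by_prior("Atlantis", LINK_COUNTS)
print(japan_page, syria_page, unknown_page)
```

<p>Since the surrounding words are never consulted, the most frequently linked page always wins, even when the sentence is clearly about football.</p>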
<h5>Let&#8217;s see how REL deals with the disambiguation:</h5>
<p>In REL, entity linking decisions depend on the contextual similarity and coherence with the other entity linking decisions in the document. One entity mapping is dependent on the other entities found in the document. You can read the paper <a href="https://arxiv.org/pdf/2006.01969.pdf"><strong>here</strong></a>.</p>
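<p>The idea of combining contextual similarity with coherence can be illustrated with a toy scorer. This is emphatically not REL&#8217;s model, which relies on learned embeddings and a neural network; here both signals are reduced to keyword overlap. The candidate keyword sets are hand-picked assumptions, so the result shows the mechanism only and does not reproduce REL&#8217;s output.</p>

```python
# Hypothetical candidates, each described by a few hand-picked keywords.
CANDIDATES = {
    "Japan": {
        "Japan": {"country", "tokyo", "economy", "island"},
        "Japan_national_football_team": {"football", "match", "win", "championship", "title"},
    },
    "Syria": {
        "Syria": {"country", "damascus", "war"},
        "Syria_national_football_team": {"football", "match", "win", "championship"},
    },
}


def overlap(first, second):
    """Number of keywords two sets have in common."""
    return len(first.intersection(second))


def link_with_context(mentions, context_words, candidates):
    """Pick one entity per mention using local context plus coherence."""
    # Pass 1: contextual similarity only (overlap with the words of the text).
    choice = {
        m: max(candidates[m], key=lambda e, m=m: overlap(candidates[m][e], context_words))
        for m in mentions
    }
    # Pass 2: add coherence with the entities currently chosen for the other mentions.
    for m in mentions:
        def total(entity, m=m):
            keywords = candidates[m][entity]
            local = overlap(keywords, context_words)
            coherence = sum(
                overlap(keywords, candidates[other][choice[other]])
                for other in mentions
                if other != m
            )
            return local + coherence

        choice[m] = max(candidates[m], key=total)
    return choice


sentence = (
    "japan began the defence of their title with a lucky 2-1 win "
    "against syria in a championship match on friday"
)
result = link_with_context(["Japan", "Syria"], set(sentence.split()), CANDIDATES)
print(result)
```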
<p><img data-recalc-dims="1" loading="lazy" decoding="async" data-attachment-id="913" data-permalink="https://turbolab.in/entity-linking-disambiguation-using-rel/rel2/" data-orig-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?fit=1435%2C215&amp;ssl=1" data-orig-size="1435,215" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="rel2" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?fit=800%2C120&amp;ssl=1" class="size-full wp-image-913 aligncenter" src="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?resize=800%2C120&#038;ssl=1" alt="" width="800" height="120" srcset="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?w=1435&amp;ssl=1 1435w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?resize=300%2C45&amp;ssl=1 300w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?resize=768%2C115&amp;ssl=1 768w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?resize=1024%2C153&amp;ssl=1 1024w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?resize=1080%2C162&amp;ssl=1 1080w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?resize=1280%2C192&amp;ssl=1 1280w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?resize=980%2C147&amp;ssl=1 980w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel2.png?resize=480%2C72&amp;ssl=1 480w" sizes="(max-width: 800px) 100vw, 800px" /></p>
<p><span style="font-weight: 400">Context has little effect in this example, since only two entities are found and the content is a single sentence. Had we passed the POS output instead of the output of the entity detection method, the result might have been different.</span></p>
<p>When the entire <a href="https://www.firstpost.com/sports/fifa-world-cup-qualifiers-2022-syria-japan-secure-victories-to-make-it-to-next-round-9694971.html"><strong>article</strong></a> is passed to REL, the results are considerably better. The REL model can now use the context and relate more entities across the entire article.</p>
<p><img data-recalc-dims="1" loading="lazy" decoding="async" data-attachment-id="914" data-permalink="https://turbolab.in/entity-linking-disambiguation-using-rel/rel3/" data-orig-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?fit=1135%2C300&amp;ssl=1" data-orig-size="1135,300" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="rel3" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?fit=800%2C212&amp;ssl=1" class="size-full wp-image-914 aligncenter" src="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?resize=800%2C211&#038;ssl=1" alt="" width="800" height="211" srcset="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?w=1135&amp;ssl=1 1135w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?resize=300%2C79&amp;ssl=1 300w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?resize=768%2C203&amp;ssl=1 768w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?resize=1024%2C271&amp;ssl=1 1024w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?resize=1080%2C285&amp;ssl=1 1080w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?resize=980%2C259&amp;ssl=1 980w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/rel3.png?resize=480%2C127&amp;ssl=1 480w" sizes="(max-width: 800px) 100vw, 800px" /></p>
<p><strong>Brazil</strong> and <strong>Dutch</strong> are mapped to their respective football team wiki pages. How to get <strong>Japan</strong> mapped to its football team is still a mystery, though. LOL.</p>
<h2>Conclusion</h2>
<p><span style="font-weight: 400">Instead of going with the score of the most link targets, REL considers the context and the relationship between the entities detected in the document. With improved mention detection, REL could serve as a very capable Entity Disambiguation tool.</span></p>
<p>Last but not least, there is a tool called <a href="https://github.com/SapienzaNLP/extend"><strong>ExtEnD</strong></a> (Extractive Entity Disambiguation), which we have yet to explore. It can be added to the spaCy NLP pipeline.</p>
<p><img data-recalc-dims="1" loading="lazy" decoding="async" data-attachment-id="915" data-permalink="https://turbolab.in/entity-linking-disambiguation-using-rel/extend/" data-orig-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/extend.png?fit=665%2C178&amp;ssl=1" data-orig-size="665,178" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="extend" data-image-description="" data-image-caption="" data-large-file="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/extend.png?fit=665%2C178&amp;ssl=1" class="size-full wp-image-915 aligncenter" src="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/extend.png?resize=665%2C178&#038;ssl=1" alt="" width="665" height="178" srcset="https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/extend.png?w=665&amp;ssl=1 665w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/extend.png?resize=300%2C80&amp;ssl=1 300w, https://i0.wp.com/turbolab.in/wp-content/uploads/2022/07/extend.png?resize=480%2C128&amp;ssl=1 480w" sizes="(max-width: 665px) 100vw, 665px" /></p>
<p>The output documented by <strong>ExtEnD</strong> looks much better than the REL-generated output. Before drawing any conclusion, however, this tool needs to be explored further, as mentioned above.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://turbolab.in/entity-linking-disambiguation-using-rel/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">907</post-id>	</item>
	</channel>
</rss>
