released:: 05/20
excerpt::
dataset/bio/description::
dataset/bio/comment::
Assigning each entity mention a semantic type can cut the candidate label set by about 10x, which improves entity linking. Reported gain: 6-10 AUC points on entity disambiguation
Disambiguation methods can be used for both NER and Entity Linking
UMLS has 127 semantic types. The authors group them into 24 semantic groups (see Table 2)
The authors create the WikiMed (650k mentions normalized to UMLS) and PubMedDS datasets and pre-train MedType on them before fine-tuning on medical data
Open information extraction with deep learning generates a lot of entity candidates
Context Encoder: input is the mention plus its context window (neighboring tokens); output is the semantic type
Using the semantic types that come with UMLS, the set of candidate labels can be reduced by an order of magnitude
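A minimal sketch of the type-based pruning idea in the notes above, not MedType's actual code. The concept IDs, the type labels and the function name `filter_candidates` are all hypothetical placeholders, and the fallback to the unfiltered list when nothing matches is an assumption of this sketch.

```python
from typing import Dict, List, Set

# Hypothetical toy data: placeholder concept IDs and type labels, not real UMLS entries.
CANDIDATE_TYPES: Dict[str, Set[str]] = {
    "C_cold_disease": {"Disorders"},
    "C_cold_temperature": {"Phenomena"},
    "C_cold_sensation": {"Physiology"},
}


def filter_candidates(
    candidates: List[str],
    predicted_types: Set[str],
    candidate_types: Dict[str, Set[str]],
) -> List[str]:
    """Keep only candidates whose semantic types overlap the predicted types."""
    kept = [
        c for c in candidates
        if candidate_types.get(c, set()) & predicted_types
    ]
    # Assumption of this sketch: if the filter removes everything,
    # fall back to the original list instead of returning no candidates.
    return kept or candidates


candidates = list(CANDIDATE_TYPES)
pruned = filter_candidates(candidates, {"Disorders"}, CANDIDATE_TYPES)
print(pruned)
```

The pruned list is then handed to whatever entity linker is in use, so the linker only has to rank candidates of a plausible type.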
excerpt::
Heuristic Algorithm: Intuition: New terms (e.g. Self-Taught Hashing, gradient penalty) are introduced by a single paper that later gets cited a lot. Problem before this paper: extracted phrases were often too general or too narrow.
use-for:: semi-automated construction of wiki pages for science
AllenAI's ForeCite achieves a double-digit precision improvement on extracting new concepts from papers
Scientific Concept Extraction can be used for trend analysis, knowledge base construction or semi-automated encyclopedia construction for science
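A rough sketch of the citation intuition behind the heuristic noted above: a phrase looks like a new concept when most other papers that use it cite one particular paper. The scoring function below is a simplified stand-in written for illustration and is not claimed to be ForeCite's exact formula. The function names and the toy citation graph are hypothetical.

```python
import math
from typing import Dict, Optional, Set, Tuple


def concentration_score(
    paper_id: str,
    papers_with_term: Set[str],
    cites: Dict[str, Set[str]],
) -> float:
    """Score how strongly other papers using a term point back to paper_id.

    cites maps a paper ID to the set of paper IDs it cites.
    Assumed form: log(1 + n_citing) * (n_citing / n_other_papers_with_term).
    """
    others = papers_with_term - {paper_id}
    if not others:
        return 0.0
    citing = {p for p in others if paper_id in cites.get(p, set())}
    if not citing:
        return 0.0
    return math.log(1 + len(citing)) * len(citing) / len(others)


def best_origin(
    papers_with_term: Set[str],
    cites: Dict[str, Set[str]],
) -> Tuple[Optional[str], float]:
    """Return the paper most likely to have introduced the term, with its score."""
    best_id, best_score = None, 0.0
    for pid in sorted(papers_with_term):  # sorted for deterministic tie-breaking
        score = concentration_score(pid, papers_with_term, cites)
        if score > best_score:
            best_id, best_score = pid, score
    return best_id, best_score


# Hypothetical toy graph: papers B and C use the term and cite A; D cites B only.
papers_with_term = {"A", "B", "C", "D"}
cites = {"B": {"A"}, "C": {"A"}, "D": {"B"}}
origin, score = best_origin(papers_with_term, cites)
print(origin, round(score, 3))
```

A term whose usage is spread over papers that cite nothing in common gets a low score for every candidate origin, which is how overly general phrases would be filtered out under this simplified scoring.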