SOME IMPORTANT FEATURES OF SPACY IN NLP 🤓

Saileshvuppala
4 min read · Mar 8, 2020

Whether you're new to spaCy or just want to brush up on some NLP basics and usage details, this blog should have you covered. Each section explains one of spaCy's features in simple terms and with examples. Some sections also recap the usage guides as a quick introduction.

WHAT'S SPACY? 🤔

Basically, it's a free, open-source library for advanced natural language processing in Python.

For example, take a simple case. If you are working with a large amount of text, you will have plenty of questions: What is the text about? What do the words mean in context? Which words or texts are similar to each other? In short, you end up with any number of questions.

spaCy is designed specifically for production use and helps you build applications that process and "understand" large volumes of text. It can also be used for information extraction and to pre-process text for deep learning.

WHAT SPACY ISN'T

  1. It is not a platform.
  2. It is not research software.
  3. It is not a company.

FEATURES

  1. TOKENIZATION.
  2. PART-OF-SPEECH (POS) TAGGING.
  3. DEPENDENCY PARSING.
  4. LEMMATIZATION.
  5. SENTENCE BOUNDARY DETECTION.
  6. NAMED ENTITY RECOGNITION.
  7. SIMILARITY.
  8. TEXT CLASSIFICATION.
  9. TRAINING.
  10. SERIALIZATION.

TOKENIZATION:

Segmenting text into words.
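
As a minimal sketch of tokenization, the snippet below uses a blank English pipeline (`spacy.blank("en")`), which contains only the rule-based tokenizer and needs no model download. It assumes spaCy is installed; the example sentence is just an illustration.

```python
import spacy

# A blank English pipeline has only the tokenizer, no trained components.
nlp = spacy.blank("en")

doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

# Each item in the Doc is a Token; .text gives the original string.
tokens = [token.text for token in doc]
print(tokens)
```

Notice that "U.K." stays together as one token while "$1" is split into "$" and "1", so tokenization is more than splitting on whitespace.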

PART-OF-SPEECH (POS) TAGGING:

POS tagging assigns a word type to each token, such as verb or noun.

DEPENDENCY PARSING:

Assigning syntactic dependency labels that describe the relations between individual tokens, such as subject or object.

LEMMATIZATION:

Assigning the base form of words. For example, the base form of the word "rats" is "rat".
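
Part-of-speech tags, dependency labels and lemmas are all available as attributes on each token. The sketch below assumes spaCy v3 and that the small English model `en_core_web_sm` has been downloaded; `describe_tokens` is a helper name made up for this post, and the demo is skipped if the model is missing.

```python
import spacy


def describe_tokens(nlp, text):
    """Return (text, POS tag, dependency label, head word, lemma) per token."""
    doc = nlp(text)
    return [(t.text, t.pos_, t.dep_, t.head.text, t.lemma_) for t in doc]


try:
    # Assumption: the model was installed with
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
except OSError:
    nlp = None
    print("en_core_web_sm is not installed; skipping the demo.")

if nlp is not None:
    for row in describe_tokens(nlp, "The rats were running across the floor."):
        print(row)
```

With a trained pipeline, each row shows the word, its tag (for example NOUN or VERB), its relation to its head word, and its base form. On a blank pipeline the same attributes exist but are empty, because nothing has predicted them yet.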

NAMED ENTITY RECOGNITION:

Labelling words or phrases in a sentence as real-world objects, such as persons, companies, locations or organisations.
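
Here is a small sketch of reading the entities a pipeline has found. It assumes spaCy v3 and the `en_core_web_sm` model; `extract_entities` is a hypothetical helper written for this post, and which entities come back depends on the model you load.

```python
import spacy


def extract_entities(nlp, text):
    """Return (entity text, entity label) pairs found by the pipeline."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


try:
    # Assumption: the model was installed with
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
except OSError:
    nlp = None
    print("en_core_web_sm is not installed; skipping the demo.")

if nlp is not None:
    text = "Apple is looking at buying U.K. startup for $1 billion"
    print(extract_entities(nlp, text))
```

The labels are short codes such as ORG (organisation), GPE (country or city) or PERSON. A pipeline without an entity recognizer simply returns an empty list.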

SIMILARITY:

Uses:

  • Recommendation systems
  • Data preprocessing.

Basically, it measures how similar two words or texts are and returns a score, where 1 means the two are identical (100% similar). In the first figure the score is 0.97 because the two sentences share every word except the last one, whereas in the second figure the score is 1, showing that the two texts are the same.
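
To see where such scores come from, here is a toy sketch. spaCy compares the word vectors of the two texts, so the snippet gives a blank pipeline a few made-up 3-dimensional vectors (an assumption for illustration only; real vectors come from a pipeline that ships with them, such as `en_core_web_md`) and then compares two sentences that differ only in the last word.

```python
import numpy as np
import spacy

nlp = spacy.blank("en")

# Made-up vectors, for illustration only. Real vectors have hundreds of dimensions.
toy_vectors = {
    "I": [1.0, 0.0, 0.0],
    "like": [0.0, 1.0, 0.0],
    "cats": [0.0, 0.0, 1.0],
    "dogs": [0.0, 0.3, 0.9],
}
for word, values in toy_vectors.items():
    nlp.vocab.set_vector(word, np.asarray(values, dtype="float32"))

doc_a = nlp("I like cats")
doc_b = nlp("I like dogs")

same = doc_a.similarity(nlp("I like cats"))  # identical texts
close = doc_a.similarity(doc_b)              # only the last word differs
print(same, close)
```

The identical pair scores 1, and the pair that differs in one word scores a little below 1, which is the same pattern as in the two figures. The exact number depends entirely on the vectors used.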

SENTENCE SEGMENTATION:

Finding and segmenting individual sentences.
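
A minimal sketch of sentence segmentation, assuming spaCy v3. Trained pipelines usually find sentence boundaries with the dependency parser; this example swaps in the simpler rule-based `sentencizer` component, which splits on punctuation and works on a blank pipeline.

```python
import spacy

nlp = spacy.blank("en")
# Rule-based sentence splitter; no trained model required.
nlp.add_pipe("sentencizer")

doc = nlp("This is the first sentence. Here is another one. And a third!")

# doc.sents yields one Span per detected sentence.
sentences = [sent.text for sent in doc.sents]
print(sentences)
```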

WORD VECTOR REPRESENTATION:

LIBRARY ARCHITECTURE:

TRAINING:

spaCy's models are statistical, and every "decision" they make (for example, which part-of-speech tag to assign, or whether a word is a named entity) is a prediction. This prediction is based on the examples the model has seen during training. To train a model, you first need training data: examples of text, and the labels you want the model to predict. This could be a part-of-speech tag, a named entity or any other information.
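
The idea of "text plus labels" can be sketched with a tiny training loop. This assumes the spaCy v3 training API (`Example`, `nlp.initialize`, `nlp.update`); the two sentences and their entity offsets are made up for illustration, and two examples are far too few to train anything useful.

```python
import random

import spacy
from spacy.training import Example

# Training data: text plus the labels we want the model to predict.
# Each entity is (start character, end character, label).
TRAIN_DATA = [
    ("Apple is opening a store in Paris",
     {"entities": [(0, 5, "ORG"), (28, 33, "GPE")]}),
    ("Google hired engineers in London",
     {"entities": [(0, 6, "ORG"), (26, 32, "GPE")]}),
]

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
for _, annotations in TRAIN_DATA:
    for _, _, label in annotations["entities"]:
        ner.add_label(label)

# Pair each unannotated Doc with its gold-standard annotations.
examples = [
    Example.from_dict(nlp.make_doc(text), annotations)
    for text, annotations in TRAIN_DATA
]

optimizer = nlp.initialize(lambda: examples)

for epoch in range(10):
    random.shuffle(examples)
    losses = {}
    # One update step: predict, compare with the labels, adjust the weights.
    nlp.update(examples, sgd=optimizer, losses=losses)

print(losses)
```

Each pass compares the model's predictions with the labels and nudges the weights, which is why the quality and quantity of the training data matter so much.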

SPACY ADVANTAGES:

spaCy provides very fast and accurate syntactic analysis (reported to be the fastest of any released library at the time of writing), and also offers named entity recognition and ready access to word vectors. You can use the default word vectors, or replace them with any you have.

THAT'S IT FROM MY SIDE ON SPACY!! MORE TO COME SOON 🤘

THANK YOU!! KEEP SMILING 😀
