Logo for tanaschita.com

How to identify parts of speech and names in a text with Swift

Learn to use the linguistic tagging possibilities provided by the Natural Language framework in iOS.

15 May 2023 · 2 min read

As we learned in the developer guide on the Natural Language framework, it supports linguistic tagging. Linguistic tagging is the process of assigning tags to the words of a text based on their role and function within a sentence. Part-of-speech tagging is its best-known form, but the framework can also tag other properties, such as names.

For example, we could use linguistic tagging to identify parts of speech such as verbs and nouns, or to find the names of people, places and organizations in a text.

Let's look at how it works.

The linguistic tagging functionality is provided by the Natural Language framework's NLTagger type. The tagger is initialized with one or more schemes of type NLTagScheme, which specify what kind of tags should be used for classification.

For example, we could use .tokenType as a scheme, which would classify parts of a text with the following tags: .word, .punctuation, .whitespace and .other.
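
As a minimal sketch of the .tokenType scheme, the snippet below lists every token of a text together with its tag. The function name tokenTypes(in:) is hypothetical and not part of the original article. It uses NLTagger's tags(in:unit:scheme:options:) method and passes no omit options, so whitespace and punctuation should be reported as well.

```swift
import NaturalLanguage

// Hypothetical helper (not from the article): pairs every token of the
// text with the tag assigned by the .tokenType scheme.
func tokenTypes(in text: String) -> [(token: String, tag: NLTag)] {
    let tagger = NLTagger(tagSchemes: [.tokenType])
    tagger.string = text
    // No options are passed, so whitespace and punctuation are not omitted.
    let tagged = tagger.tags(in: text.startIndex..<text.endIndex,
                             unit: .word,
                             scheme: .tokenType)
    var result: [(token: String, tag: NLTag)] = []
    for (tag, range) in tagged {
        // Skip ranges for which the tagger returned no tag.
        if let tag = tag {
            result.append((token: String(text[range]), tag: tag))
        }
    }
    return result
}

let tokenResult = tokenTypes(in: "Hello, world!")
for entry in tokenResult {
    print("\"\(entry.token)\": \(entry.tag.rawValue)")
}
```

Printing the raw value of each tag is a quick way to see how the tagger splits a sentence before deciding which tokens to filter out.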

Another example of a scheme is .lexicalClass, which classifies parts of a text with tags such as .noun, .verb, .adjective and .adverb.
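
To get a feeling for the .lexicalClass scheme, here is a rough sketch that groups the words of a sentence by the tag they receive. The function name wordsByLexicalClass(in:) and the sample sentence are assumptions made for illustration. Which tag a particular word gets depends on the language model shipped with the platform.

```swift
import NaturalLanguage

// Hypothetical helper (not from the article): groups the words of a text
// by the lexical class the tagger assigns to them.
func wordsByLexicalClass(in text: String) -> [NLTag: [String]] {
    var groups: [NLTag: [String]] = [:]
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: [.omitPunctuation, .omitWhitespace]) { tag, range in
        if let tag = tag {
            groups[tag, default: []].append(String(text[range]))
        }
        return true // keep enumerating until the end of the text
    }
    return groups
}

let lexicalGroups = wordsByLexicalClass(in: "The quick brown fox jumps over the lazy dog.")
for (tag, words) in lexicalGroups {
    print("\(tag.rawValue): \(words)")
}
```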

Let's look at the following function which extracts all verbs from a text:

func getVerbs(from text: String) -> [String] {
    // Tag every word with its lexical class, then keep only the verbs.
    let taggedWords = tags(for: text, tagScheme: .lexicalClass)
    return taggedWords.filter { $0.tag == .verb }.map { $0.word }
}

The function uses the following helper function which creates pairs of words and their tags based on the provided text and scheme.

import NaturalLanguage

func tags(for text: String, tagScheme: NLTagScheme) -> [(word: String, tag: NLTag)] {
    // Use the same labeled tuple type as the return type.
    var taggedWords: [(word: String, tag: NLTag)] = []
    let tagger = NLTagger(tagSchemes: [tagScheme])
    tagger.string = text
    // Visit every word in the text, skipping punctuation and whitespace.
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: tagScheme,
                         options: [.omitPunctuation, .omitWhitespace]) { tag, tokenRange in
        if let tag = tag {
            taggedWords.append((word: String(text[tokenRange]), tag: tag))
        }
        return true // continue with the next token
    }
    return taggedWords
}

We can use the same helper function to extract other words, for example names. The following code extracts place names from a given text:

func getPlaceNames(from text: String) -> [String] {
    // Tag every word with its name type, then keep only the place names.
    let taggedWords = tags(for: text, tagScheme: .nameType)
    return taggedWords.filter { $0.tag == .placeName }.map { $0.word }
}
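
Because the helper enumerates word by word, a place name consisting of several words may come back as separate entries. The following variant is a sketch rather than the article's code: it adds the .joinNames option, which asks the tagger to report a multi-word name as a single token. The function name getJoinedPlaceNames(from:) and the sample sentence are assumptions, and the names actually recognized depend on the platform's language model.

```swift
import NaturalLanguage

// Hypothetical variant (not from the article) of the place-name
// extraction that uses .joinNames to keep multi-word names together.
func getJoinedPlaceNames(from text: String) -> [String] {
    var names: [String] = []
    let tagger = NLTagger(tagSchemes: [.nameType])
    tagger.string = text
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .nameType,
                         options: [.omitPunctuation, .omitWhitespace, .joinNames]) { tag, range in
        if let tag = tag, tag == .placeName {
            names.append(String(text[range]))
        }
        return true // continue with the next token
    }
    return names
}

let joinedNames = getJoinedPlaceNames(from: "She moved from New York to San Francisco last year.")
print(joinedNames)
```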

And that's basically it. The NLTagger makes it really easy for us to identify parts of speech in a text.

Check out the official NLTagScheme documentation to learn more about supported classification possibilities.
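
Which schemes are usable can differ by language and tagging unit. As a small sketch, NLTagger's availableTagSchemes(for:language:) can be used to list what the current platform reports. The choice of .word and .english here is only an example.

```swift
import NaturalLanguage

// Lists the tag schemes the framework reports for word-level tagging of
// English text on the current platform.
let availableSchemes = NLTagger.availableTagSchemes(for: .word, language: .english)
for scheme in availableSchemes {
    print(scheme.rawValue)
}
```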
