
Quick tip on how to split text into words with the Natural Language framework in Swift.

Learn how the tokenizer of the Natural Language framework works.

19 Jun 2023 · 2 min read

When it comes to splitting a text into words, the first solution that might come to mind is to simply split the string by whitespace. But this solution doesn't work for all languages. For example, Chinese and Japanese don't use spaces to delimit words.
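To make the problem concrete, here is a minimal sketch of the whitespace approach. The helper name wordsByWhitespace and the sample sentences are made up for illustration only.

```swift
// Hypothetical helper for illustration: naive word splitting on whitespace.
func wordsByWhitespace(in text: String) -> [String] {
    text.split(whereSeparator: { $0.isWhitespace }).map { String($0) }
}

// English separates words with spaces, so the naive split works here.
let english = wordsByWhitespace(in: "The quick brown fox")
print(english) // ["The", "quick", "brown", "fox"]

// Japanese does not put spaces between words, so the whole sentence
// comes back as a single element.
let japanese = wordsByWhitespace(in: "私は学生です")
print(japanese.count) // 1
```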


The Natural Language framework can tokenize a string in a way that also handles languages without whitespace-delimited words. For that, the framework provides the NLTokenizer type:

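import NaturalLanguage

// Splits the given text into words using the Natural Language tokenizer.
func words(for text: String) -> [String] {
    let tokenizer = NLTokenizer(unit: .word)
    tokenizer.string = text
    // tokens(for:) returns the range of each token within the text.
    return tokenizer.tokens(for: text.startIndex..<text.endIndex).map { String(text[$0]) }
}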

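As shown above, we initialize the tokenizer with .word as its unit, assign the text to its string property, and then call the tokens(for:) method, which returns the ranges corresponding to the tokens. Mapping each range back onto the text gives us the words as strings.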
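When the array of ranges isn't needed, NLTokenizer also offers enumerateTokens(in:using:), which calls a closure once per token. The following is a minimal sketch of that variant, not a replacement for the function above. The name collectWords and the sample sentence are assumptions for illustration.

```swift
import NaturalLanguage

// Hypothetical variant for illustration: collects words with
// enumerateTokens(in:using:) instead of tokens(for:).
func collectWords(in text: String) -> [String] {
    let tokenizer = NLTokenizer(unit: .word)
    tokenizer.string = text
    var result: [String] = []
    tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, _ in
        result.append(String(text[range]))
        return true // keep enumerating; returning false would stop early
    }
    return result
}

// A sentence without spaces between words; the exact segmentation
// is decided by the framework.
print(collectWords(in: "私は学生です"))
```

The closure's second parameter, ignored here, carries NLTokenizer.Attributes for the token, for example whether it is numeric or an emoji.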

And that's basically it. The tokenizer also accepts units other than words, for example .sentence to split a text into sentences or .paragraph to split it into paragraphs.
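As a sketch of how the other units could be used, the unit can be passed in as a parameter. The function name tokens(in:unit:) and the sample text are assumptions for illustration and not part of the framework.

```swift
import NaturalLanguage

// Hypothetical generalization for illustration: the unit is a parameter
// instead of being fixed to .word.
func tokens(in text: String, unit: NLTokenUnit) -> [String] {
    let tokenizer = NLTokenizer(unit: unit)
    tokenizer.string = text
    return tokenizer.tokens(for: text.startIndex..<text.endIndex).map { String(text[$0]) }
}

let sampleText = "Swift is fun. Tokenizing is easy."
let sentences = tokens(in: sampleText, unit: .sentence)
print(sentences.count)
```

Sentence tokens may include surrounding whitespace, so trimming each result might be useful depending on the use case.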


© 2023 tanaschita.com
