A simple convenience wrapper in Swift for data detection from natural language text that organizes data extraction and handling.
MKDataDetector streamlines NSDataDetector and builds on it with additional supporting capabilities that use information effectively.
- Swift 3.1+
- macOS 10.9+
- iOS 8.0+
- tvOS 9.0+
- watchOS 2.0+
There are multiple installation options to choose from.
To install via CocoaPods, add the following line to your Podfile:
pod 'MKDataDetector'To install via Carthage, add the following line to your Cartfile along with the specific version of the framework you might prefer:
github "mayankk2308/mkdatadetector-swift"
- Create a submodule in your project directory:
git submodule add https://github.com/mayankk2308/mkdatadetector-swift.git - Open the submodule directory and drag the .xcodeproj file into your project.
- Add MKDataDetector.framework to your target's Link Binary with Libraries Build Phase.
- You can now use the framework by importing it.
MKDataDetectorService is packaged as a set of extensions that compartmentalize its following capabilities:
- Date - date extraction
- Address - address extraction
- Link - link extraction
- Phone Number - phone number detection
- Transit Information - flight information extraction, etc.
In addition to extracting these features, the framework also provides convenience functions to manipulate and organize this data.
To import and use the MKDataDetectorService class, add the following statement to your .swift file:
import MKDataDetectorYou can use basic functionality as an extension of String:
let testString: String = "sampleText"
// extract Dates
if let dates: [Date] = testString.dates {
print(dates)
}
// extract Links
if let links: [URL] = testString.links {
print(links)
}Similar extensions exist for addresses, transit, and phone numbers as well. For more informative results, you may want to initialize the service.
You can declare an instance as follows:
let dataDetectorService: MKDataDetectorService = MKDataDetectorService()A generic set of AnalysisResult<T> structures is consistently returned for extraction/analysis results. An enumeration called ResultType is also included for identification of results.
AnalysisResult<T> contains 5 fields:
- Source (
source) - the source/original completeStringfrom which data was detected - Match Range (
rangeInSource) - theNSRangeof the matched string in the original string - Data String (
dataString) - the substring from whichdatawas matched - Data Type (
dataType) - the typeResultTypeof data returned - Data (
data) - the dataTextracted from the source input
The generic struct has a typealias per result type:
DateAnalysisResult- forAnalysisResult<Date>URLAnalysisResult- forAnalysisResult<URL>AddressAnalysisResult- forAnalysisResult<AddressInfo>PhoneNumberAnalysisResult- forAnalysisResult<String>TransitAnalysisResult- forAnalysisResult<TransitInfo>
The address and transit information results are structures ([String : String]) typealiased as AddressInfo and TransitInfo. Address and Transit structs with their associated dictionary keys make information lookup simple. For example, to access the zip-code in an extracted address, simply use the key Address.zip.
To extract dates from some text (String):
if let results = dataDetectorService.extractDates(withTextBody: sampleTextBody) {
for result in results {
print(result.source)
print(result.data)
// do some stuff
}
}For a given textBody, the dataDetectorService returns an array of DateAnalysisResult objects.
To extract dates from multiple sources of text ([String]):
if let combinedResults = dataDetectorService.extractDates(withTextBodies: [sampleText, sampleText, ...]) {
for individualResults in combinedResults {
for result in individualResults {
print(result.source)
print(result.data)
// do some stuff
}
}
}For given textBodies, the dataDetectorService returns an array of [DateAnalysisResult] objects.
The extraction process is uniform for other types of data features such as phone numbers, addresses, links, and more.
In cases where detection with multiple ResultTypes are required, the following implementation may be used:
if let results = dataDetectorService.extractInformation(fromTextBody textBody: String, withResultTypes: .date, .address ...) {
for result in results {
switch result.dataType {
case .date:
// force cast as DateAnalysisResult - create and save events, etc.
case .address:
// force cast as AddressAnalysisResult - get location, etc.
.
.
// for all result types you are concerned with, i.e, your input parameters
}
}
}An implementation for extracting multiple types from multiple text bodies is also included.
MKDataDetector also provides handy convenience functions to use detected information.
To retrieve precise location information from a valid String address:
dataDetectorService.extractLocation(fromAddress: sampleText) { location in
if extractedLocation = location {
// CLLocation object available, requires importing 'CoreLocation'
}
}Alternatively, if you already have an AddressAnalysisResult:
dataDetectorService.extractLocation(fromAnalysisResult: sampleAnalysisResult) { location in
if extractedLocation = location {
// CLLocation object available, requires importing 'CoreLocation'
}
}For calendar integration, you can easily create an EKEvent from a DateAnalysisResult:
let event: EKEvent = dataDetectorService.generateEvent(fromEventStore: sampleEventStore, withAnalysisResult: sampleResult)A withEndDate parameter, not shown above, is optional. Not providing a value defaults the event to end after an hour.
A more generic event generator is also available, which may be preferred if 100% event naming consistency is expected. Automatic processing of event names from DateAnalysisResult objects needs more testing.
Given a set of retrieved AnalysisResult<T> (the default result type for any extraction operation), you can generate colored attributed texts:
if let attributedText = dataDetectorService.attributedText(fromAnalysisResults: sampleResults, withColor: UIColor.blue.cgcolor) {
// set UI component
}More convenience capabilities will be incorporated into future releases.
Consider the following inputs:
let meeting: String = "Meeting at 9pm tomorrow"
let party: String = "Party next Friday at 8pm"Extracting dates using dataDetectorService, we receive the following output for meeting:
source= "Meeting at 9pm tomorrow"sourceInRange=NSRangeof the match "at 9pm tomorrow"dataString= the match substring "at 9pm tomorrow"data= equivalentDateobject, specifying source date relative to the current date/time on the device
The output is similar for party:
source= "Party next Friday at 8pm"sourceInRange=NSRangeof the match "next Friday at 8pm"dataString= the match substring "next Friday at 8pm"data= equivalentDateobject, specifying source date relative to the current date/time on the device
The output format will be uniform for other types of data features as well, with the data field returning objects of the appropriate type in each case.
You can easily make use of this data, for instance, by generating an event:
let meetingEvent = dataDetectorService.generateEvent(withEventStore: someEventStore, withAnalysisResult: meetingAnalysisResult)
// creates an event detailing a meeting for 9pm tomorrow, lasting an hour
let partyEvent = dataDetectorService.generateEvent(withEventStore: someEventStore, withAnalysisResult: partyAnalysisResult)
// creates an event detailing a party at 8pm next Friday, lasting an hourAssume that the meeting text was embedded in a UILabel called meetingLabel. It was also expanded to add " and next Friday at 5pm". You can update the label to display the multiple detected parts of the text:
if let attributedText = dataDetectorService.attributedText(withAnalysisResults: meetingAnalysisResult, withColor: UIColor.purple.cgcolor) {
meetingLabel.attributedText = attributedText
}meetingLabel will now display the original text with the detected information in purple (bold here):
"Meeting at 9pm tomorrow and next Friday at 5pm".
Apple's documentation on NSDataDetector states that the class can currently match dates, addresses, links, phone numbers and transit information, and not other present properties such as grammar and spelling.
Additionally, NSDataDetector does not detect:
- the name, job title, organization & phone number components of an address, although keys for the same are provided within the original API
- the airline name component for transit information, although a key for this is available in the original API
You can contact:
- Mayank Kumar - via email or LinkedIn
- Jeet Parte - via email
for any inquires or community-related issues.
This project is available under the MIT license.
