How to parse each sentence as a list item?

What delimiter should I use for a list that contains multiple sentences when I want each sentence as one element? The data looks like this (it has 2 elements):

verjet, a startup focused on using AI to help dentists and insurance companies understand dental scans, today announced that it has raised $7.85 million in what it describes as a seed round. According to Overjet's CEO Wardah Inam (an MIT PhD in electrical engineering and computer science), the company raised the funds from Crosslink Capital, which led its round, and E14 Fund, which "only invests in MIT startups," Inam said.

delimiters="." won’t suffice, because the text also contains periods that do not end a sentence, such as the one in $7.85.

thanks.

Unfortunately, I don’t have a good answer to your question. A regex might work with your string but fail with another perfectly valid sentence, because parsing sentences is not a trivial task. Please see these posts about why:


Thank you @andreas, I’ll look into them.

I’m confused. So your example is one element and you want to split each sentence out into its own element? Maybe use “.” as the delimiter, or replace “.” with “.|”?

Read the original text CAREFULLY.
verjet, a startup focused on using AI to help dentists and insurance companies understand dental scans, today announced that it has raised $7.85 million in what it describes as a seed round. According to Overjet's CEO Wardah Inam (an MIT PhD in electrical engineering and computer science), the company raised the funds from Crosslink Capital, which led its round, and E14 Fund, which "only invests in MIT startups," Inam said.

HAS 2 sentences.

Sentence 1:
verjet, a startup focused on using AI to help dentists and insurance companies understand dental scans, today announced that it has raised $7.85 million in what it describes as a seed round.

Sentence 2:
According to Overjet's CEO Wardah Inam (an MIT PhD in electrical engineering and computer science), the company raised the funds from Crosslink Capital, which led its round, and E14 Fund, which "only invests in MIT startups," Inam said.

@justaguy, I just use the code below to split the text into sentences.

<cfset list = "verjet, a startup focused on using AI to help dentists and insurance companies understand dental scans, today announced that it has raised $7.85 million in what it describes as a seed round. According to Overjet's CEO Wardah Inam (an MIT PhD in electrical engineering and computer science), the company raised the funds from Crosslink Capital, which led its round, and E14 Fund, which only invests in MIT startups, Inam said.">
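<!--- Replace every period followed by a space with chr(174), a character unlikely to occur in the text. The period itself is consumed by the replacement. --->
<cfset listRes = REReplaceNoCase(list,"[.][ ]",chr(174),"all")>
<!--- Loop over the result, using chr(174) as the list delimiter, so each item is one sentence. --->
<cfloop list="#listRes#" delimiters="#chr(174)#" item="i">
	<cfdump var="#i#" />
</cfloop>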

Beautiful, thank you @cfmitrah. I’ve also accounted for a sentence break (.) that occurs at the end of a paragraph.
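
For anyone reading later, here is a minimal sketch of one way the end-of-paragraph case might be handled. It is not necessarily what was done in this thread. It assumes paragraphs are separated by line breaks, uses a made-up sample string and hypothetical variable names (sampleText, marked, sentences), and keeps the period on each sentence by putting it back with the \1 backreference:

```cfml
<!--- Hypothetical sample: two paragraphs, three sentences, one decimal number. --->
<cfset sampleText = "It raised $7.85 million. That was a seed round." & chr(10) & "A new paragraph starts here.">
<!--- Match a period followed by any whitespace (spaces or line breaks). --->
<!--- \1 restores the period, then chr(174) is appended as the delimiter. --->
<cfset marked = REReplace(sampleText, "([.])\s+", "\1" & chr(174), "all")>
<!--- Split on chr(174); each array element is one sentence with its period. --->
<cfset sentences = listToArray(marked, chr(174))>
<cfdump var="#sentences#" />
```

The period in $7.85 is left alone because no whitespace follows it. As noted earlier in the thread, this is still only a heuristic: an abbreviation such as "Dr. " followed by a space would be split as if it ended a sentence.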