How To: Get Open Graph meta Tag Values

Nick_Batt · July 2, 2024, 8:25am

Didn't know where else to post this, but felt I should share it as it's pretty useful and GPT made it work for me. This is for when you need to get the OG (Open Graph) tag values from any given URL; I need it to build dynamic content for a newsletter, for instance. Hope someone finds it useful.

<!--- The URL to test --->
<cfset theurl = "http://somewhere.com">
<!--- Fetch the page; the response body ends up in presult.fileContent --->
<cfhttp url="#theurl#" method="get" result="presult">
<cfset htmlString = presult.fileContent>
<!--- Initialize a structure to hold the property-content pairs --->
<cfset ogProperties = structNew()>

<!--- Use a regular expression to find all meta tags --->
<cfset metaTags = reMatchNoCase('<meta [^>]*>', htmlString)>

<!--- Loop through each tag to check for og: properties and extract content value --->
<cfloop array="#metaTags#" index="tag">
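    <!--- Only handle tags that have both an og: property and a content attribute;
          without the content check, a tag with no content would store the whole tag as its value --->
    <cfif reFindNoCase('property="og:', tag) AND reFindNoCase('content="', tag)>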
        <!--- Extract the property name, ensuring the pattern does not include the closing quote --->
        <cfset propertyName = reReplaceNoCase(tag, '.*?property="([^"]*)".*', '\1')>
        <!--- Extract the content value from the tag, ensuring the pattern does not include the closing quote --->
        <cfset contentValue = reReplaceNoCase(tag, '.*?content="([^"]*)".*', '\1')>
        <!--- Assign the content value to the property name in the structure --->
        <cfset ogProperties[propertyName] = contentValue>
    </cfif>
</cfloop>

<!--- Display the extracted og properties and their content values --->
<!--- <cfdump var="#ogProperties#"> --->
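<cfoutput>
<h3 style="background:gray;font-family:Arial;">Open Graph Tags from: #encodeForHTML(theurl)#</h3>
<font style="font-family:Arial;">
    <!--- The values come from a remote page, so encode them before output.
          Entities already present in the source (e.g. &amp;) will display literally. --->
    <cfloop collection="#ogProperties#" item="key">
        <b>#encodeForHTML(key)# - </b> #encodeForHTML(ogProperties[key])# <hr>
    </cfloop>
</font>
</cfoutput>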

bennadel · July 2, 2024, 11:35am

Good stuff!

I would also throw in a suggestion that something like JSoup could be very helpful in this context. Though, to be fair, it does require a dependency on an external JAR file, which increases the complexity (especially since Adobe ColdFusion makes it harder to load JAR files compared to Lucee). But it would allow one to take the presult.filecontent, parse it into a DOM, and then do something like a .select("meta") to help locate the meta tags.
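A rough sketch of that jsoup approach might look like the following. It assumes the jsoup JAR is already loadable (how that is done differs between Adobe ColdFusion and Lucee and is not shown here) and that `presult` comes from the `cfhttp` call in the first post. It has not been tested on every engine, so treat it as a starting point rather than a finished implementation.

```cfml
<cfscript>
// Assumption: the jsoup JAR is already on the class path.
jsoup = createObject("java", "org.jsoup.Jsoup");

// Parse the fetched HTML into a DOM.
doc = jsoup.parse(presult.fileContent);

// Select every <meta> tag whose property attribute starts with "og:".
ogElements = doc.select("meta[property^=og:]");

// Walk the results with a plain Java iterator and collect property/content pairs.
ogProperties = {};
it = ogElements.iterator();
while (it.hasNext()) {
    el = it.next();
    ogProperties[el.attr("property")] = el.attr("content");
}
</cfscript>
```

Because jsoup parses the markup rather than pattern-matching it, this version should also cope with single-quoted attributes, and `attr()` returns values with HTML entities already decoded.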

Nick_Batt · July 2, 2024, 11:54am

A reply from Ben! Thanks and keep doing your great work, I've been a long-time reader of your posts.

This one is designed only for og: tags, as that seems to be the main way many socials grab the info for their cards. The quality of the data is variable, but it's useful for thumbnails and titles and represents some sort of standard.
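For the thumbnail and title case, reading the values back out might look like this minimal sketch. It assumes `ogProperties` was filled by the loop in the first post; `cardTitle` and `cardImage` are illustrative names only, and since not every page sets these tags, each lookup falls back to an empty string.

```cfml
<cfscript>
// Assumption: ogProperties is the struct built by the regex loop in the first post.
// cardTitle and cardImage are hypothetical variable names for illustration.
cardTitle = structKeyExists(ogProperties, "og:title") ? ogProperties["og:title"] : "";
cardImage = structKeyExists(ogProperties, "og:image") ? ogProperties["og:image"] : "";
</cfscript>
```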

bennadel · July 2, 2024, 11:56am

100%. I wasn’t pushing back against the solution, just offering some additional perspective. I’m a big fan of using RegEx to parse things.

Nick_Batt · July 2, 2024, 2:46pm

For sure, it's very powerful, if a little complex to get your head around (for me anyhow).