When all else fails, integrate over Email

It feels dirty, but sometimes you have no choice but to send your data over email. We’ve all done it. We’re not saying we’re particularly proud of it, but we’ve learned to live with the shame. Here’s how you can too.

Some of these scenarios might be familiar to you:

  1. You need regular data from a vendor product and all it supports is an unsecured, unreliable FTP server or a CSV attached to an email.
  2. Your resident built-environment engineers or marketing team have put something together that gathers a whole bunch of data without a thought for where it’s going to go. They know Excel and they use Outlook, so the result is the macro of your nightmares.
  3. You need to move data from point A to B quickly and you simply have neither the time nor the energy to convince a developer that they need to bang out an API for you at the drop of a hat and with no budget.
  4. For compliance reasons, the legal-eagles need a hard immutable copy of record. Email into a hold/vault/store fits the bill perfectly, ITHO. [1]

Scenario 1 is the most common I’ve seen so far and, in some industries, still the done thing. Industrial automation, building management systems (BMS) and smart PLCs all focus on their control domain and not on their upstream problems. It’s an IT/engineering divide that is still rife in the controls space. A host of top vendors still produce incredibly sophisticated PLC and BMS products that rely on CSV over email or FTP as their upstream fire-and-forget data integration.

Back in the day

There was a time, some 20 years back, when data integration over email was quite the in thing. All the cool kids were doing it. It was almost all there was.

Back then, writing procmail filters was an essential skill. Various software stacks and products simply hadn’t adopted remote APIs. Those APIs were cumbersome and complicated (CORBA, anyone?) and developers didn’t have the tools or experience to leverage them properly. Nobody really liked SOAP, did they?

Firmware and software stacks for constrained hardware environments simply didn’t have the libraries and framework components for easily enabling APIs, so there was often nothing to leverage. They certainly did have email and FTP components, though, so that is what you used. This still happens today.

So, in classic tradition, we used the tools we had and email became the transport mechanism for signalling events and moving tables of data.

In fact, event signalling and alerting is still done over email today; we just don’t parse those emails anymore to extract the information we need, we simply read them. We use email for personal communication, as it was intended, and not for machine-to-machine communication.

The Past Sometimes Doesn’t Die

Fast forward to today and scenario one is less common but not dead and buried, while scenarios two and three have become more common.

I recently had to shift sales lead contact data from a website to a sales CRM system. The leads were registering intent on a form powered by Gravity Forms on WordPress and the developers, in a fit of brevity, simply forwarded each submission to an inbox monitored by a sales coordinator who, in turn, sank into a pit of copy-paste despair. It was going to take too long to mobilise the team to build an API integration script into WordPress, or to install and test a new plugin to post the data to Copper’s API. I really just wanted to solve the problem over lunch and move on with life.

The short answer was to intercept the email using Zapier’s mail parser. The volume wasn’t that high, the data was terse and structured and Zapier is pretty awesome. Thirty minutes later we had CRM integration over email. No teams diverted, no additional cost of implementation and we all moved on with life. I’m not proud of the solution and it still makes me feel dirty, but it does work.

Email is not a Bad Solution, it’s just not a Great One.

When you look a bit deeper at what powers Email, it doesn’t look that bad on paper.

  • It is driven by incredibly well established and widely adopted protocols. SMTP, POP and IMAP have been around for ages and have been assimilated into a variety of software stacks, libraries and firmware. They’re everywhere and almost everything can (basically) support them easily.
  • It is asynchronous. Posting and delivery are separate concerns, which can be a nice quality for certain integration requirements. Fire and forget (well, not completely forget, as we’ll see later).
  • It has certain delivery guarantees. After posting the mail, the data basically gets queued and shifted and queued and shifted and retried and backed off and reported and saved and and and. If the end client is remote, then asynchronous delivery with guarantees is what you need.

But there are some aspects which don’t look good on paper.

  • It is not a lightweight solution. Sure, you’re not building and maintaining the email infrastructure yourself; you’re getting it from Google, Office 365 or whatever other stack you’re using. But this doesn’t mean there isn’t overhead. The message size, the persistence queues and the delivery orchestration are all still there.

It’s like using too much plastic in everyday life. There is a recycling economy that kicks in to try and make sure your straw doesn’t strangle a penguin but this doesn’t mean your apples need to be individually wrapped or that your smoothie needs a lid. It is still wasteful. Electrons simply aren’t free [2].

  • Certain newer protocols, extensions and security capabilities are not always standard (and therefore not guaranteed to be supported). The first example that springs to mind is STARTTLS: when it appeared, it took a number of libraries and firmware stacks a while to start supporting it, and the result was a weakening of security to make things work.
  • Email provision has evolved to protect itself from spam and security threats, and this can and often does directly affect deliverability. It is not uncommon for emails to get blocked at the ISP or the email provider, whose filters are designed to squash anything that looks dodgy, unsolicited or weird. Machine-generated emails can quite easily get flagged. Fortunately there are technologies like SPF, as well as sender whitelisting, that can help (as long as you are able to set this up with your ISP/provider).

Some options that will make you feel less dirty

Zapier

I do love Zapier. It is a lovely product that is surprisingly robust, well thought out and above all incredibly useful. Nothing connects A to B as quickly as Zapier. Zapier has an email parser which is just clever enough to handle most use cases. It can be used as one of the steps in your Zap to get records in.

Firstly, you create a mailbox, something like “abcdefg@robot.zapier.com”. This is a somewhat disposable email endpoint and will be purpose specific. After that you define your parsing rules to extract the fields from the received email. It’s like a simple visual regexp builder. Then you pipe the stuff into the rest of your Zap.
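If you want a feel for what those parsing rules boil down to, here is a rough sketch in Python. It is not how Zapier does it under the hood; the "Name:", "Email:" and "Phone:" labels, the sample message and the extract_fields function are all made up for illustration.

```python
import re
from email import message_from_string, policy

# Hypothetical field labels; a real form notification will have its own.
FIELD_PATTERN = re.compile(r"^(Name|Email|Phone):[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def extract_fields(raw_email: str) -> dict:
    """Pull 'Label: value' pairs out of the plain-text body of an email."""
    msg = message_from_string(raw_email, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return {label.lower(): value for label, value in FIELD_PATTERN.findall(body)}


# Made-up notification email, shaped like a terse form submission.
sample = """\
From: forms@example.com
To: abcdefg@robot.zapier.com
Subject: New lead

Name: Ada Lovelace
Email: ada@example.com
Phone: +44 20 7946 0000
"""

print(extract_fields(sample))
```

This only works because the data is terse and structured. The moment someone changes the form layout, the pattern needs to change with it.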

SendGrid

SendGrid has an inbound parse webhook. It parses the incoming email and POSTs its contents to a URL that you specify. You do have to point the MX record for the (sub)domain at SendGrid, though.
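On the receiving end, whatever web framework you use will hand you the POSTed form fields. A minimal sketch of turning them into a record might look like the following. I am assuming the sender, subject and plain-text body arrive as form fields called "from", "subject" and "text"; check the provider’s documentation before relying on that. The function name and the example payload are hypothetical.

```python
from email.utils import parseaddr


def record_from_inbound_post(form: dict) -> dict:
    """Turn the form fields of an inbound-parse POST into a flat record.

    Assumption: the webhook delivers the sender, subject and plain-text
    body as form fields called "from", "subject" and "text". Verify the
    field names against your provider's documentation.
    """
    display_name, address = parseaddr(form.get("from", ""))
    return {
        "sender_name": display_name,
        "sender_email": address.lower(),
        "subject": form.get("subject", "").strip(),
        "body": form.get("text", "").strip(),
    }


# Hypothetical payload, shaped like the fields a web framework would hand you.
example_form = {
    "from": "Ada Lovelace <Ada@Example.com>",
    "subject": " New lead ",
    "text": "Name: Ada Lovelace\nEmail: ada@example.com\n",
}

record = record_from_inbound_post(example_form)
print(record)
```

Keeping the function free of any framework means you can test it with a plain dict before you ever touch an MX record.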

Gmail

To my knowledge there is no “native” way to parse incoming emails and trigger logic directly in Gmail/G Suite. There are a number of add-ons that tackle this, but I’m not going to highlight anything specific because I simply don’t ever go there.

BYO

Yip. Go Old Skool. Get yourself a Linux box or VM with an email server and procmail or some other scripting solution. I wouldn’t recommend this, purely because it’s not a rapid and effortless solution. If I were going to go down this route, I’d probably spend the time more productively by building an API solution.
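If you do go down this route, the scripting part is the easy bit. Below is a minimal sketch in Python, using only the standard library, of pulling the rows out of a CSV attachment (the scenario 1 case). The function name, addresses and sensor data are invented for the demo, and it assumes the attachment is UTF-8 or plain ASCII.

```python
import csv
import io
from email import message_from_bytes, policy
from email.message import EmailMessage


def csv_rows_from_email(raw: bytes) -> list:
    """Return the rows of every CSV attachment in a raw email as dicts."""
    msg = message_from_bytes(raw, policy=policy.default)
    rows = []
    for part in msg.iter_attachments():
        filename = part.get_filename() or ""
        is_csv = part.get_content_type() == "text/csv" or filename.lower().endswith(".csv")
        if not is_csv:
            continue
        content = part.get_content()
        # Text parts come back as str; other content types come back as bytes.
        if isinstance(content, bytes):
            content = content.decode("utf-8", errors="replace")
        rows.extend(csv.DictReader(io.StringIO(content)))
    return rows


# Build a made-up message with a CSV attachment to exercise the function.
demo = EmailMessage()
demo["From"] = "plc@example.com"
demo["To"] = "ingest@example.com"
demo["Subject"] = "Hourly readings"
demo.set_content("Readings attached.")
demo.add_attachment(
    "sensor,value\nboiler_temp,71.5\nflow_rate,12.0\n",
    subtype="csv",
    filename="readings.csv",
)

print(csv_rows_from_email(demo.as_bytes()))
```

In a real setup the mail server would pipe each raw message to your script, and you would pass sys.stdin.buffer.read() to the function instead of the demo message. Everything around that (the mail server, the filtering, the retries when your script falls over) is where the effort goes.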

Don’t be Too Hard on Yourself

Email works and, although it’s not specifically intended for data integration, it can get the job done in a pinch. Sometimes that pinch is still there, pinching away, five years later.

Then again, mankind wasn’t specifically intended to fly through the air or wear high-heeled shoes either, but we do anyway.

[1] In Their Humble Opinion

[2] Except for free electrons obviously

A wolf in geeks clothing