Dear fellow IT Director,
I work for a firm that dropships products to your customers reliably, cheaply, and confidentially. We reproduce your custom-designed packing slip, perform QA on every outbound order, and never substitute knock-offs for name-brand goods. We can provide a turnaround of 12 hours or less on business days and inventory feeds updated every 15 minutes so you never oversell our stock (and we never go OOS on the must-haves). We can process your returns and can handle tiny orders, huge orders, bulk freight-shipped orders, partial-fills, custom inserts and international shipping.
This, of course, means messaging. A purchase order has to be submitted to us somehow, so we handle text, XML or EDI, plaintext or PGP, sent over FTP, SFTP, FTPS, AS2 and VANs. We can do Functional Acknowledgements (997), Order Acknowledgements (855), PO Change Requests (860), Order Cancel Notifications/Floor Denials (865), Ship Notifications (856), Invoices (810), Inventory Availability Advice (846) and Return Notifications (180). In fact, we cover just about everything that begins with an "8" on the EDI message-type chart.
But it took a long time to get to the point where we could handle every cockamamie half-assed ass-backwards Rube Goldberg counterintuitive duct-tape contraption you bastards keep coming up with, because each and every one of you keeps reinventing the wheel. Just when we've finished spending a good part of a year designing, building and testing a flexible, configurable messaging system, one of you slurps a sheet of blotter acid and designs a PO system that breaks every abstraction we had developed.
In the course of trying to accommodate our partners' messaging systems, we've encountered:
When you're a supplier you have to support dozens of multicolored file formats and transfer mechanisms, and each one will have a twist that'd make M. Night Shyamalan drop jaw and stare.
"Is the sales tax given as a dollar amount or a percentage, did you include it in the totals or put it in a separate field, and is it per line item or per order?"
Messaging shouldn't be hard. The hard part should be defining the product and how it gets sold, but the above gripe-list is all about stupid designs that did not improve messaging or enable anything new to the business. Here are some tips for making the messaging part easier for you and your partners.
The "Value Added Network" was born in the 1960s when businesses first began to do EDI but the public phone network wasn't adequate, so they ran leased lines to each other through privately owned exchanges. "Hold on," said the Government, "that's a phone network and should be regulated!" But there was a loophole: if the network was more than just a common carrier--if it "added value"--then it didn't need to be regulated, and the main value they added was audit trails*.
Physical private networks are obviously no longer needed today, and third parties are no longer needed for audit-trails thanks to a good Public Key Infrastructure.
A VAN is now an impediment to business:
What you should consider are third party FTP servers, because these will relieve your IT staff of the administrative and security burden of running one in-house. Look for one that supports FTPS and lets you create as many logins as you like with storage quotas. And never store anything there that isn't encrypted or desensitized.
* - VANs also added message routing, authentication and validation, but modern software again makes these less important to outsource.
There are three basic file syntaxes that everyone uses: CSV (or delimited text), XML, and EDI. These are then used to define a format for encoding inventory advice, purchase orders, acknowledgements, ship notifications, and invoices. You should always choose the simplest format that works, and nine times out of ten that's probably going to be denormalized CSV.
"Denormalized" means that if you're giving us a PO with multiple line items then you need to repeat the order header for each line. The overhead this creates is not extreme (text compresses very well), and when we import it we'll group on your order number to reconstitute the PO correctly. Every shop out there has tools for mapping columns to columns and they all assume it'll work this way, so it'll take fifteen minutes for your partner to map three or four different message types. That means faster to test, faster to production, fewer mistakes, and faster to make changes in the future.
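To make the "group on your order number" step concrete, here is a minimal sketch in Python. The column names and sample rows are invented for illustration; a real feed would define its own, and a real importer would also validate that the repeated header fields agree from row to row.

```python
import csv
import io

# Hypothetical column names; any real feed would define its own.
HEADER_COLUMNS = ("order_number", "ship_to_name", "ship_method")
LINE_COLUMNS = ("line_number", "sku", "quantity")


def group_orders(csv_text):
    """Rebuild purchase orders from denormalized rows by grouping on order number."""
    orders = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["order_number"]
        if key not in orders:
            # First row seen for this order: copy the repeated header fields once.
            orders[key] = {col: row[col] for col in HEADER_COLUMNS}
            orders[key]["lines"] = []
        orders[key]["lines"].append({col: row[col] for col in LINE_COLUMNS})
    return list(orders.values())


# Made-up sample data: order 1001 has two line items, so its header repeats.
SAMPLE = """order_number,ship_to_name,ship_method,line_number,sku,quantity
1001,Pat Example,GROUND,1,123456,2
1001,Pat Example,GROUND,2,654321,1
1002,Sam Example,AIR,1,123456,5
"""

orders = group_orders(SAMPLE)
```

Three rows go in and two purchase orders come out, the first carrying two line items. That is the whole trick, which is why column-to-column mapping tools can assume it.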
Use XML if you need to pass hierarchical data. This is rare because POs are inherently relational in structure, but consider it if you have to describe a complex product or a complex fulfillment process. If you design an XML format then please do not just serialize your internal data structures; please put some real thought into it. My rule that "Nobody who uses XML knows what they're doing" comes from never seeing an XML messaging format that wasn't either grossly overengineered or bizarrely dysfunctional.
As an example, I saw one ship-notification format that got the nesting of values back-to-front: the line-items were the first level, the order header was at the second level, and the tracking number was at the third, like this:
<LineItem LineNumber="1" SKU="123456">
  <Package ShipMethod="UPS" Tracking="1Z123467WW53631"/>
</LineItem>
<LineItem LineNumber="2" SKU="654321">
  <Package ShipMethod="UPS" Tracking="1Z123467WW53631"/>
</LineItem>
The designer of this schema was on crack.
EDI is also a hierarchical format, and most shops now translate it to XML as an intermediate step. Sometimes you'll hear people talk about "EDI" as the concept of sending electronic POs in general--regardless of the format (the title of this article, for example)--but most of the time they'll mean ANSI X12. EDI is three formats in one: the base syntax (X12, EDIFACT), the set of standard message types and their revisions, and then the vendor's own tweaks. Do not volunteer to use EDI, only use it if your partner gives you no other choice. EDI cannot do anything better than XML can*, but the tool-chain is expensive and "Enterprisey", putting it out-of-reach for small or under-budgeted shops.
* - With one notable exception: the set of standard EDI message types means you don't see many acid-trip schemas. The greater plan for EDI was to establish standardized processes for fulfillment, and the file syntax was just one part of it.
Use FTPS to send PGP encrypted copies of messages to-and-from a firewalled drop-box server. All done!
But, do not:
One of the VANs we have to work with has a customized FTP server that moves a file to an unreachable location whenever a client attempts to GET it, even if the attempt fails. After a failed transfer resulted in a 48-hour delay fulfilling an order (by that I mean a 48-hour wait for one of their techs to figure out what was wrong and restore the file) I called them and asked them why they did this, and they said--no kidding--that they couldn't let us delete the files ourselves "in case they had to restore it again later".
This blew my mind.
This is a major VAN and they have the most incompetent implementation of file transfer that I have ever seen*.
An FTP site, even if it's protected with a firewall and FTPS and all the messages are encrypted, is an inherently insecure site, so you must treat it as a drop-box that you place copies of messages on. It's also at the lower-end of the messaging protocol stack and needs to follow the conventions that fit such a level. If you are using FTP then DELEting a file after fetching it is how I can communicate that the transfer was successful. It's very easy for me to do this because my code is:
If there was an exception thrown during the transfer then the remote file doesn't get deleted. If the file doesn't exist locally after the transfer then the remote file doesn't get deleted. If the local copy is zero bytes then the remote file doesn't get deleted. If a man in Brazil is coughing then the remote file doesn't get deleted. Furthermore, the file I'm fetching should be a copy and the VAN should be keeping an archive to restore from. So when I do actually issue a DELE command it's my way of saying--at the file transfer protocol level--that I've got the file and it's dandy. Further up the stack we have things like MDNs and Functional Acknowledgements and Purchase Order Acknowledgements and sometimes all three. There is no need to fuck around with file transfer.
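The fetch-then-delete rule described above could be sketched like this in Python. It is not the code from our production system, just an illustration that assumes a client object with the `retrbinary` and `delete` methods of the standard `ftplib.FTP` class. The `FakeFTP` class and the file names are stand-ins invented so the sketch can be run without a server.

```python
import os
import tempfile


def fetch_then_delete(ftp, remote_name, local_path):
    """Download remote_name; delete the remote copy only if the local copy looks good.

    `ftp` is assumed to behave like ftplib.FTP. Returns True if DELE was issued.
    """
    try:
        with open(local_path, "wb") as local_file:
            ftp.retrbinary("RETR " + remote_name, local_file.write)
    except Exception:
        # Any failure during transfer: leave the remote file alone for a retry.
        return False
    if not os.path.exists(local_path):
        return False
    if os.path.getsize(local_path) == 0:
        return False
    # DELE is the protocol-level signal that the transfer succeeded.
    ftp.delete(remote_name)
    return True


class FakeFTP:
    """In-memory stand-in for ftplib.FTP, only for trying the function offline."""

    def __init__(self, files, fail=False):
        self.files = dict(files)
        self.fail = fail

    def retrbinary(self, command, callback):
        if self.fail:
            raise ConnectionError("transfer dropped")
        callback(self.files[command.split(" ", 1)[1]])

    def delete(self, name):
        del self.files[name]


workdir = tempfile.mkdtemp()

good = FakeFTP({"po_0001.txt": b"order data"})
good_result = fetch_then_delete(good, "po_0001.txt", os.path.join(workdir, "po_0001.txt"))

dropped = FakeFTP({"po_0002.txt": b"order data"}, fail=True)
dropped_result = fetch_then_delete(dropped, "po_0002.txt", os.path.join(workdir, "po_0002.txt"))

empty = FakeFTP({"po_0003.txt": b""})
empty_result = fetch_then_delete(empty, "po_0003.txt", os.path.join(workdir, "po_0003.txt"))
```

Only the first transfer ends with the remote file deleted. The dropped and zero-byte transfers leave it in place, so the next polling cycle simply tries again with nobody picking up a phone.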
* - A month after this article was written they added a tool to their web-portal that restores zealously deleted files, although I'm not sure if I should be happy; it still forces manual intervention when the alternative is a completely automatic recovery of the fault.
Don't use a 10-year-old copy of PGP. It has published vulnerabilities and has compatibility issues with newer, open-source PGP implementations. We use Bouncy Castle, for example, which has an implementation of OpenPGP. Others may use GNU Privacy Guard. All of them have issues with PGP 7.2 and older.
It's also common to confuse the transport-level security of FTPS or SFTP with the encryption of the message itself. You want to use FTPS to prevent the login password from being sniffed but that isn't the only way to compromise a host, so the messages need to be encrypted to the public key of your trading partner so that a compromise of the host doesn't mean a compromise of the message. It also means you can outsource your FTP server to a third party without worrying about security. In fact, outsourcing your FTP host will improve security by eliminating another IT headache and firewall exception.
Fulfillment is messy. Sometimes we have to chase down stock that got misfiled in the wrong bin location, reprint a packing slip that got dropped and muddied, change a ship method to one that can actually deliver to the address specified, and so-on. These all require manual intervention to even the best automated process, and it frequently means somebody has to scan an ID number. If the barcode is unscannable or the workstation doesn't have a wand then it has to be typed by hand.
Case-sensitive IDs are not going to survive this failure mode.
They also wreak havoc on databases that were designed years ago on platforms like Microsoft SQL-Server, where varchar columns are case insensitive by default. That means a query on it will fail, joins will fail, and if there was a uniqueness constraint then inserts can fail, too. The IT department will grumble and bitch, put in a ticket for the DBA to change the column's collation to SQL_Latin1_General_CP1_CS_AS and schedule a couple of weeks for all the programmers to go through the code and conditionalize the use of ToUpper().
And if any of the applications are using a .Net DataTable, forget it.
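If you want to know whether your own IDs would survive a case-insensitive partner, a check along these lines will tell you. It is a rough Python sketch, and the sample IDs are made up.

```python
from collections import defaultdict


def case_collisions(ids):
    """Return groups of distinct IDs that become identical once case is ignored."""
    groups = defaultdict(set)
    for identifier in ids:
        groups[identifier.upper()].add(identifier)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


# Hypothetical order IDs; the first two differ only by letter case.
collisions = case_collisions(["ab12Cd", "AB12cd", "ZX9981"])
```

Any non-empty result means two of your "unique" IDs are the same ID as far as a case-insensitive column, or a tired person typing from a muddy packing slip, is concerned.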
We can handle 15 or 16-digit numbers and we don't care if they're alphanumeric. We're fine with hyphens that break up long numbers. Check digits are awesome, too, but don't expect us to code the logic for them.
We don't have infinite resources and we can't dedicate a team to your issue on the same day you decide to change the spec. Furthermore, you're not the only guys who have a new packing slip style or message type to be supported by next Monday.
ISO 8601 is a standard for expressing date-and-time strings and every modern platform used by IT today (.Net/Mono, Java, Ruby, Python, Cocoa, etc.) can parse it into their native DateTime types. They also spit it out with their respective ToString()s as the default encoding, complete with timezone if it was set. Here's a bit of ISO 8601:
The 'Z' at the end is for the Zulu timezone, the ethnically neutral successor to GMT. Now here's the date-time format of a major VAN:
It seems that at first they were all over dates, and had defined an encoding that stripped all hyphens, slashes and other punctuation. But then... somebody needed to add the time! And the engineers said "Oh shit! Well I guess we can add colon-separated hours and minutes after the date, then". This format appears in both their CSV and XML messages.
If you're lucky then your platform's DateTime.Parse() can figure out this crap and read it without getting the value wrong, otherwise you'll have to do string manipulation on the input. But you'll still have to face the fact that you will have to do formatting on the output, because this VAN's software cannot parse ISO 8601. You have to munge it into their format first.
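The munging in both directions looks roughly like this in Python. The VAN's actual layout is not reproduced here, so the `VAN_FORMAT` pattern below (a punctuation-free date, a space, then colon-separated hours and minutes) is a guess made for illustration. Treating the value as UTC is also an assumption, since a layout like this carries no timezone at all.

```python
from datetime import datetime, timezone

# Assumed layout for illustration only: YYYYMMDD, a space, then HH:MM.
VAN_FORMAT = "%Y%m%d %H:%M"
ISO_UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def van_to_iso(text):
    """Parse the assumed VAN layout (treated as UTC) and return ISO 8601."""
    parsed = datetime.strptime(text, VAN_FORMAT).replace(tzinfo=timezone.utc)
    return parsed.strftime(ISO_UTC_FORMAT)


def iso_to_van(text):
    """Parse an ISO 8601 UTC timestamp ending in 'Z' and munge it into the assumed layout."""
    parsed = datetime.strptime(text, ISO_UTC_FORMAT)
    return parsed.strftime(VAN_FORMAT)


iso_text = van_to_iso("20090815 14:30")
van_text = iso_to_van("2009-08-15T14:30:00Z")
```

Note what gets lost on the way out: the seconds and the timezone. Every partner has to write and maintain a pair of functions like these for no benefit to anyone.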
This also extends to other encodings for common values:
For everything else, I beg you to do a Google search before you invent your own encoding. If we have to create a table to map your proprietary numeric country codes then we will make a mistake and ship your best customer's order to East Timor.
Message files aren't meant to behave like REST URIs. While there's nothing wrong with making REST over HTTP available for us to query, do not try to shove this concept into static files on an FTP site. A file is a message, so it needs a unique name (timestamps or serial numbers are fine) and unique contents. We have partners who thought it was a good idea if "market_orders.txt" was a magic file that always contained every unacknowledged order, refreshed every hour. But all this meant was a constant stream of "duplicate order" exceptions whenever messaging wasn't in perfect sync, or worse, contention for locked resources.
It also made it impossible to discuss a problem with a file because you couldn't tell them which file had the problem: it might have been overwritten with new contents by the time you called them up.
Exposing a REST interface can be very useful to automate error recovery; we can build feedback-control loops to keep messaging in sync. But that doesn't eliminate the need to have every message in a single identifiable package.
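Giving every message file its own name takes about five lines. Here is one possible scheme in Python, combining a message type, a UTC timestamp and a serial number. The naming pattern and the ".csv" extension are arbitrary choices for this sketch, not a standard.

```python
from datetime import datetime, timezone


def message_filename(message_type, serial, when=None):
    """Build a unique file name from message type, UTC timestamp and serial number."""
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    return "{}_{}_{:06d}.csv".format(message_type, stamp, serial)


# A fixed timestamp keeps the example reproducible; normally `when` is omitted.
fixed = datetime(2009, 8, 15, 14, 30, 0, tzinfo=timezone.utc)
first = message_filename("po", 1, fixed)
second = message_filename("po", 2, fixed)
```

With names like these, two files written in the same second still differ by serial number, and "the file with the problem" is something you can actually say over the phone.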
Many partners have wisely implemented a case tracking system, which is fantastic. What they didn't do was give it an email address, so to open a case I have to:
Or I could:
One of the biggest messes I ever saw involved a company that managed eBay listings for you and relayed paid orders via their messaging format. The problem occurred when they sent us an order without a shipping address, "oopsed" over it and told us to disregard because it was a test order. A few weeks later the customer behind that order is asking what the hell happened and giving us negative feedback scores.
It's normal to run a full-cycle test once you've linked two production systems: you send us a "real" order placed by one of your staff, pay for it, we pack and ship, then we wrap-up with a return-n-refund to proof the whole system. Because these "proofs" go through a production system they need to be flagged as such, so that if something goes wrong then we can relax and approach the problem from a different angle: something went wrong with messaging, not something went wrong with a real customer's order.
Proof orders are also pathological and tend to hit every branch of the Ugly Tree on the way down: multiple ship-tos, deliberate short-stocks, sales tax, gift messages, returns and refunds, because you want to test all of those conditions. If one of those were to be mistaken for a real order then it might tie up three or four departments in a frenzy instead of just one.
That's what happens when you mistake a test order for a real one. It's an even bigger mess when the opposite happens. Even if you're just a "Market intermediary" you still need the ability to attach meta-data as it passes through your own system.
The classic EDI standard isn't trendy among the kids nowadays; it's seen as a product of the mainframe era: pre-XML, pre-Internet, pre-PKI, pre-Everything. That much has become abundantly clear just from all the different dot-coms who come to us seeking everything except good ol' ANSI X12. On the downside, no formidable standard has risen to replace it, only a vast wasteland of homemade crap. The Tower of Babel has fallen once again and as a result we feel like we're playing Twister with rubber-armed mutants and a play-mat with ten thousand colored dots on it, and every color on the spinner is spelled in a different language. And some of them are in EBCDIC. And big-endian.
I am annoyed that everybody and his uncle feels it's necessary to make a boring and mundane part of business "special". It isn't special, it's plumbing. There is nothing to gain and everything to lose by inventing your own format. Forty years after the invention of EDI and you're still all doing it wrong.