I will email you those files... Part 2

Posted by Darin Rousseau

In our previous post, we identified a need for a quick web setup that would allow Jeff to send Mary a secured, 15MB PowerPoint presentation.  In this post, we outline some good practices for the design and implementation of a web-based technology.

Design Criteria

For our design, we want to be as secure as email, if not more.  Our first design goal is to allow multiple bins for upload, so that Jeff and Mary can communicate, and Jeff and Bob can communicate, without crossing files.  Additionally, because we don't initially know the contents of the files, we should make things secure so that if Bob's hacker son Jason happened to figure out how the site works, he still couldn't get the data being shared between Jeff and Mary.  We also don't want a nosy information technology staffer to be able to open the files while they sit on the site.

We also thought about making it pretty.  Since people have varying definitions of what pretty is, we made it "themeable".  Those that like a blue background can choose to have one, those browsing or downloading with mobile devices get a reduced-image, bandwidth-friendly theme, etc.

Our Implementation

We started with a database that keeps track of the upload bins and files.  Our original idea of storing the files within the database was dropped in favour of putting them in a file store, as the database would take a long time to store a BLOB (binary large object) of 150MB.  That meant adding a database/file consistency checker, but in the end file storage was the best solution.
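A consistency checker of this kind can be sketched in a few lines.  This is only an illustration, not our production code: the table layout (`bins`, `files`, `stored_name`), the flat store directory and the function names are assumptions made up for the example.

```python
import os
import sqlite3
import tempfile


def find_inconsistencies(conn, store_dir):
    """Return (missing_on_disk, orphaned_on_disk) as two sorted lists of names."""
    rows = conn.execute("SELECT stored_name FROM files").fetchall()
    in_db = {row[0] for row in rows}
    on_disk = set(os.listdir(store_dir))
    # Rows without a file are broken downloads; files without a row waste space.
    return sorted(in_db - on_disk), sorted(on_disk - in_db)


def demo():
    # Hypothetical schema: one row per upload bin, one row per stored file.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bins (id INTEGER PRIMARY KEY, label TEXT)")
    conn.execute(
        "CREATE TABLE files ("
        "id INTEGER PRIMARY KEY, "
        "bin_id INTEGER REFERENCES bins(id), "
        "stored_name TEXT UNIQUE)"
    )
    conn.execute("INSERT INTO bins (id, label) VALUES (1, 'jeff-mary')")
    conn.executemany(
        "INSERT INTO files (bin_id, stored_name) VALUES (?, ?)",
        [(1, "a.bin"), (1, "b.bin")],
    )
    with tempfile.TemporaryDirectory() as store_dir:
        # a.bin exists in both places, b.bin is missing, c.bin is orphaned.
        for name in ("a.bin", "c.bin"):
            with open(os.path.join(store_dir, name), "wb") as fh:
                fh.write(b"payload")
        return find_inconsistencies(conn, store_dir)


if __name__ == "__main__":
    print(demo())
```

What to do with each list (delete the row, delete the file, or just report it) is a policy decision the sketch leaves open.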

Security was our next focus.  We used an asymmetric (public-key) algorithm and associated a key with each upload bin.  Jeff unknowingly creates a key when he creates the upload bin.  That key is protected by a strong, system-generated password, which Jeff then passes to Mary.  When Mary enters the password, the key data is unlocked and provides the information needed to decode the encrypted file for download.
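The password half of that scheme can be sketched as follows.  The post does not say how the password protects the key, so everything here is an assumption for illustration: PBKDF2 as the derivation function, the iteration count, the record layout and the function names.  The sketch stops at deriving and verifying the key that would unlock the bin's key data; the asymmetric encryption itself is left out.

```python
import hashlib
import hmac
import secrets

PBKDF2_ITERATIONS = 200_000  # assumed work factor, tune for your hardware


def create_bin_password():
    """Generate the strong password that Jeff passes on to Mary."""
    return secrets.token_urlsafe(24)


def derive_key(password, salt):
    """Stretch the password into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=32
    )


def create_bin_record():
    """Return (password, record); only the record is stored on the server."""
    password = create_bin_password()
    salt = secrets.token_bytes(16)
    key = derive_key(password, salt)
    # Store a hash of the derived key, never the key or the password.
    record = {"salt": salt, "verifier": hashlib.sha256(key).digest()}
    return password, record


def unlock(record, password):
    """Return the derived key if the password is right, otherwise None."""
    key = derive_key(password, record["salt"])
    if hmac.compare_digest(hashlib.sha256(key).digest(), record["verifier"]):
        return key
    return None
```

In a full implementation the key returned by `unlock` would decrypt the bin's stored key data, which in turn decodes the file.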

This security system also has some interesting applications we didn't plan for, and it leaves room for additional features.  For example, with only a slight change, we can add a recovery key to get into the upload bin should Jeff and Mary forget the password.  It could also be used by an administrator to ensure that company secrets aren't being published, etc.  We used the same technology as referenced in a previous article, Securing a Secret for multiple readers.  In addition, because the creator and the recipient are the only ones with the key, this type of implementation could be turned into a secure public service.  There would be no way (other than attacking the algorithm or the password) for a service provider to see the data.  With the installation of an SSL certificate, even the security of the data during transport is more-or-less guaranteed.

In terms of user features, and based on past experience, we knew that Jeff wouldn't have a lot of extra time to maintain the files in the system.  We planned for him to post the files and then walk away, leaving the system to maintain itself.  Each file was given an expiry date, and we wrote an application that checks for and removes expired files automatically.
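The clean-up job is simple enough to sketch.  This is a minimal illustration under assumed names: the record fields (`stored_name`, `expires_at`) and the in-memory list standing in for the database are hypothetical.

```python
import os
import tempfile
from datetime import datetime, timedelta, timezone


def purge_expired(records, store_dir, now):
    """Delete files whose expiry has passed; return the records that remain."""
    kept = []
    for record in records:
        if record["expires_at"] <= now:
            path = os.path.join(store_dir, record["stored_name"])
            # The file may already be gone; that is not an error here.
            if os.path.exists(path):
                os.remove(path)
        else:
            kept.append(record)
    return kept


def demo():
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    records = [
        {"stored_name": "old.bin", "expires_at": now - timedelta(days=1)},
        {"stored_name": "new.bin", "expires_at": now + timedelta(days=7)},
    ]
    with tempfile.TemporaryDirectory() as store_dir:
        for record in records:
            with open(os.path.join(store_dir, record["stored_name"]), "wb") as fh:
                fh.write(b"payload")
        kept = purge_expired(records, store_dir, now)
        return [r["stored_name"] for r in kept], sorted(os.listdir(store_dir))


if __name__ == "__main__":
    print(demo())
```

Run on a schedule (a cron job or scheduled task, for instance), this keeps the store tidy without Jeff having to think about it.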

We added trivial logging, where each file download was tracked - and some protection to ensure that someone couldn't attempt a dictionary attack on the login.  After a certain number of failed attempts, the site locks out that IP address for 5 minutes and makes the upload bin unavailable to it.  We could have done this a thousand different ways, too - as obviously this may lock out a whole company when one particular user is misbehaving.  For some ISPs, it may lock out a whole community or city.  Regardless, we have other options here to counteract those types of attacks.
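As a rough sketch of that lockout rule: the 5-minute window comes from the description above, while the threshold of 5 attempts, the class name and the in-memory dictionaries are assumptions for the example.

```python
import time


class LoginThrottle:
    """Lock an (IP address, upload bin) pair out after repeated failures."""

    def __init__(self, max_attempts=5, lockout_seconds=300, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._failures = {}
        self._locked_until = {}

    def is_locked(self, ip, bin_id):
        key = (ip, bin_id)
        until = self._locked_until.get(key)
        if until is None:
            return False
        if self._clock() >= until:
            # Lockout has expired: forget it and start counting afresh.
            del self._locked_until[key]
            self._failures.pop(key, None)
            return False
        return True

    def record_failure(self, ip, bin_id):
        key = (ip, bin_id)
        count = self._failures.get(key, 0) + 1
        self._failures[key] = count
        if count >= self.max_attempts:
            self._locked_until[key] = self._clock() + self.lockout_seconds

    def record_success(self, ip, bin_id):
        self._failures.pop((ip, bin_id), None)
```

The login handler would call `is_locked` before checking a password, then `record_failure` or `record_success` afterwards.  Keying on the pair rather than the IP alone means a lockout on one bin does not block that address from other bins.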

The decoding of the file appeared to be instant relative to the time needed to actually download the data.  Because we buffer the file, I suspect the time lost to decoding is absorbed completely by the time spent waiting on bandwidth.  In our tests, we couldn't detect a difference between decoding encrypted files and just transferring non-encoded files.  The processor on the server was utilized more during decoding, though - and so there was a limit to how many downloads could be performed at any one time.

Future improvements

One of the issues we had with HTTP transfers was timeouts.  Large files (>250MB) would take a long time to push to the server, and the upload would break if network conditions weren't just right.  We can improve that using BITS (Background Intelligent Transfer Service), or by adding a client-side control that transfers the file in chunks and has the server re-assemble them.  Downloading was the opposite: it was handled well by the browsers we tested, so the only change there would be to use a technology like BITS to download files slowly and reliably instead of using all the available bandwidth for the transfer.
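The chunking idea can be sketched like this.  It is not something we have built yet, so treat every name here as hypothetical; the SHA-256 check on the re-assembled file is an added assumption, and a real server would write chunks to disk rather than hold them in memory.

```python
import hashlib


def split_into_chunks(data, chunk_size):
    """Yield (index, chunk) pairs that cover data in order."""
    for index, start in enumerate(range(0, len(data), chunk_size)):
        yield index, data[start:start + chunk_size]


class ChunkAssembler:
    """Server side: collect chunks in any order, then rebuild the file."""

    def __init__(self, total_chunks, expected_sha256):
        self.total_chunks = total_chunks
        self.expected_sha256 = expected_sha256
        self._chunks = {}

    def receive(self, index, chunk):
        # A re-sent chunk simply overwrites the earlier copy.
        self._chunks[index] = chunk

    def missing(self):
        """Indexes the client still has to send (or re-send after a timeout)."""
        return [i for i in range(self.total_chunks) if i not in self._chunks]

    def assemble(self):
        if self.missing():
            raise ValueError("upload incomplete")
        data = b"".join(self._chunks[i] for i in range(self.total_chunks))
        if hashlib.sha256(data).hexdigest() != self.expected_sha256:
            raise ValueError("checksum mismatch")
        return data
```

Because each chunk is a short request of its own, a timeout costs only that chunk: the client asks which indexes are missing and sends just those again.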

The skin is currently set via a configuration file at install time.  If we provide this as a public service, it may be beneficial to make the theme selectable when the upload bin is created.  (Perhaps even allowing customers to upload their own customization file?)

Conclusion

Companies or individuals that need to receive or send large amounts of data can be better served by a tool or tools that focus on those needs.  We see a benefit for this particular technology in operations like print or copy centers, engineering firms, software companies, design companies, even business centers in hotels or airports.

Having quick and reliable access to data you know is secured makes your data all that more profitable, too.

I will email you those files... Part 1

Posted by Darin Rousseau

There are a lot of technologies that we use in the information technology world to get our jobs done.  Most of the technologies are very solid and are as old as the internet, and... Did I mention they are old and robust and reliable and... Not being used? 

In today's world, where non-technical management often makes the technical decisions for their companies, we also find that these technologies often go unnoticed in favour of something else that "will get the job done for now because that's what I know."  Let's look at one example: sending data to customers via email.

Most people think email was designed for attachments.  Sure, there is a button that allows that in my email client, but there are far better means to perform that specific operation than email.  Email (specifically SMTP, the transfer protocol behind email) was never designed to be a file-transfer protocol - no matter what those people tell you.  It is the Simple Mail Transfer Protocol, in fact.  The technology of email is flexible enough that it could, and did, adopt attachments very early on in its life - but that doesn't mean it is suitable for transferring large amounts of data.  And as data sizes grow, there are more problems doing it that way.

An Example 

Take a 15MB PowerPoint sales presentation that Jeff will send to Mary.  Jeff and Mary only know email addresses, and Jeff happily attaches it to the email, and... *poof* it just works, right?  Well, Jeff forgot something...  This particular "file transfer system" (if we are going to call it that) has some important limitations built in that lots of people don't know about.  Jeff may only be able to send 5MB at a time, thanks to his service provider or company.  Mary may be allowed to receive 15MB, but her mailbox only has 5MB free.  Any file transfer system that has these limitations really limits its use - especially when most of the time, the limits are non-negotiable.  Jeff is no more able to convince his ISP to up their limit for a while than Mary is convincing her IT staff to open up her email limits "for now."  Especially when resources are at a premium and may not always be available for them.  (Yes, even big ISP servers have storage limits!)

Now, when we talk about this, we are going to follow the direction of a business-used protocol.  While software or protocols like Torrents may be suited for really large and fast transfers, our example is between one sender and maybe multiple recipients, but not enough to make a torrent really functional.  Certainly there wouldn't be enough seeders for Jeff's PowerPoint presentation to speed anything up in the process.

How about FTP? 

A File Transfer Protocol is what we want.  This protocol allows us to download files and is perfect - just look again at the name of the technology!  However, many of the non-technical people we come in contact with don't understand it, either how to use it or how to administer it.  When faced with the Windows FTP client in a DOS window (that's what they call it!) they sit and stare.  It isn't all that simple to use, that is for sure.  Then they may be faced with downloading an FTP client.  Even then - things aren't always just a click away like email.

Web server? 

A web server is just a file transfer server, so would it work?  To our knowledge, no ISPs block web traffic, other than for international filtering or proxying or something like that.  Your browser connects to my server and asks for something.  It sounds like Mary could go to Jeff's site and ask for the presentation...  The prerequisite is that both Jeff and Mary know how to browse the web.  I would suggest that if they are working with PowerPoint, they have at some point had a chance to use the internet.

Out of the box, most web servers don't have the web programming to do this, but it can be done (and it took us only a weekend from concept to fully secure, functional site!).  The problem is that some web coding is required, and most companies don't know how to get started.

We will look at our simple design as an example in part 2...