Author: MikeM

Handling OPTIONS requests in Rust using Rocket, with CORS

In setting up a couple of endpoints using Rocket (a web framework written in Rust), I was able to get the GET and POST handlers I needed up and running easily. These worked fine in all my testing with Postman.

However, when it came to real-world testing, the browser was getting blocked due to CORS. So I found this Stack Overflow question, and was able to implement the CORS headers based on it:

https://stackoverflow.com/questions/62412361/how-to-set-up-cors-or-options-for-rocket-rs

This worked. Up to a point.

That point was the browser sending a preflight OPTIONS request, which is neither a GET nor a POST. After floundering around with various efforts, such as trying to add a wildcard #[options()] handler (which never got picked up for some reason; the compiler kept warning that the function was unused), I finally landed on the solution.
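
(For reference, the wildcard handler I was attempting looked something like the sketch below. In hindsight, a “function is never used” warning usually just means the route never made it into the routes![] list passed to mount(); Rocket 0.5’s ignored-segments syntax is assumed here.)

// Catch-all OPTIONS route; it only takes effect if listed in routes![...]
#[options("/<_..>")]
fn all_options() -> &'static str {
	""
}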

Taking the recommended implementation of on_response():

fn on_response(&self, request: &Request, response: &mut Response) {
   response.set_header(Header::new("Access-Control-Allow-Origin", "*"));
   response.set_header(Header::new("Access-Control-Allow-Methods", "POST, GET, PATCH, OPTIONS"));
   response.set_header(Header::new("Access-Control-Allow-Headers", "*"));
   response.set_header(Header::new("Access-Control-Allow-Credentials", "true"));
}

All that was needed was to add a few lines:

async fn on_response<'r>(&self, request: &'r Request<'_>, response: &mut Response<'r>) {
	response.set_header(Header::new("Access-Control-Allow-Origin", "*"));
	response.set_header(Header::new("Access-Control-Allow-Methods", "GET, POST, OPTIONS"));
	response.set_header(Header::new("Access-Control-Allow-Headers", "Content-Type"));
	response.set_header(Header::new("Access-Control-Request-Method", "GET, POST, OPTIONS"));

	// Answer preflight OPTIONS requests directly, since there is no
	// explicit #[options()] route to handle them.
	if request.method() == Method::Options {
		let body = "";
		response.set_header(ContentType::Plain);
		response.set_sized_body(body.len(), std::io::Cursor::new(body));
		response.set_status(Status::Ok);
	}
}

This put the proper headers in place and also overcame the 404 errors. The 404s were happening because I had no explicit #[options()] route handler in place – as mentioned above, they just weren’t taking. So the last line, response.set_status(Status::Ok);, is what fixed that problem. The client could now send an OPTIONS preflight request, and Rocket responded with all the right bits, without a 404 status coming back.
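
For context, here is roughly how that method sits inside a complete fairing – a sketch assuming Rocket 0.5’s Fairing trait (the struct name Cors is mine):

use rocket::fairing::{Fairing, Info, Kind};
use rocket::http::{ContentType, Header, Method, Status};
use rocket::{Request, Response};

pub struct Cors;

#[rocket::async_trait]
impl Fairing for Cors {
	fn info(&self) -> Info {
		// Tell Rocket this fairing only hooks the response phase
		Info { name: "Add CORS headers to responses", kind: Kind::Response }
	}

	async fn on_response<'r>(&self, request: &'r Request<'_>, response: &mut Response<'r>) {
		// ...the body shown above...
	}
}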

And to cut down on the length of the messages about these requests in the shell (which still appear, due to the missing route handler), I implemented a catcher too:

#[catch(404)]
fn not_found(req: &Request<'_>) -> String {
	// Keep the preflight OPTIONS "404s" quiet; report anything else.
	if req.method() == Method::Options {
		return "".to_string();
	}
	format!("Sorry, '{}' is not a valid path.", req.uri())
}
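
A catcher only fires once it’s registered, so for completeness, the wiring looks roughly like this (a sketch; the route list is a placeholder):

#[launch]
fn rocket() -> _ {
	rocket::build()
		.attach(Cors)
		.register("/", catchers![not_found])
		.mount("/", routes![/* your GET and POST handlers */])
}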

Attach the fairing, and, bang! we’ve got a working set of endpoints handling CORS requests, preflight OPTIONS included. Rust with Rocket is very performant too – a release build responds in an average of 63 milliseconds, most of which is probably the database query time.

Converting SVG to Canvas, with styles, using canvg

Canvg is a utility to convert SVG objects to canvas objects, which can then be further used as images or whatnot.  Trouble is, it only converts the SVG elements, not the CSS applied to them.  So the top SVG in the screenshot below becomes the bottom canvas in the conversion.

[Screenshot: the styled SVG on top, and the unstyled canvas it converts to below]

Not what was needed.  The issue is that the blue in the SVG isn’t embedded as style attributes on the <rect> elements; it’s applied by the browser from the CSS stylesheet.

Luckily, in my project, there are only a handful of CSS rules dealing with the SVG elements.  So the solution was relatively simple: insert the styles into the SVG before proceeding, since SVG honors a <style> tag.

var svg1 = document.getElementById('svg_item'),
	width = svg1.width.baseVal.value,
	height = svg1.height.baseVal.value;
$('#canvas').width(width).height(height);
var source = (new XMLSerializer()).serializeToString(d3.select('#svg_item').node());
// The stylesheet rules that apply to the SVG, wrapped in a <style> tag.
// (The actual rules were specific to my project.)
var styles = '<style>/* ...the handful of SVG rules from the stylesheet... */</style>';
var first = source.indexOf('>') + 1; // end of the opening <svg> tag
source = source.slice(0, first) + styles + source.slice(first);
canvg('canvas', source);

What this does is inline the styles directly in the SVG string passed to canvg, before processing it.  The <style> block lands immediately after the opening <svg> tag, so the first two tags of the SVG now look like this:

[Screenshot: the serialized SVG source with the <style> tag inlined after the opening <svg> tag]

Much better.
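
If there had been more than a handful of rules, I could have scraped them out of document.styleSheets instead of hand-writing the string.  A rough sketch (the selector test is crude, purely for illustration):

function collectSvgCss() {
	var css = '';
	Array.prototype.forEach.call(document.styleSheets, function (sheet) {
		var rules;
		try { rules = sheet.cssRules; } catch (e) { return; } // cross-origin sheets throw
		if (!rules) return;
		Array.prototype.forEach.call(rules, function (rule) {
			// Keep any rule whose selector mentions the SVG elements we care about
			if (rule.selectorText && /svg|rect/.test(rule.selectorText)) {
				css += rule.cssText + '\n';
			}
		});
	});
	return '<style>' + css + '</style>';
}

// var styles = collectSvgCss(); // drop-in replacement for the hand-written string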

PuTTY, Pageant, and WHM (SSH Keys and how to use them)

How to use PuTTY to generate SSH Keys, install them on a WHM server (or cPanel account) and use Pageant to manage the keys

Because it took me a while to figure out how to do this right, I wanted to document the whole routine for my future self and anyone else who needs it.

First, if you haven’t already, download and install PuTTY – http://www.putty.org/

Open the folder C:\Program Files (x86)\PuTTY.

[Screenshot: the PuTTY install folder]

Using Puttygen.exe to generate an SSH Key

We’re going to create an SSH key that will identify you to the cPanel/WHM server.  Open up puttygen.exe (circled above), and click on Generate (circled below):

[Screenshot: the Generate button in PuTTYgen]

It will ask you to move your mouse around in the blank area provided, to generate some randomness.  This is about gathering enough entropy to generate a cryptographically secure key:

[Screenshot: PuTTYgen gathering randomness]

Once it’s satisfied, it’ll generate a key, and look like this:

[Screenshot: the generated key, ready for a passphrase]

Now you’re going to give it a passphrase and save it.  When you save it, name the key something useful so you can identify what server or user it belongs to later.  And give the public and private keys the same name before the extension, so you’ll know they belong together.

  1. Enter a passphrase (this will be asked for when you use the key).  This can accept spaces, so you could use a sentence if you want to.
  2. Save the public key (give it the extension .pub).
  3. Save the private key. (it will automatically have the extension .ppk)

SSH Keys disappearing in Windows

NOTE:  If you chose to save the keys in the PuTTY directory, you’ll see that the keys don’t appear in the folder.  It’ll look like nothing ever happened.  This is made even more mysterious by the fact that puttygen.exe and pageant.exe CAN see the files there.

[Screenshot: the PuTTY folder with no key files visible]

Here’s what’s happening: Windows hates you.  Windows thinks you’re really dumb at this point and is trying to protect you from yourself.  The files, under the covers, have been silently relocated.  You’ll now find the keys in

C:\Users\{USER}\AppData\Local\VirtualStore\Program Files (x86)\PuTTY

Don’t ask me why.  Microsoft didn’t include me in the meeting about this, but there they are.  I also wasn’t invited to the meeting where they explain how puttygen and pageant CAN still see the files in the location where you thought you put them. I really feel like I should be in on these meetings; I could save users a lot of headaches like this.

Installing the key on the WHM/cPanel server

This is exactly the same whether you’re doing it through WHM or cPanel.  The screenshots are basically the same.  The only difference is the level of access: root keys must be installed through WHM, while account-level keys are done through the cPanel account.  The screenshots below are from a cPanel user.

In WHM, go to Security Center > Manage root’s SSH Keys.

In cPanel, go to Security > SSH Access > Manage SSH Keys.

In either case, the button you’re interested in is Import Key.

[Screenshot: the Import Key button]

Use the same basename you gave the key as the key name.  Paste in the contents of the .ppk file (not the file itself – the contents; you’ll have to open the key file with Notepad or some such to do this).

While the Windows format of the private key file is different from the Linux (OpenSSH) format, don’t worry about that.  There’s no need to convert it – WHM/cPanel is smart enough to take care of that for you.  WHM/cPanel likes you.  Also, don’t worry about the public key field.  The .ppk file has both the public and private keys inside it; just paste the contents of the .ppk file into the private key box:

[Screenshot: pasting the key data into the private key box]

Now WHM/cPanel has your key in place.  But you still have to authorize it.

[Screenshot: the Manage SSH Keys screen]

Click on Manage, then Authorize, and you’re all set.  You can also deactivate a key just as easily.

Then delete the private key from the server; you only uploaded it because WHM/cPanel doesn’t always accept the public key as PuTTY generates it, but it does take the .ppk version and parse it correctly into the public and private halves.  You don’t want your private key on the server, just the public one.

You now have an SSH Key on your computer, and have installed it on your server.  Now let’s make it easy as pie to work with.

Setting up Pageant to manage your SSH Key(s)

Pageant is PuTTY’s SSH agent.  It validates you via the passphrase you entered, and as long as it remains running, every time you open a new SSH connection (or SFTP session) it presents the correct key to the server, so you don’t have to enter your passphrase on each new connection.

Go back to C:\Program Files (x86)\PuTTY and double click on pageant:

[Screenshot: pageant.exe in the PuTTY folder]

This will open up the program.  Click on Add Key, point it at the .ppk file you created earlier, and enter the passphrase.  You’re done.  Now every time you open a new SSH connection (or SFTP session) with the corresponding username, the login will take place without asking you for a password.

Now, to make this even better, let’s make Pageant start up automatically when Windows starts.  First, create a shortcut to pageant.exe and save it to your desktop.  Then navigate to your startup folder; any program or shortcut placed there is launched automatically at startup.  Its usual location is

C:\Users\{USER}\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup

Copy the shortcut into the startup folder:

[Screenshot: the shortcut copied into the Startup folder]

Now right-click on the shortcut and click Properties.  We’re going to have it load the key(s) you want automatically on startup.  In the Target field, add the full path to each key after the first entry.

[Screenshot: the shortcut’s Properties dialog]

The entries are strings surrounded by double quotes, with a space in between.  So, for instance, to load our new key automatically, the full Target line will look like this:

"C:\Program Files (x86)\PuTTY\pageant.exe" "C:\key location\fakekey.ppk"

Or more keys:

"C:\Program Files (x86)\PuTTY\pageant.exe" "C:\key location\fakekey.ppk" "C:\key location\anotherkey.ppk"

Then, the next time you start your computer, you’ll see pageant ask you for the passphrase for each key being loaded.

[Screenshot: Pageant asking for the passphrase]

Enter it now, and for as long as Pageant is running you’ll be logged straight in to the corresponding account.  Magic.

And you can manage your keys anytime, quickly, via the system tray.  You’ll find this little guy in there; that’s the Pageant icon.  Clicking it brings up the Pageant window, where you can manage the keys it’ll use.

[Screenshot: the Pageant system tray icon]

Open up a PuTTY session with your username (either stored or entered at the command line), and you’ll see you’re escorted through with no password:

[Screenshot: a PuTTY session logging in without a password prompt]

You’re welcome.


PHPExcel using php://output resulting in a zero byte Excel file

Today I had to debug something I hadn’t run into before.

I had some PHPExcel code that had been running fine for some time, outputting Excel files without a problem. If you’ve used the PHPExcel library before, you’ve probably written code that looks like this:

header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename=' . $filename . ".xlsx");
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
$objWriter->save("php://output");

Well, this stopped working recently. No error messages, no idea what was going on.

Turns out the problem was open_basedir. I’d recently done a little tightening of the security on the server, and had implemented the php.ini directive open_basedir. If you’re not familiar, this is a directive that keeps PHP scripts from playing outside their own sandbox. For example, if you’re running from /home/account, you can’t read or write to any directory outside of /home/account.

Well, the problem came from that php://output in the last line. I couldn’t find any documentation on the innards of how it’s implemented, but it must operate outside of the /home/account directory. Makes sense, since it’s a global PHP facility. So this had my code hosed up so that I couldn’t output an Excel file.

The solution? Simple. Write the file out to the filesystem, then read it back in, with a little delay to give the system time to make sure the whole thing is written before reading it back:

$temp_filename = "/home/" . $account. "/temp/" . $filename . "-" . microtime() . ".xlsx";

header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename=' . $filename . ".xlsx");
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');

$objWriter->save($temp_filename);
sleep(2);  // Give the system time to write out the file
if(file_exists($temp_filename))
{
	$data = file_get_contents($temp_filename);
	unlink($temp_filename);
	echo $data;
}

This bypasses the php://output facility, and makes sure that the code is operating fully within the allowed path(s) defined by open_basedir.
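
As an aside, if you’d rather not build the temp filename by hand, PHP’s tempnam() picks a unique name inside a directory you specify.  A sketch of that variation (assuming the temp directory exists inside the open_basedir path):

// Let PHP generate a unique temp name inside the allowed path
$temp_filename = tempnam("/home/" . $account . "/temp", "xls_");
$objWriter->save($temp_filename);
if(file_exists($temp_filename))
{
	readfile($temp_filename); // streams the file straight to the output buffer
	unlink($temp_filename);
}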

jQuery infinite loops on the change event – how to prevent them

Let’s say you have three date fields on a page, and all three need to be kept in sync.  Additionally, each time a field changes, you need to trigger a ‘change’ event.  This change event then triggers the same action on another field, and you’re in an infinite loop until you run out of stack space and crash.

Updating all three fields itself doesn’t cause a problem.  It’s triggering the ‘change’ event that gets you into an endless loop.

This can be solved by saving a flag in localStorage.  sessionStorage would work fine too, or any others like localdb, etc.

What we want to do is test whether we’re on the initial change, or one of the changes triggered by our own code.  So we store the millisecond we start, and test against it when the additional ‘change’ events get triggered.  The later ones will arrive at a slightly different time.

$('#date1, #date2, #date3').on('change', function(){
	var date = $(this).val();
	var d = new Date();
	var now = d.getTime();
	var upd = localStorage.getItem('updating_date');
	if(!upd) {
		localStorage.setItem('updating_date', now);
	} 
	
	// Update the fields
	$('#date1').val(date);
	$('#date2').val(date);
	$('#date3').val(date);
	
	// Only trigger the change if we're on the initial call - one millisecond later we won't be, preventing the loop
	if(localStorage.getItem('updating_date') == d.getTime()) {
		$('#date1').trigger('change');
		$('#date2').trigger('change');
		$('#date3').trigger('change');
	}
	setTimeout(function() {
		localStorage.removeItem('updating_date');
	}, 1);
});

Yes, I know. I’m re-setting the value and re-triggering the change event on the initial element again. For this use case I didn’t feel compelled to weed out the duplicate.
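
For the record, the same re-entrancy guard can be built from a plain variable in closure scope, skipping the storage round-trip entirely.  A minimal sketch (not the localStorage approach above):

var syncing = false;
$('#date1, #date2, #date3').on('change', function(){
	if(syncing) return; // ignore the changes we trigger ourselves
	syncing = true;
	var date = $(this).val();
	$('#date1, #date2, #date3').val(date).trigger('change');
	syncing = false; // jQuery runs triggered handlers synchronously, so this is safe
});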

The solution for a client who wants you to do all the little changes they could do themselves

I’ve already expressed my thoughts on using frameworks and CMSes, and how these are a poor fit for web use in many cases.  In that rant, however, I only touched on another, deeper issue involved.

Here’s the situation.  You develop a website for a client, and give them their own admin portal (wp-admin for example) to make their own updates and changes.  They asked for this.  They wanted to be able to control things themselves, they claimed.  So you obliged.  You gave them their site built on top of a framework or CMS or whatever, and handed them the keys.  They were delighted, and you thought that was going to be that.

And then two weeks later they want something changed.  Guess what happens?

In theory, they could make this change themselves.  After all, you’ve given them an admin panel to be able to do whatever they want.

So who ends up making this change?

You do.

After all the work to give them a website with all the possible customizations that they could ever want right at their fingertips, after giving them the total power and control they said that they wanted, they still come to you for changes they could make themselves.

Ugh.

How does this happen?  It’s simple, really.

There are two main paths to creating a website.

  1. Roll up your sleeves and learn HTML, CSS, JS, Etc.  Build whatever you want.
  2. Use a CMS or framework, which promises to do ‘everything you could ever need without any coding!’

The problem is this.  The learning curves for both of these are too steep for your average business owner. Your client has other fish to fry.  Their customers are calling them, ordering pizzas or scheduling cleanings, or whatever it is they do on a daily basis.  To your client, you are the one who handles the web stuff.

While the learning curve for CMSes is lower (in many cases) than learning HTML, it’s still not something most businesses are going to bother with.  And why would they?  They have you to lean on, and you’re the expert. So if they need something changed, even though they demanded that they be able to maintain it themselves, even though they said they wanted to be able to control it, at the end of the day, very few clients actually want to take on the maintenance themselves.

It boils down, really, to willingness to assume responsibility.  They don’t want to go into an environment that they’re not familiar with, and start making changes.  Even though how to do things in the backend may be plain to you, the client is afraid of messing things up.  They want you to take the responsibility for the website.

So what are you to do?

Simple.  Take responsibility for the website.

Be their guy to make whatever changes are needed, but understand that along with responsibility comes the concept of control.  You can tell clients what is and isn’t a good idea on their site.  You’re the expert.  So be the expert.

I will tell clients ‘no’ when they ask for changes that would shoot them in the foot.  Just flat out “No, we’re not going to make that change.  It’s a bad idea and I won’t let you do that to yourself.  Sorry.”

This may surprise you, but they don’t get offended. Instead they realize that they have someone watching out for them with their best interests in mind.  They are glad, and so far there have been no negative repercussions for having said ‘no’, since it’s always with good reason, which I explain to them.

“No, you’re not going to play music to the visitor on each page.  You may love this song but people despise uninvited music on web pages.  Proven fact.”

“Animated gifs – uh, no.  You want a site that looks modern and professional, not like it was designed by a kid in his parent’s basement in 1998.”

“Technically, yes.  I could have a penguin moonwalk backwards across the screen to ‘Eye of the Tiger’ wearing lipstick and holding a teapot, but – no.  We’re not going there.”  (okay, I made this one up)

Sometimes this puts you at odds with standard practices.  A popup asking the visitor to sign up for your newsletter as soon as the site loads?  Personally I HATE those, and just click through them; then remain slightly annoyed with the website for the imposition.  Turns out most people feel the same way.  And yet this distasteful practice is standard all around the web.  I won’t do it; I tell the client so. “If you want to gain subscribers, you have to offer them something of value; something that interests the user, in exchange for their email address.  That popup has to be in response to a user action, not rammed down their throat in exchange for the privilege of visiting your site.”

The first few times I did this I was slightly concerned I could lose their business.  I could just imagine a painting company saying “No, I won’t paint your office lime green.  You have to go with a more neutral tone” being fired offhand.

Frankly I am willing to walk away, rather than be associated with something I think is doomed to fail.  If you don’t think it’s a good idea, wouldn’t you feel dirty going along with it, just to keep the client?  I’d rather walk away than have my name on something I thought was garbage.

It’s never come down to that, though.  In fact, it’s the other way around.  The client is happy to get the advice.  They are glad to have some direction in a world where they are not expert.

I’ve really begun getting adventurous with taking responsibility for a client’s website. I’ve repeatedly gone so far as to search for, find and BUY the domain for their website, without even consulting them.  I tell them after the fact, here’s your website, the address is xyz.com.

When I first began there was no way I would presume to take such a leap on behalf of a client.  But I can’t tell you how many times I’ve had the experience of going through months of agonizing over a domain name. You research possible names that are available, you give them the list.  Weeks go by.  By the time they land on what they want, the domain’s been snapped up and you’re back to square one.  Perhaps you’ve had a similar experience.

Or worse, you ask them what their domain name should be.  Might as well ask them to pay you in moon cheese. Good luck with that.

One would think that there’d be blowback from presuming to pick a domain name and register it for a client.  Nope.  Not so far.  Not one has even batted an eye at it.

After all, you’re the one taking responsibility for this.  So be that guy, all the way down the line.

This also gets into technology, where we started.  Again, you can build a website with HTML, CSS, etc, or you can use a framework.  If you’re going to be taking responsibility for their site, pick the technology that you feel is best for the project.  And tell the client that’s what you’re using, and tell them why.

My feeling is that there is a point at which client involvement in the process becomes burdensome.  To both parties.  If you go to the client to ask how they like each individual element of their project, not only are you setting yourself up for endless, needless meetings, but you’re probably also hampering the client from getting on with their business too.  After all, they hired you to make these decisions, why would you keep asking for their direction on minor details?

The big picture?  Yes.  You need to know about what they want.  But don’t then keep pestering them.  You get on with the website and let them get on with their business. Once they see it, if there’s something they don’t like, they’ll tell you.

Imagine going to the dentist and being constantly asked how you want things done.  That would be absurd, and you’d leave with the idea that the dentist wasn’t a professional who knew their business.

If the client demands a CMS where they can maintain things themselves, fine.  Have the discussion.  Find out if they have any real plans to maintain their own site.  If not, tell them no.  Or yes, or whatever you want.  The needs of some clients may never stray outside of what a framework can deliver.  So maybe use WordPress or a CMS, despite the fact that the performance won’t be the same, particularly if they blow up in popularity.  It’ll also hog-tie the site into the limits of the framework, but again, the client may never need to stray outside those confines.  I probably wouldn’t do it, but that’s me.

But the point is that at the end of the day, you’re the one who is going to be taking responsibility for the site. So stand up on your own two feet and do so.

Your client will be glad you did.


The case against server-side frameworks

I’m writing this post in WordPress, something I never thought I’d be doing.  I personally try to avoid such frameworks.

Why?  It’s simple.  Most of the work I do is custom coding; developing applications involving logging in and accessing and manipulating data that businesses and their clients rely on.  That’s not a good fit for a WordPress site.  Hammering such work into such a framework would be painful to do and even worse to maintain.

I don’t use WordPress.  I don’t use CakePHP.  I don’t use the ZEND framework.  Joomla.  Magento.  Countless others.  (Yes, I’m throwing frameworks, shopping carts and CMSes all in the same box – sue me)

I don’t use frameworks, and most of my work is  (gasp, get ready) – procedural code.  Horror!  Not even Object Oriented.  Pure insanity.

I write my code from scratch, and it’s procedural.  And it runs very, very fast.

Let’s take a look at a few facts.  Writing a class in PHP, defining a method in that class, and then calling that method to execute some block of code runs around 33% slower than simply writing the code in a function and calling it directly.  33%.  Sounds almost trivial, right?
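
(If you want to check that kind of number yourself, a quick micro-benchmark along these lines will do it.  A sketch only; the ratio will vary with PHP version and hardware.)

function plain() { return 1 + 1; }
class Wrapper { public function method() { return 1 + 1; } }

$n = 1000000;
$w = new Wrapper();

$t = microtime(true);
for($i = 0; $i < $n; $i++) plain();
printf("function: %.3fs\n", microtime(true) - $t);

$t = microtime(true);
for($i = 0; $i < $n; $i++) $w->method();
printf("method:   %.3fs\n", microtime(true) - $t);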

Let’s scale that up.  Let’s say you’ve got a whole library of classes you’ve developed over the years, and pretty much everything you do is using an object of one type or another.  To load a single page you might have fifty or more methods that get called and executed.  That 33% could mean the difference between a three second page load and a two second page load.  Okay, still no big deal.

But now let’s add in one of these pre-packaged frameworks.  WordPress.  Magento.  Drupal.  Choose your poison.

Now, your essential bit of code gets wrapped inside a larger framework.  This framework is responsible for drawing the page that the visitor to your site eventually sees.  But the frameworks also offer something else.  They offer a promise to someone with no coding experience, that by using this framework, they’ll be able to easily create and maintain their own website, without having to hire someone.  Sounds nice. (More about that in another post)

How do they offer this convenience?  Easy.  Everything goes in the database.  The user uploads an image to appear on every page.  The file gets stored in the filesystem, and a reference to its name and location gets stored in the database.  The user gives a name to the site, and that gets stored in the database.  The user defines what the items on the menu are, and each one gets a corresponding entry in the database.  So far, so good.

But now, when it comes time to draw the page and show it to the visitor, the framework has to go to – you guessed it: the database.  Over and over again.  Depending on the complexity of the site, and the number of plugins used, a WordPress (Magento, Joomla….) site can hit the database three hundred (or more) times just to figure out what the page is meant to look like, what the content it’s serving is, what the footer contains, etc.

As far as performance goes, retrieving a value from a database stored on a platter drive (still the predominant technology, although SSD drives are becoming more common on servers) ONE TIME can consume fifty million CPU cycles.  Multiply that by possibly three hundred database calls.  Keep in mind this is just to draw ONE page, one time, for one visitor.  Now say you have a somewhat popular site and fifty or a hundred people want to view that page in the same second.  Wow.  Suddenly your site is brought to a crawl.

Instead of serving HTML, it’s now compiling HTML with the help of a (slow) database.

I’m sorry, but I find this insane.  Remember, no matter the underlying technology used, what goes to the browser is still HTML. You can use WordPress, Joomla, Magento, CakePHP, Zencart, whatever, and at the end of the day, what it’s going to send to your visitor is HTML.  (Yes, yes, and CSS, Javascript, images, etc.  But all those get called by the html to begin with)

Listen, the header image of your site didn’t change in the last hour, and in all probability, it won’t change in the next two weeks.  So why is WordPress hitting your database to determine what your header is?  Because it has to.  Because it’s a one-size-fits-all-and-you-can-maintain-your-own-website! framework.  Same for the menu.  Instead of just firing the menu back as part of the page, it’s compiling the menu.  The same menu it served up one second ago to a different visitor, it’s now busy re-computing, to come up with the same answer it had one second ago, again.  And again.  And again.

And it’s Object Oriented.  Yay!  Whoo hoo.

I’ve seen WordPress sites (hosted on fast equipment) take ten seconds to load on a fast connection.  And Magento sites take twenty+ seconds to load a page.

The worst instance of horrible performance I’ve ever seen was a ZEND framework site that literally takes upward of forty seconds to load a single page.  Sometimes over a minute.  Time enough for a pee break between the click and the page load.

I had the opportunity to look at the codebase that was responsible for this.  It was a beautiful, well-formed, best-practices-heavily-adhered-to MVC nightmare.  To trace how a page came to be born you had to weave your way through a multitude of short bits of code.  The guy I was doing this for had been tasked with changing the styling.  Just the look and feel of parts of it.  Lord have mercy on his soul.  This poor man (not a coder, but an HTML/CSS guy) had to go through endless small little files to find where a certain element had come from, just to piece together how this go-make-some-coffee-while-your-page-loads monster came to be, so he could change it.  Ugh.

Call me crazy, but I think there’s a better way.  I understand the idea behind an MVC framework.  Honest, I do.  I understand the principles of Object Oriented software.  Really, I swear I get it.  I even use it in its proper place.

[Image: a local application]

When I do .NET development it’s all Object Oriented.  That’s the way it’s done in that environment.  But consider the use case.  An application that runs on a user’s own computer is a single copy of the entire framework and everything that’s needed, on the user’s own hardware.  This is an important point.  If your application takes up 5-10% of the resources on the computer (not too hard to do, by the way), that’s fine, because the CPU isn’t busy serving twenty other similar requests in the same second.

That’s your application running on a personal computer.  Even if you’re tying up 10% of the CPU cycles and  20% of the RAM, so what?  Everyone’s happy.  Use Object Oriented code.  In fact you kind of have to.  Go to town with MVC frameworks.  It’s not going to make a big performance difference.

But doing that on a server is, in my humble opinion, MADNESS.  You’re dealing with a SERVER.  It’s meant to serve files, data, images, whatever.  And it’s meant to do it in quantity.  Over and over.

To give your server the additional job of hitting the database to find out what the page is meant to look like, calculating and compiling it all, and, to compound the felony, to do it all through OO and MVC frameworks, fuck, you’re just asking for trouble, aren’t you?

[Image: many users requesting a website at once]

Let’s back up.  Let’s look at real-world facts.  Almost all sites are hosted on shared resources.  By resources I mean the hardware the site runs on – the CPU, RAM, hard disk, etc.  The few sites that actually command their own hardware are rare.  You’re most likely going to be on a server with other websites.  Whether that means you’re on a box with other clients of the guy/company who put your site together, or if you go with hosting from Godaddy, Bluehost, Hostgator, etc. (all terrible options, by the way), you’re going to end up with other sites competing for the resources of the server.  And I dare say that most of those sites also run on some you-can-build-and-maintain-your-own-site framework too.  I feel bad for you.  Really.

When a visitor comes to your site, the shit hits the fan.  Guess what, your visitor has to wait.  Why?  Because the server you’re on is busy with the last thirty requests that came in within the last second.  Probably not for your site, but for one of your roommates (most hosting companies throw a lot of sites on one box).  The CPU you’re depending on is busy calculating what the menu is to look like for a site.  Over and over.  Eventually it will get to your visitor’s request.  Eventually.

Maybe you’re lucky.  Maybe your visitor only has to wait a second or two before it starts calculating what your menu looks like this exact second, which is, by the way, exactly what your menu looked like three seconds ago when serving another request for your site.

I could go on, but you probably already get the point.  Why do this to begin with?  Why not man up, put your big boy pants on and code the HTML that is to be served to the visitor directly?  At the end of the day, that’s what the request will result in.

HTML.  Hyper Text Markup Language.  That’s what’s going to the browser no matter what you put in between the request and the response.

Coming up with the same answer, over and over again

Dynamic (not always the same) pages require some additional technology, like PHP to decide what the different parts are going to be.  But why lean on the slowest technologies you can find (database calls) to determine what’s already known?  Fuck, you haven’t changed your header in a month.  Why is your server, today, going to the database again to draw the same damn menu?  I find this arrangement stupid.

[Image: what is your server doing?]

True, the days when you could just code pages in straight HTML are long, long gone.  But let’s be reasonable.  We don’t have to hit the database to draw the same header, same footer, same menu every time.  Dynamic sites require some processing, no doubt.  But extending that dynamic-ness into every single corner of your website?  That’s taking a good thing and killing yourself with it.  Your body can’t live without water, but did you know that you could die if you drank three gallons in the next ten minutes?

If you want a faster website, you’re going to have to grow up.  You can’t run to mama (WordPress, Joomla…) to save you.  True, there are caching options, and other performance upgrades that can help speed up a sluggish site or server.  But we need to stop making servers hit databases for answers that don’t have to be stored in databases.

And Object Orientation, MVC, frameworks, these are all good technologies.  But they were designed for a different use.  They were designed for software that runs on a client’s OWN MACHINE.  Photoshop is going to run one instance on your computer.  Not twenty instances per second.  I personally don’t feel that OO or MVC have any place being run on a server.  At all.  The resources the Objects and frameworks tie up are too precious on a server.  Other than bandwidth (and all the stupid bloated content many sites serve), this is why you have to wait ten seconds for some pages to load on your phone.

I don’t know about you, but personally, if I’m waiting ten seconds for a page to load, I’m gone.  Doesn’t matter how interesting the link looked, I’m not waiting that long for the content.  Sorry; I can’t be bothered to wait.

The great crime, mixing presentation and logic

The way I do things might look pretty weird to other coders.  They might think my coding was ancient, out of date, and that it belonged on the Antiques Roadshow.  But my code runs fast.  By not using OO (except in libraries where I have to), by not using an MVC framework on the server, I save CPU cycles.  Lots of them.

The common elements of a page are in a header file containing straight HTML.  Same with the footer.  Gasp – straight HTML?  Yep.  That means I can’t rely on a framework.  Oh well.  At least the server isn’t doing any calculating.  Guess what, if the header has to change I’m the one who has to do it.  Time for the big boy pants.

The parts that are truly dynamic are written in PHP.  And may the gods of ‘best practices’ show me mercy – they’re written without the benefits of Object Orientation.  The horror.

And, even worse (I’ll probably be burned at the stake) I’ll mix HTML in with my PHP.  Heck, I’ll even have my PHP determine which lines of Javascript get sent to the browser.  Imagine!  PHP writing Javascript?  Insanity.  But I figure why send the browser Javascript that isn’t needed for that particular page?  Strange concept, I know.
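
The pattern is nothing fancy.  Something like this (a trivial sketch; the file and variable names are made up):

<?php require 'header.html'; // straight HTML, no computing ?>

<?php if($page == 'reports'): ?>
<script src="/js/charts.js"></script> <!-- only pages that need it get it -->
<?php endif; ?>

<?php require 'footer.html'; ?>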

And yet, with all the ‘best practices’ telling you the necessity of keeping the presentation and the logic separate, I can still find code like this (not my code), embedded 1,200 lines deep inside a ZEND framework CONTROLLER file (which shouldn’t have any VIEW parts in it, right?):

[Screenshot: HTML embedded in a ZEND framework controller file]

I’m sorry, but did I miss something?  I thought that the MVC framework was supposed to keep the HTML (view) completely separate from the logic (controller)?  So here we are, both me (the heretic) and the ‘best-practices’ coders sleeping in the same bed – presentation and logic in the same place.  How would you like to maintain an MVC project where a certain anchor tag is 1,200 lines down inside an MVC Controller file?  And the rest of the page was composed from thirty different other files?

Not me.  I find it simpler and more effective to just write straight HTML where I need to, and put the dynamic logic right where it’s needed.  It’s light on its feet, it’s not too difficult to maintain, and it runs fast.

The results

The proof of this is easily seen by taking a look at ‘top’ (the Linux tool that shows which processes are using what amount of resources, refreshed at one-second intervals).  This is my code running on a server, servicing four requests in one particular second:

[Screenshot: top output for my server, servicing four requests]

As you can see, the CPU is completely yawning along at 1.7% usage.  Each individual HTTP request is taking between 0.3 and 0.7 percent of the CPU.  And let me assure you these aren’t static pages.  They’re fully dynamic, serving custom content to each user based on where they are in the app.

Now, for comparison, here’s an older version of this same app written in Java (not my code), also servicing four requests in a given second:

[Screenshot: top output for the older Java version of the same app]

As you can see, Java is taking up 162.6% of one CPU, just to service these four requests.  Java is of course the ultimate in Object Oriented frameworks.  Which server do you think is going to have an easier time when twenty requests come in one second?

Here’s another application, also written in Java.  This one is chewing up 88% of four CPUs to service five incoming requests.  Five requests!  What happens when there are seven?  You guessed it; serious lag.

[Screenshot: top output for the second Java application]

To repeat, I don’t believe Object Oriented code belongs on a server.  This is why.

A shift in paradigm to the rescue

Okay, you’ve been taught you haaaave to have an MVC framework.  Great.  Move it to the client side, as has been happening recently.  It’s actually a pretty good move.

The move to client-side frameworks is completely correct.  Packing all the CPU-intensive OO and MVC code into JavaScript and having the user’s machine do all the work is the right way to go.  jQuery, Angular, and other frameworks built on JavaScript take the burden of compiling the page off of the server, and put it onto the client computer.  This leaves the server doing very little calculating and lets it just serve out the files – the client’s machine does the work of figuring out what the page is supposed to look like and contain.  That’s a beautiful arrangement and I’m happy to see the moves in this direction.  It won’t save you from your other crimes of sending out 2MB images – your bloated content is a different topic – but at least you’re not overloading your server by making it recalculate your navigation ten million times a day.

But for god’s sake – please let’s forget about server side frameworks.  The future of the web belongs on the client side; let’s get there sooner rather than later.

FPDF and the Check Mark

For years I’ve been creating PDFs with the lightweight and easy-to-control FPDF library.  Occasionally I’ve been called on to create checkboxes, and have always fallen back to just using an “X” to indicate the checked-ness of the box.

The reason for this was that I couldn’t find any workable solution for creating the checks.  It’s apparently searched for often enough, but there are no good solutions, other than falling back to an image file, which I think is overkill for this need.

I had a client recently ask me if he could have actual checkmarks in the PDF instead of the X’s, and I told him no, there was no good way to do this.  I told him I’d already researched it fruitlessly several times with no good answer.

This rubbed me the wrong way, having to tell a client ‘No,’ so I did some further digging and found the solution is not only simple, it’s been sitting in front of us since the beginning.

Installed by default in the FPDF library are five fonts:

  • Courier
  • Helvetica (Arial)
  • Times
  • Symbol
  • ZapfDingbats

My personal disdain for symbol-type fonts kept me from looking at the last two until I set out to solve this checkmark problem.  All it took was looking at a map of the character set to see how easy this should be:

[Image: the ZapfDingbats character map]

And it was just that easy.  Here’s the final code to produce the checkmark in a box:

$pdf->SetX($left_margin);
if($boolean_variable == true) {
	$check = "4"; // "4" maps to a check mark glyph in ZapfDingbats
} else {
	$check = "";
}
$pdf->SetFont('ZapfDingbats', '', 10);
$pdf->Cell($checkbox_size, $checkbox_size, $check, 1, 0);

Setting $check to the string “4” in ZapfDingbats (may the lords of decency have mercy on my soul) gives the checkbox right in the PDF without resorting to an image file:

[Screenshot: the check mark rendered inside a box in the PDF]

Problem solved.  We now have Checkmarks in a PDF, using FPDF without any headaches.

PEAR Spreadsheet_Excel_Writer errors on a modern server (PHP 5.6)

I’ve been using the PEAR library Spreadsheet_Excel_Writer for years.  It’s a relatively easy way to generate .xls files in the native format, but it hasn’t been maintained for a while now.  So I’ve got some older code running this library, and want to keep it running.

I just recently upgraded to a newer server, which involved moving from CentOS 5 to 7, and from PHP 5.4 to 5.6.  I’m not sure which of those changes broke my old code, since they were simultaneous, but previously working downloads of spreadsheets now started to give an error reading:

We found a problem with some content in ‘filename.xls’. Do you want us to try to recover as much as we can? …

[Screenshot: the Excel “problem with some content” error dialog]

And, of course, clicking Yes doesn’t do anything useful to allow the file to be opened.

I didn’t want to have to rewrite the old code to use a newer library, like PHPExcel.  This specific error message led me to a solution that had taken someone else two days to figure out.  It was a lifesaver.

This assumes that you have control over your server, as the change is made outside of the /home/account/ directory that you’re given on a cPanel server.  If you don’t have control over your server, you’ll have to send your host to this page for instructions.

The fix is very simple.  Change line 623 of

/usr/local/lib/php/OLE/PPS/Root.php

from

fwrite($FILE, pack("V", 1));

to

fwrite($FILE, pack("V", $num_sb_blocks));

and now your old Spreadsheet Excel Writer code can keep on working, without involving a rewrite of your code.

PHPExcel vs an Excel file with a million rows

Okay, this one is something I ran into today, and I wanted to document its occurrence so I can refer back to the issue later; and thought maybe this could help someone else too.

PHPExcel is an excellent library for manipulating Excel files in PHP.  I started using this last year for a project where I needed a bit more than the old PEAR library offered.

(A few years back I would have just used the PEAR library – Spreadsheet Excel Reader/Spreadsheet Excel Writer, but it doesn’t seem these are currently being maintained, and some pretty gnarly hacks had to be done on a new server running the Writer library just to keep some older code functioning.  That’ll be the subject of another post.)

While PHPExcel is great at sophisticated manipulation of Excel files, it does have its demons.  One of the main issues is its memory usage.  Not really a big deal until you get into files with tens of thousands of rows; then you can be in trouble.

But I ran into a very weird situation trying to upload a very small file and process it.

The server I was using had 8GB of RAM and a recent processor, so it certainly SHOULD have been able to handle a measly upload of FIVE ROWS of pertinent data (with another five of preliminary data).  But no.  I was getting the dreaded 30 seconds of nothing, followed by the famous

Fatal error: Maximum execution time of 30 seconds exceeded in

This I’ve seen on a file with 60,000 long rows of data, but on a paltry TEN rows of total data?  Nonsense.

Turns out this spreadsheet was different, though.  This spreadsheet was sneaky.  The file had those ten rows of data (only five of which I cared about), but after scratching my head, and trying

ini_set('max_execution_time', 300);
ini_set('memory_limit', -1);

and other things for a while, I finally just looked at the file and noticed something odd.  The scrollbar’s ‘grab point’ was small.  Way too small for ten lines of data.

[Screenshot: the scrollbar with a tiny grab point]

Grabbing it and scrolling all the way down took me to row 1048576.  This spreadsheet had 1,048,576 rows in it.

[Screenshot: Excel scrolled down to row 1,048,576]

Turns out they all do, actually.  But for some reason, Excel considered the entire thing ‘active.’  I tried deleting the extraneous rows.  No luck.  Testing what Excel considered the ‘last cell with data’ (press Ctrl+End in Excel) put it at column AC on row 1,048,575, one row short of the limit.

So the file was being treated as having over a million active rows, times 20+ columns, and since each cell in PHPExcel can consume around 1KB of memory, I was choked.

So I tried telling PHPExcel to just give me a smaller range of cells.  Instead of calling:

$rows = $excel->getActiveSheet()->toArray(null,true,true,true);

I tried

$rows = $excel->getActiveSheet()->rangeToArray('A1:W10');

Still no luck.  Still running out of memory.  Not PHP’s memory limit, mind you, but the server’s memory limit.  The previous call to

$excel = $reader->load($file);

took in the whole thing, even if the later call referenced only the few rows I was interested in.

The solution was blindingly simple.  Copy the five rows of data I was interested in into a blank spreadsheet (with the same five preliminary rows, but WITHOUT a million active rows).  Uploading that and processing it gave the usual results.
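
(Postscript: PHPExcel does have a built-in way to attack this at load time, a read filter, so that load() never parses the phantom rows.  A sketch, assuming the Excel2007 reader; the class name is mine:)

class FirstRowsFilter implements PHPExcel_Reader_IReadFilter
{
	public function readCell($column, $row, $worksheetName = '')
	{
		// Only parse rows 1-10, columns A through W
		return $row <= 10 && in_array($column, range('A', 'W'));
	}
}

$reader = PHPExcel_IOFactory::createReader('Excel2007');
$reader->setReadFilter(new FirstRowsFilter());
$excel = $reader->load($file); // the phantom rows never make it into memory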