PHPExcel vs an Excel file with a million rows

Okay, this one is something I ran into today, and I wanted to document it’s occurrence, so I can refer back to the issue later; and thought maybe this could help someone else too.

PHPExcel is an excellent library for manipulating Excel files in PHP.  I started using this last year for a project where I needed a bit more than the old PEAR library offered.

(A few years back I would have just used the PEAR library – Spreadsheet Excel Reader/Spreadsheet Excel Writer, but it doesn’t seem these are currently being maintained, and some pretty gnarly hacks had to be done on a new server running the Writer library just to keep some older code functioning.  That’ll be the subject of another post.)

While PHPExcel is great at sophisticated manipulation of Excel files; it does have its demons.  One of the main issues being its memory usage. Not really a big deal until you get into files with tens of thousands of rows.  Then you can be in trouble.

But I ran into a very weird situation trying to upload a very small file and process it.

The server I was using had 8GB of RAM and a recent processor, so it certainly **should have** been able to handle a measly upload of FIVE ROWS of pertinent data (with another five of preliminary data).  But no.  I was getting the dreaded 30 seconds of nothing, followed by the famous

Fatal error: Maximum execution time of 30 seconds exceeded in

This I’ve seen on a file with 60,000 long rows of data, but on a paltry TEN rows of total data?  Nonsense.

Turns out that this spreadsheet was different tho.  This spreadsheet was sneaky.  The file had those ten rows of data (only five of which I cared about), but after scratching my head, and trying

ini_set(max_excution_time, 300);
ini_set(memory_limi , -1);

and other things for a while, I finally just looked at the file and noticed something odd.  The scrollbar’s ‘grab point’ was small.  Way too small for ten lines of data.

small_scrollbar

Grabbing it and scrolling all the way down took me to row 1048576.  This spreadsheet had 1,048,576 rows in it.

last_row

Turns out they all do, actually.  But for some reason, Excel considered the entire thing ‘Active.’  I tried deleting the extraneous rows.  No luck.  In testing what Excel considered to be the ‘last cell with data’ it was going to column AC on row 1,048,575, one row short of the limit. (To do this, press Ctrl+End in Excel)

So the file was being considered to have over a million active rows, times 20+ columns, and each cell in PHPExcel can consume around 1kb of memory; I was choked.

So I tried telling PHPExcel to just give me a smaller range of cells.  Instead of calling:

$rows = $excel->getActiveSheet()->toArray(null,true,true,true);

I tried

$rows = $excel->getActiveSheet()->rangeToArray(‘A1:W10’);

Still no luck.  Still running out of memory.  Not PHP’s memory limit, mind you, but the server’s memory limit.  The previous call to

$excel = $reader->load($file);

took in the whole thing, even if the later call referenced only the few rows I’m interested in.

The solution was blindingly simple.  Copy the five rows of data I’m interested in, into a blank spreadsheet (with the same five preliminary rows, but WITHOUT a million active rows).  Uploading that and processing it gave the usual results.

 

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>