Category Archives: Articles

Articles I Write

Continuous Integration Engine

I have been working with our brand new continuous integration engine for about a year now and I have to say it is the best thing that has happened to our programming team.

For those who don’t know what continuous integration is, here is the definition from Wikipedia:

Continuous integration (CI) is the practice, in software engineering, of merging all developer working copies with a shared mainline several times a day.

There are many implementations of CI in the world.  We work mostly with PHP, JavaScript and MySQL and chose to glue it all together with a free application called Jenkins.  Jenkins allows us to automate all of the release, build and testing tasks that we would otherwise have to remember and coordinate across our team.


Our solution uses Subversion to store our code in a repository.  There are many other solutions out there and you can read about my reasons for choosing Subversion here. Whenever we check something into our repository, it is configured to notify Jenkins.  Jenkins then kicks off a job that checks out a build copy and stands it up on our development website.  The whole process takes a couple of seconds.
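The "notify Jenkins" part is just a Subversion post-commit hook. Here is a minimal sketch, written out to a file here rather than installed into a live repository's hooks/ directory; the Jenkins URL, job name and token are placeholders, not our real ones:

```shell
# Write a minimal post-commit hook that pings Jenkins' remote-build trigger.
cat > post-commit <<'EOF'
#!/bin/sh
REPOS="$1"   # repository path, passed in by Subversion
REV="$2"     # revision that was just committed
# placeholder Jenkins URL and job; the token must match the job's trigger config
/usr/bin/curl -s "http://jenkins.example.com:8080/job/dev-build/build?token=SECRET&cause=r$REV"
EOF
chmod +x post-commit
```

Dropped into the repository's hooks/ directory, this fires on every commit, which is what makes the dev build feel instant.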

The system is designed to use unit testing (J-J-J-J-J UNIT!) to validate the changes and, with confidence, produce a staging release copy.  The staging release copy is filtered in a special way to include only the relevant files, since there are a number of files we don’t want from our development site, such as the database configuration and theme files.
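The filtering step can be sketched with tar excludes (rsync's --exclude flags work the same way); the excluded file names here are examples, not our real list:

```shell
# Copy a working tree into a clean release directory, leaving dev-only
# files (DB config, themes, .svn metadata) behind.
make_release_copy() {
  src="$1"; dest="$2"
  mkdir -p "$dest"
  # tar --exclude keeps the listed paths out of the archive stream
  tar -C "$src" -cf - --exclude='config/database.php' --exclude='themes' --exclude='.svn' . \
    | tar -C "$dest" -xf -
}
```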

If everything checks out the system releases to our staging environment automatically.  It merges the release from earlier with some other files required to make the staging site unique.  We run through a series of tests and again if everything checks out it is released into production.



Diagram of a common SVN/Jenkins build process.

One of the biggest challenges we had with this system was handling code changes combined with database changes.  Our databases in development, staging and production are all different.  In our case we take a replicated copy of the production database, restore it into staging and run some modification scripts against it to simulate exactly what will happen in the production release.  Perhaps a future post will elaborate on how this is achieved.  At any rate, we are happily collaborating and releasing into production without breaking things or stepping on toes; this is a major advancement for our team, as we had been doing it the old way for so long.


How to Make a Virtual Machine (VM)

I’m going to show you how to create a Windows 8 virtual machine in 3 easy steps.

  1. Create
  2. Edit
  3. Install

A virtual machine is basically a computer that is inside of a computer. There are many intrinsic benefits to running a VM, such as portability: there is usually only one file or two, and it can run on many different kinds of computers, such as Apple, Linux and Windows machines.  I intend on keeping my VM on a USB drive so I can bring it with me and not carry a giant laptop.

There are many options when it comes to virtual computers.  They have been around for years and constitute a very mature technology now.  I chose to use VirtualBox from Oracle.  There is a free license available, and it offers a lot of features such as snapshots and a full hypervisor that runs on Windows, OSX, Linux and Solaris. You can get it from Oracle's site; download and install, you know the drill. Once it is installed you can fire it up and create all sorts of VMs.  I was even able to create an Android phone on a MacBook Air.

Oracle VM VirtualBox

The last VM I created was OSX.  I configured it to run seamlessly in full screen, and it works quite well for my testing and for running Objective-C on Windows. For the purposes of this tutorial I have opted to create a Windows 8 VM, since I already have an Apple one and I run my Droid VMs inside of Eclipse instead of as standalone VMs, for development convenience.



Step 1 – Create a New VM. I clicked New, chose a title, 2GB of RAM and chose to create a new hard drive [25GB] of the default type, “VDI”, but you could select many other options such as VMWare (VMDK) or Microsoft (VHD).  I prefer VDI since that is the native format of the hypervisor I am running in this case. For everything else I chose the defaults, such as dynamic allocation, and finished the wizard.

step 1 - name

Step 2 – I edited my newly created VM and added network support [default settings].  I want to be able to activate Windows and eventually host a suite of tools. I went to the storage section and mounted the Windows 8 ISO to the virtual CD/DVD drive.  If you have the actual DVD, you can easily mount it from your host computer’s drive.

step 2 - mount ISO

Step 3 – Install Windows.  Start the VM and install Windows.
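For the command-line inclined, the same three steps can be scripted with VBoxManage, VirtualBox's CLI. This is a sketch only, written to a file rather than run (VirtualBox may not be installed where you read this), and flag names can drift between VirtualBox versions:

```shell
# Write out a script mirroring the three GUI steps above.
cat > create-win8-vm.sh <<'EOF'
#!/bin/sh
# Step 1 - create and register the VM with 2GB RAM and a 25GB VDI disk
VBoxManage createvm --name "Win8" --ostype Windows8_64 --register
VBoxManage modifyvm "Win8" --memory 2048 --nic1 nat
VBoxManage createhd --filename Win8.vdi --size 25600
# Step 2 - attach the disk and mount the installer ISO (path is a placeholder)
VBoxManage storagectl "Win8" --name SATA --add sata
VBoxManage storageattach "Win8" --storagectl SATA --port 0 --device 0 --type hdd --medium Win8.vdi
VBoxManage storageattach "Win8" --storagectl SATA --port 1 --device 0 --type dvddrive --medium windows8.iso
# Step 3 - boot the VM and run the Windows installer
VBoxManage startvm "Win8"
EOF
chmod +x create-win8-vm.sh
```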

step 3 - install

That’s it; it wasn’t that hard, was it?


MySQL Database Maintenance

Lately I have been working a lot with MySQL.  I’m not doing anything like Facebook does with their 9000+ MySQL instances, but I am dealing with some fairly steady database load issues that require proper attention when scaling up the underlying website.

The server itself is running on some fairly decent hardware, with the main databases using SSD drives for primary storage.  Of course, I don’t fully trust flash storage yet for databases, and a backup plan is in order.  Since this particular database uses the MyISAM engine, my options for locking the database are drastically limited.

Performing a mysqldump on this database will effectively lock every table and cause live transactions to queue up while the dump is occurring.  On a DB that is 74GB and growing fast, I can’t just dump the data and freeze up the system while everybody is using it.  For this reason I decided to do some automated table archiving before doing the dumps.

1. Archive old data using a script (MySQL w/ PHP)

2. Optimize tables if they need it

3. MySQL Hotcopy the database and flush logs (MySQL w/ bash)

4. Dump the copied database using a script every hour to tier 2 storage

5. Backup the dumps every 4 hours to tier 3 storage/backups

In this case there is a lot of historical transaction data, statistics and messaging history that is really not referenced much after its time has come and gone.  We still want to keep it but we don’t really need it consuming resources in the main production database.  I decided we should create a logging database and send over the data that was older than some arbitrary time period.  I achieved this by declaring a new database and used the following SQL command for each table:

CREATE TABLE IF NOT EXISTS log_database.stats_table LIKE prod_database.stats_table;

The result is an identical schema copy of the source table. Sweet! Now on to the next step: data movement.  This was achieved with another SQL statement:

SELECT MAX(stat_id) FROM prod_database.stats_table WHERE stat_added < (NOW() - INTERVAL 1 HOUR);

The result of the above statement gives me the newest key in that table that is more than about an hour old.

So I have my key and I have my target table… now to move the data across.  I will do the old heave-ho using the REPLACE command.  It’s the best choice here because I might end up in a situation where a duplicate key exists in the target database.  In this case I want to keep the same numbering system and I would like to update the data, and the REPLACE command is perfect for that; definitely better than a plain INSERT … SELECT.

I will move my data like this:

REPLACE INTO log_database.stats_table SELECT * FROM prod_database.stats_table WHERE prod_database.stats_table.stat_id < MY_KEY_FROM_ABOVE

DELETE FROM prod_database.stats_table WHERE prod_database.stats_table.stat_id < MY_KEY_FROM_ABOVE

After deleting all that data I should optimize it and clean up the fragmentation.

OPTIMIZE TABLE prod_database.stats_table

I took the above statements and wrapped it all in a PHP script, which can run as a webpage or standalone as a command-line script.  I used the web server for convenience, to graphically see what I was doing while capturing all of the output into a single stream that could be displayed (web mode) or dumped to a file (shell mode).
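My wrapper is PHP, but the same three statements can also be driven straight from the mysql command-line client. A hedged sketch, written out to a file rather than executed here; the credentials file path, database names and the one-hour window are assumptions:

```shell
# Write out a cron-able version of the archive step described above.
cat > archive_stats.sh <<'EOF'
#!/bin/sh
CNF=/etc/mysql/backup.cnf   # assumed credentials file for the mysql client

# find the newest stat_id that is more than an hour old
KEY=$(mysql --defaults-extra-file=$CNF -N -B -e \
  "SELECT MAX(stat_id) FROM prod_database.stats_table
   WHERE stat_added < (NOW() - INTERVAL 1 HOUR);")
[ -n "$KEY" ] && [ "$KEY" != "NULL" ] || exit 0   # nothing old enough yet

# move, trim, then defragment
mysql --defaults-extra-file=$CNF -e \
  "REPLACE INTO log_database.stats_table
     SELECT * FROM prod_database.stats_table WHERE stat_id < $KEY;
   DELETE FROM prod_database.stats_table WHERE stat_id < $KEY;
   OPTIMIZE TABLE prod_database.stats_table;"
EOF
chmod +x archive_stats.sh
```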

Using the method above I was able to take the database from 74GB down to about 7GB.  That’s a substantial storage savings. Happy with my progress, I wrote the database rotation script.  The script locates all of the necessary tools, including the credentials, mysql, gzip, mysqldump, grep, cut, rm, date and cat, as well as a configuration file containing a list of the databases I want to dump.  The script cleans up any backups older than a day and gets rid of them.  I strongly recommend that you use absolute paths in any delete scripts.  In this case I am recursively deleting folders with date patterns… I took my testing very seriously.
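The date-pattern cleanup can be sketched as a small function. The absolute-path guard reflects the advice above; the YYYY-MM-DD directory naming is an assumption for illustration:

```shell
# Delete date-named backup directories older than a day, under an absolute root only.
prune_old_backups() {
  root="$1"
  # refuse anything that is not an absolute path before touching rm -rf
  case "$root" in /*) ;; *) echo "refusing relative path: $root" >&2; return 1 ;; esac
  # -mtime +0 means "modified more than 24 hours ago"
  find "$root" -maxdepth 1 -type d \
       -name '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*' \
       -mtime +0 -exec rm -rf {} +
}
```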

The script was a classic #!/bin/sh shell script and works great.  It gets rid of old copies and dumps each database while flushing the logs on command; it even has an option to detect and skip specified tables.  So I tried it on my test system and watched it work nicely.  It takes about 20 seconds to dump the test system, but alas the test system is way smaller than the production system, which has 5GB more data.  Quick math showed me that I was in for a 60 second production outage at the database’s current size… and it is growing.  To be effective, my maintenance program needs to run frequently.

In steps mysqlhotcopy, a Perl utility that does a hot backup of a database with minimal downtime.  It doesn’t rely on replication or anything fancy; it is simply a fast file-copy mechanism.  It will still lock the database, but only briefly: long enough to lock the tables and flush the logs.  It makes sense to flush the logs when the hot copy is complete.  From here I take the static hot copy and run my dump script against it.  Works great and no impact to the users.  I have enough space to run a hot copy until the DB reaches about 90GB, and then I should revisit my disks.

Everything is in place and working.  I used cron to run my tasks and got the data center guys to ensure a good backup of the files is done on a frequent schedule.  I record the event and email myself when it is done.

I recommend that everybody do a disaster recovery test after they implement a system like this.  While you are doing the disaster recovery, make note of the critical steps and keep them in an accessible place. You never know when disaster will strike next or what shape it will take, but you can rest assured that you have ensured minimal data loss and minimal downtime. In my spare time I intend on wrapping these procedures into a service that constantly manages, backs up and reports on the database.

For all the critics out there that think I should just use a binary backup: you are correct. I do have a binary backup as well, but I don’t trust it completely.  It seems to do the job fairly well until it doesn’t… and then I have a proper backup, less than one hour old, to rely on.



Android Super Sync

I recently acquired a Samsung Galaxy SII LTE phone which has been nice so far. I have a few desktops, a galaxy tablet, some laptops, a blackberry and about 20 email accounts to deal with. I thought it was about time to start consolidating my information for my new phone.

First things first: a Gmail account is required to use the Android Market application, which I was going to need so that I could download all of the common apps that I use (eBay, Twitter, WordPress, etc.). So I set up my Gmail account and set the device to use the brand new account.

I was playing with it and noticed that it didn’t have any contacts in it, so I installed the Facebook application for Android and commanded the device to synchronize all of my contacts on the phone (none) with my Facebook contacts, which include several hundred people. After about 30 seconds I had all of my Facebook friends on my phone, with pictures; some of them put their email and phone number in there, so that was a good start.

Next I wanted to grab my business contacts. I installed the LinkedIn application and synchronized all of my contacts with LinkedIn. In another 30 seconds hundreds of business contacts appeared. Any that had similar properties automatically joined together, and I started to notice that the contacts were getting nicely filled out.

The big one for me though is my BlackBerry device. I have thousands of contacts on there that are synchronized with an Exchange server [BlackBerry Enterprise] (not vCard, as many home users have). I definitely didn’t want to manually enter them all, so I did it in a way cooler way. I installed the Google Sync application on my BB from the browser; the application installed painlessly and prompted me for the Gmail account that I set up at the beginning. I put it in there and synchronized my BB up to the cloud. Then I went to contacts on my Galaxy and instructed the phone to sync with Google. About 30 seconds later I had my 1000 phone contacts on the phone, which also magically merged with the existing contacts.  Subsequently the calendar synchronized as well, so I won’t miss that dentist appointment.

At this point I am excited because not only do I have all of my contacts on there I also have pictures for everybody and the option to contact them by any of the methods that I have installed.

Since then I have also installed WhatsApp and Skype and did something similar. Everything magically glued together into one dynamic contact list. Some of the contact objects are so well defined that I have their job titles, email addresses, all of their phone numbers and in some cases even their home and business addresses.

I had to join unmatched contacts manually, which was only about 50 out of 2000 contacts, so I didn’t mind doing that; it also gave me a chance to do some cleanup in there.  The sweet thing is that if I connect with somebody through any platform, they instantly become available as a contact on my smartphone.  If my phone explodes or falls in the ocean, I can just replace it, sync everything with the Google cloud, and my contacts will be instantly restored.

I feel so connected.


10 Tips: How To Hire A Great Programmer

You have the next great technology idea and you need a programmer, but you don’t have a programming background.  What should you do?  The ideal candidate will be able to produce exactly what you are looking for, on time and on budget, and keep your application supported in the long run. A lot of startup companies look at hiring in-house or even outsourcing overseas instead of approaching a software development company.

There are many reasons why the overseas option is not desirable.  Overseas developers usually come with a plethora of complications; the language and time-zone barriers are at the top of the list.  Often these companies disappear before the project is completed, or immediately afterwards.  If the costs were kept really low during the project, the savings are quickly burned up when trying to support an application with no documentation and no programmer.  The worst case is when the programmers you hire take your project and use it, or resell it to others to maximize their ROI.  In my opinion you want to stay away from outsourcing projects overseas.

The next approach is to hire an in-house programmer.  There are many advantages to having a programming team in-house.  You directly control the project as it unfurls and have an accountable employee in the office who can respond to challenges as they arise.  Disadvantages include the increased cost of having to support another FTE (or team); the cost of a full-time employee is usually double the cost of hiring a software development company.  Also, it is often difficult to hire good, experienced programmers in-house who can produce the results you desire at the rate you are willing to pay.

I have seen many startup companies asking programmers to work on their project pro bono for a percentage of the company or the hope of future profits.  In my opinion that is just ugly; as a professional programmer I would much rather get paid for my services as I am producing results.  The best programmers are quite picky when it comes to their projects and usually tend to stick to the development environment that is most comfortable for them.  The bottom line is that the best people to do the job in an efficient manner usually cost a little bit more, and the results you get at the end of the project will depend on how much they stand behind their code.

It is important to establish a good working relationship with your programmers so that you get all of the bases covered with the least amount of impact to your company.  One approach that has worked well for us in the past is to receive a direct percentage of the (long term) company profits to develop at a reduced rate and in return fully support the software product when it is completed. Another approach is to provide excellent documentation so that a third party developer can step in, if needed, and rapidly remedy your situation. Regardless of the approach that you choose, these tips should help you find the right candidate.

#1 – Ask for the programmer’s approach

When you are selecting your programmer candidate, you should always ask them how they will solve your particular problem.  Find out what technology they will use and why. This will help you understand how the project will be structured, what sorts of external costs will be required, and how long it will take.

#2 – Develop a support plan at the beginning

Before going down the path of development, it is important to understand what sort of ongoing effort will be required to support the application. Find out how far the candidate will stand behind their service and, above all, make sure you know how things will be fixed when they go wrong.

#3 – Review their portfolio/resume

Give yourself an understanding of what your final product will look like and how it will feel. It is important to see how much experience they are bringing to the table, which ultimately decides how much value the candidate has.

#4 – Don’t get stuck on the design

When reviewing the portfolio, you may notice that the design isn’t what you would prefer.  Remember that you are looking for a programmer, not necessarily a designer.  Often you can find a candidate who is proficient in both areas, but my experience has been that the best programmers are horrible designers and vice versa.  What you should be evaluating is how the code works.  Look to see if it crashes and how well it scales under a large load. If they accomplished the task at hand successfully, then it is likely that they will accomplish your task as well.

#5 – Compare their rates to others

Like any sort of service, get quotes.  Before hiring somebody, find out how much it will cost to have the project completed, with support, from at least 3 different sources.  It can be a time-consuming task, but it is well worth it, and you can use these tips during the evaluation process.

#6 – Know what you want

It can be incredibly frustrating for a programmer to try to guess exactly what their client wants.  Most clients have an idea of what they would like to accomplish but haven’t really thought about all of the complexities behind the idea.  For example, your project may incorporate some sort of shipping model.  I chose shipping because it is often one of the most complex components of any eCommerce model.  Many factors come into play, such as flat/calculated rates, different carriers, tracking codes, order cancellation, country restrictions, package sizes, multiple items within an order, etc.  The bottom line is that you should truly understand what you expect to happen when the end user presses that button.  Knowing what you want will allow you to truly convey your project to your programmer and keep your costs down.

#7 – Get it in writing

Before proceeding with a project, both parties need to come to an agreement on how they will deal with payments, and a clear understanding of what is included and what is not.  Consider outlining your project even if you are looking at hiring a full-time employee, so that you and your candidate have a good understanding of what the project milestones are.

#8 – Ask for a demo

Most programmers will be able to demonstrate software that they have already written. This will give you an idea of what to expect when making your final decision.

#9 – Determine other areas of expertise

Find out if your programmer has other skills and services that could prove useful.  Perhaps your project requires hosting or computer support.  The programmer, or team, can provide value in areas that you didn’t anticipate.  Hey, it doesn’t hurt to ask.

#10 – Have a budget before you start

Before you set out into the world looking for quotes and programmers, make sure you know how your project will be funded.  You don’t have to disclose your budget to your candidate, but it is important to know it so that you can set your expectations accordingly.

I know it goes without saying, but it would be a bad move not to knock on the door of the 1MB Corporation.  It is, after all, the best way to ensure your project gets done on time and without hidden surprises.


Proxy Chaining

Proxy chaining is the act of linking several network proxies together to reach a network address.  I’ll explain what a proxy is, how chaining works and a bit about what proxies are used for.

A proxy server is a service that relays HTTP/FTP/SCP [etc.] traffic.  There are many different kinds of proxies, and the names usually just refer to the features or settings of a particular proxy.  Some proxy servers, usually in large organizations, might provide filtered, cached or monitored traffic.  Proxies speed up access to common services by looking at common traffic and caching the results of the queries.  Imagine that thousands of people in the 1MB corporation load the same homepage every day.  A proxy could step in and request that page just once instead of serving 1000 hits on the same page, freeing up the organization’s internet connection to do other useful things like googling and watching YouTube videos.

Fig 1 – Internet 1997

So proxies are great, right?  Actually they can be used for [many] other things as well.  There are a few common flavors: the closed proxy, the open proxy, the secure proxy and the transparent proxy.

A closed proxy is a server that you authenticate to; one that requires credentials to access.  Closed proxies tend to be faster and more secure than open proxies, since they are typically ‘membership only.’  An open proxy, therefore, is one that you can use for free without authenticating.  Due to their nature, one could also distribute incorrect results across an open proxy… more about that later…  It is often hard to tell if a proxy is behaving; I usually just ask for a page with a number on it and check to see if it returns the number.

A secure proxy is one that actually tries to hide your identity.  The proxy can do this because it has its own IP address on the other side, and it goes off and fetches your content on your behalf.  This means that to a web server like Google, it might look like you are in San Francisco when you are in fact in Nunavut or Japan.  You can probably understand how this would be useful for many reasons; the most common is just like in the movies, when the FBI is ‘tracing’ the bad guy on the internet and figuring out their location.  A transparent proxy, by contrast, is one that shows information about you to the other side; the side you are hiding from.  When testing a proxy you should check not only whether it returns an anonymous IP, but also the agent and other header information.  If a proxy is fully transparent, it reveals all information about your client to the server you are requesting from and doesn’t provide the anonymity features.
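That "page with a number on it" trick extends naturally to the header check described above. Here is a rough classifier, assuming you can get the destination server to echo back the headers it saw; the transparent/anonymous/elite labels are common jargon and the header names are conventions, not a standard:

```shell
# Classify a proxy from the request headers the destination reports seeing.
classify_proxy() {
  headers="$1"    # newline-separated "Name: value" pairs, as seen by the server
  client_ip="$2"  # your real IP address
  if printf '%s\n' "$headers" | grep -q "$client_ip"; then
    echo transparent   # your real IP leaks straight through
  elif printf '%s\n' "$headers" | grep -qiE '^(via|x-forwarded-for|forwarded):'; then
    echo anonymous     # admits to being a proxy but hides your IP
  else
    echo elite         # reveals nothing about the client
  fi
}
```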

So who cares, right?  Well, some people care a lot: computer-savvy folks, but also people living in countries with various levels of filtered internet.  You see, each country is connected, and proxies are used to block, filter, speed up or even change content so that it says something else.  Some people honestly think governments don’t filter internet connections. It’s amazing to me that entire countries have their internet so filtered that the people can’t research information and form their own opinions.  It seems you are getting to this site right now, so good for you… your country lets you read me.

Fig 2 – The Internet Under the Sea [I forget where I saw this]

Proxy chaining can also be used to watch movies and TV.  Just grab a proxy and make it look like you are in New York and you can watch streaming channels, or view your eBay account as if you were an Australian customer.  A quick Google search will provide you with hundreds of proxy servers offering anonymous, secure access.  VPNs are also useful when combined with a proxy: a VPN lets you establish an encrypted connection with the remote IP, so the people in the middle can’t see what you are up to.

So how do you do it?  You can use a program, or you can write your own.  If you have Internet Exploder, you can type the proxy servers and ports into the connection settings, separating the proxies with semicolons (;).  If you are really crafty, you might even write an app that swaps the proxies around while you are using them and checks them to see if they are still working properly.  Are you using a proxy now?  I don’t know, you tell me.



AZMAN

Did you read that right?  AZMAN?  Yep… from pretty much any Windows box you can click Start-Run, type azman.msc and click OK…

What you get is Authorization Manager.  Evidently this tool has been around for quite some time and was intended to facilitate proper RBAC (Role-Based Access Control) models.  I was shown it today by one of my colleagues and I can see that it would be quite useful.

It lets you store the data in 1 of 3 locations:

  1. Active Directory
  2. XML file
  3. MSSQL Database

Interestingly enough, all you have to do is pick the data store and then select “Action – New Application.” At that point you will be presented with your basic application settings as follows:

So far so good, right? At this point you can start building your role-based access model.  You can create operation definitions (which allow very granular access to the resources you specify).  Likewise, you can create a task or role definition, which can be a combination of operation definitions in any form you like.

Once you have it all laid out, you can connect to AzMan with a .NET call, invoke AccessCheck and see if the application credentials are allowed based on what you defined in AzMan!  Will you use this application?  Who knows… I would love to see some valid implementations and see how they work.

Here is an article discussing how to do this in ASP.NET:

Happy azmanning!! haha,