January 28, 2016

How to reduce AWS Windows server creation time



So I've been working with AWS to host some Windows servers. Embracing the concept of "infrastructure as code", I coded the server setup as an AWS CloudFormation template. As part of server creation, some custom scripts might need to be run to initialise the server. Those scripts can be specified in an AWS::CloudFormation::Init block. A snippet of that is below:
    "WindowsInstance": {
      "Metadata": {
        "AWS::CloudFormation::Init": {
          "config": {
            "commands": {
              "00-configEnvAndCopyScripts": {
                "command": "powershell.exe -ExecutionPolicy Unrestricted c:\\cfn\\scripts\\config_env_and_copy_scripts.ps1"
              },
              "01-settingTime": {
                "command": "powershell.exe -ExecutionPolicy Unrestricted c:\\cfn\\scripts\\set_time.ps1"
              },
              ...

All well and good so far. But I can't tell you how much I hate Windows servers... In AWS, it can take up to 30 minutes to spin up a Windows server. Luckily, AWS writes a log of the initialisation process to the file cfn-init.log, so we can trace what is going on. A sample of the output looks like this:
2016-01-14 05:08:50,957 [DEBUG] CloudFormation client initialized with endpoint https://cloudformation.ap-southeast-2.amazonaws.com
2016-01-14 05:08:50,957 [DEBUG] Describing resource WindowsInstance in stack arn:aws:cloudformation:ap-southeast-2:4623423112354:stack/tableau-a/abcdef-123456
2016-01-14 05:08:51,535 [INFO] -----------------------Starting build-----------------------
2016-01-14 05:08:51,737 [DEBUG] Creating Scheduled Task for cfn-init resume
2016-01-14 05:08:51,816 [DEBUG] Scheduled Task created
2016-01-14 05:08:51,832 [INFO] Running configSets: default
2016-01-14 05:08:51,832 [INFO] Running configSet default
2016-01-14 05:08:51,832 [INFO] Running config config
2016-01-14 05:08:51,832 [DEBUG] No packages specified
2016-01-14 05:08:51,832 [DEBUG] No groups specified
2016-01-14 05:08:51,832 [DEBUG] No users specified
2016-01-14 05:08:51,832 [DEBUG] No sources specified
2016-01-14 05:08:51,878 [DEBUG] Running command 00-configEnvAndCopyScripts
2016-01-14 05:08:51,878 [DEBUG] No test for command 00-configEnvAndCopyScripts
2016-01-14 05:10:47,171 [INFO] Command 00-configEnvAndCopyScripts succeeded
2016-01-14 05:10:47,171 [DEBUG] Command 00-configEnvAndCopyScripts output: The operation completed successfully.
2016-01-14 05:10:47,173 [INFO] Waiting 60 seconds for reboot
2016-01-14 05:10:48,187 [DEBUG] Running command 01-settingTime
2016-01-14 05:10:48,187 [DEBUG] No test for command 01-settingTime
2016-01-14 05:10:48,437 [INFO] Command 01-settingTime succeeded
2016-01-14 05:10:48,437 [DEBUG] Command 01-settingTime output: The operation completed successfully.
2016-01-14 05:10:48,437 [INFO] Waiting 60 seconds for reboot
Looking at the output, I was curious about the line Waiting 60 seconds for reboot. What was it doing? Why does it wait for 60 seconds? I searched around and finally got an answer: by default, on Windows, cfn-init waits 60 seconds after every command in case the command causes a restart. But if you are sure that your command does NOT cause a restart, you don't need to wait. In that case, you can add "waitAfterCompletion": "0" to the command's definition. It will save one minute per command. So if you have 60 commands, that is 60 minutes saved.
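
To put numbers on this for your own servers, here is a minimal Python sketch that reads a cfn-init.log, reports how long each command ran, and adds up the reboot waits the log announces. The function name summarise_log and the regular expressions are my own assumptions, written only against the sample lines above; a log with other message formats (failed commands, for example) would need extra patterns.

```python
import re
from datetime import datetime

# Patterns copied from the sample cfn-init.log lines in this post (an assumption:
# other cfn-init versions may word these messages differently).
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[\w+\] (.*)$")
STARTED = re.compile(r"^Running command (\S+)$")
SUCCEEDED = re.compile(r"^Command (\S+) succeeded$")
WAITING = re.compile(r"^Waiting (\d+) seconds for reboot$")


def summarise_log(log_text):
    """Return ({command name: seconds it ran}, total announced wait in seconds)."""
    started, durations, announced_wait = {}, {}, 0
    for raw in log_text.splitlines():
        line = LINE.match(raw.strip())
        if line is None:
            continue  # not a timestamped log line
        when = datetime.strptime(line.group(1), "%Y-%m-%d %H:%M:%S,%f")
        message = line.group(2)
        start = STARTED.match(message)
        done = SUCCEEDED.match(message)
        wait = WAITING.match(message)
        if start:
            started[start.group(1)] = when
        elif done and done.group(1) in started:
            name = done.group(1)
            durations[name] = (when - started[name]).total_seconds()
        elif wait:
            announced_wait += int(wait.group(1))
    return durations, announced_wait


# A few lines taken from the sample log above.
SAMPLE = """\
2016-01-14 05:08:51,878 [DEBUG] Running command 00-configEnvAndCopyScripts
2016-01-14 05:08:51,878 [DEBUG] No test for command 00-configEnvAndCopyScripts
2016-01-14 05:10:47,171 [INFO] Command 00-configEnvAndCopyScripts succeeded
2016-01-14 05:10:47,173 [INFO] Waiting 60 seconds for reboot
2016-01-14 05:10:48,187 [DEBUG] Running command 01-settingTime
2016-01-14 05:10:48,437 [INFO] Command 01-settingTime succeeded
2016-01-14 05:10:48,437 [INFO] Waiting 60 seconds for reboot
"""

if __name__ == "__main__":
    durations, announced_wait = summarise_log(SAMPLE)
    for name, seconds in durations.items():
        print(f"{name}: {seconds:.3f}s")
    print(f"announced reboot waits: {announced_wait}s")
```

On the sample lines, this reports about 115 seconds for 00-configEnvAndCopyScripts, a quarter of a second for 01-settingTime, and 120 seconds of announced waits. Note that the last figure is the sum of what the log says it will wait, not a measured delay.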

The new CloudFormation template looks like this:
      
    "WindowsInstance": {
      "Metadata": {
        "AWS::CloudFormation::Init": {
          "config": {
            "commands": {
              "00-configEnvAndCopyScripts": {
                "command": "powershell.exe -ExecutionPolicy Unrestricted c:\\cfn\\scripts\\config_env_and_copy_scripts.ps1",
                "waitAfterCompletion": "0"
              },
              "01-settingTime": {
                "command": "powershell.exe -ExecutionPolicy Unrestricted c:\\cfn\\scripts\\set_time.ps1",
                "waitAfterCompletion": "0"
              },
              ...
Unfortunately, you need to repeat that waitAfterCompletion key for every command. As far as I know, there is no way to set the default wait to 0 seconds. If you find a better way to achieve this, please leave a comment below.
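
One possible workaround, since the template is plain JSON, is to add the key with a small script before deploying rather than typing it by hand. The Python sketch below is only that, a sketch: the function name add_zero_wait and its keep_default parameter are hypothetical, and it assumes the instance sits under the template's top-level "Resources" key. It leaves alone any command that already sets waitAfterCompletion, and any command you list as one that might reboot.

```python
import copy
import json

INIT_KEY = "AWS::CloudFormation::Init"


def add_zero_wait(template, keep_default=()):
    """Return a copy of the template where every cfn-init command that does not
    already set waitAfterCompletion gets "0".

    keep_default (hypothetical parameter): names of commands that may reboot
    and should keep the default wait.
    """
    result = copy.deepcopy(template)
    for resource in result.get("Resources", {}).values():
        init = resource.get("Metadata", {}).get(INIT_KEY, {})
        for config_name, config in init.items():
            if config_name == "configSets":
                continue  # lists config names, holds no commands itself
            for command_name, command in config.get("commands", {}).items():
                if command_name not in keep_default:
                    command.setdefault("waitAfterCompletion", "0")
    return result


# The two commands from this post, assumed to live under "Resources".
EXAMPLE = {
    "Resources": {
        "WindowsInstance": {
            "Metadata": {
                INIT_KEY: {
                    "config": {
                        "commands": {
                            "00-configEnvAndCopyScripts": {
                                "command": "powershell.exe -ExecutionPolicy Unrestricted c:\\cfn\\scripts\\config_env_and_copy_scripts.ps1"
                            },
                            "01-settingTime": {
                                "command": "powershell.exe -ExecutionPolicy Unrestricted c:\\cfn\\scripts\\set_time.ps1"
                            },
                        }
                    }
                }
            }
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(add_zero_wait(EXAMPLE), indent=2))
```

The version-controlled template stays as the source of truth, and the generated one is what gets deployed. Only use this if you are sure which commands never trigger a restart.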

So you might say, "Why don't you set up everything, then build an AMI from it, rather than doing initialisation every time?" Good question! I like to build everything from scratch because I know what goes into the server, and those changes are version controlled. An AMI is like a black box, so you want to keep it to a minimum. If you lose that AMI, you might not know how to rebuild it.

Thanks and good luck with your AWS adventure.