Posts

Deep-archive an AWS S3 bucket that once had versioning enabled

It is no surprise if you want to deep-archive an S3 bucket, or some of the objects in it, to reduce storage cost. But if versioning has ever been enabled on the bucket, deep archiving needs more consideration. Problem Statement: If versioning has never been enabled on an S3 bucket, you can move its objects into Glacier Deep Archive by iterating over every object and making a copy of it with the PUT Object - Copy API. You copy the object within the same bucket under the same key name and specify request headers; here you set x-amz-storage-class to the storage class you want to use, 'DEEP_ARCHIVE' [1]. The net effect is that the object's storage class is changed. But if versioning has ever been enabled on the bucket, the above method is not straightforward. - If versioning is still enabled, even you can copy an object with a spec...
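The copy-in-place approach for a bucket that has never had versioning enabled can be sketched as follows. This is a minimal sketch, not the post's own code: it assumes a boto3 S3 client is passed in, that every object is small enough for a single `copy_object` call (S3 limits a single copy operation to 5 GB), and the function name `deep_archive_in_place` is made up for illustration.

```python
def deep_archive_in_place(s3_client, bucket):
    """Copy each object onto itself with the DEEP_ARCHIVE storage class.

    Assumptions (not from the original post): `s3_client` behaves like
    boto3.client("s3"), versioning has never been enabled on `bucket`,
    and no object exceeds the single-copy size limit.
    Returns the list of keys that were copied.
    """
    copied = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent when a page holds no objects.
        for obj in page.get("Contents", []):
            if obj.get("StorageClass") == "DEEP_ARCHIVE":
                continue  # already archived, nothing to do
            s3_client.copy_object(
                Bucket=bucket,
                Key=obj["Key"],
                CopySource={"Bucket": bucket, "Key": obj["Key"]},
                StorageClass="DEEP_ARCHIVE",  # sent as x-amz-storage-class
            )
            copied.append(obj["Key"])
    return copied


# Example call (requires boto3 and AWS credentials; bucket name is a placeholder):
#   import boto3
#   deep_archive_in_place(boto3.client("s3"), "my-bucket")
```

Passing the client in as a parameter keeps the function easy to exercise against a stub before it is pointed at a real bucket.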

Good practices of using Python logging

Problem Statement: During maintenance work on a legacy Python project, some issues with how the code logs events were noticed. Outputs not in sequence: The legacy code in some places still uses the print() function and traceback.print_exc(), as below:

    import traceback

    try:
        # my logic goes here
        print("my output")
    except Exception:
        traceback.print_exc()

print() sends its output to stdout, and traceback.print_exc() sends its output to stderr. The two streams are buffered independently, so if the buffers are not flushed promptly or buffering is not disabled, the output may appear out of order. Logging configuration not centralized and externalized: The logging configuration is added in every file that uses Python logging, which means it is not centralized and externalized for easy use. Also, when a message is logged, the 'logging' module is called directly instead of a 'logger' object. The '...
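One way to address both issues is sketched below, under assumptions: the configuration dictionary, the logger name `myapp.worker`, and the function names are hypothetical, and the post's own solution may differ. The configuration lives in one place and is applied once with `logging.config.dictConfig`; each module then asks for a named logger, and `logger.exception` replaces `traceback.print_exc()` so that messages and tracebacks go through the same handler.

```python
import logging
import logging.config

# Hypothetical central configuration; in practice this could be loaded
# from an external JSON or YAML file instead of being defined inline.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "plain",
            "stream": "ext://sys.stderr",
        },
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}


def configure_logging():
    """Apply the central configuration once, at application start-up."""
    logging.config.dictConfig(LOGGING_CONFIG)


# Each module gets its own named logger (usually logging.getLogger(__name__)).
logger = logging.getLogger("myapp.worker")


def run_task():
    try:
        logger.info("my output")          # replaces print("my output")
        raise ValueError("simulated failure")
    except Exception:
        # Logs the message plus the current traceback via the same handler.
        logger.exception("task failed")


if __name__ == "__main__":
    configure_logging()
    run_task()
```

Because every record passes through one handler and one stream, the ordering problem between stdout and stderr does not arise.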

Azure DevOps YAML pipelines - limitations with GitHub Enterprise Server

YAML pipelines are a new form of pipeline introduced in Azure DevOps Server 2019 and in Azure Pipelines. YAML pipelines only work with certain version control systems. The support matrix lists 'GitHub Enterprise Server' as one of the version control systems supported by Azure Pipelines (YAML). Our project's source code is on our company's GitHub server, github.abc.com, and we also wanted the pipeline definition file, the YAML file, to be version-controlled. When we gave it a try, we noticed some limitations when using YAML pipelines with application source code on GitHub Enterprise Server. Limitation 1: Azure DevOps only integrates with a YAML pipeline file on Azure Repos or GitHub, not on GitHub Enterprise Server instances such as github.abc.com. This doesn't mean you can't put the pipeline definition file on your GitHub Enterprise Server for version control manually. Yes, you can. But what we want is to integrate that yaml f...

Azure ARM deployment - deleting resource group can't delete role assignments cleanly

Problem statement: Once an ARM template is deployed into a resource group, one way to completely delete the deployed resources is to delete that resource group. But it seems role assignments are not cleaned up entirely. In our case, we have an ARM template that deploys an Azure AKS cluster into a specified resource group, plus two role assignments. One assigns the "Network Contributor" role on the VNET to the managed identity of the AKS cluster, and the other assigns the "Contributor" role on an already existing Azure Container Registry (ACR) to the managed identity of the AKS cluster. First we successfully deployed the ARM template. The AKS cluster was set up, and the two role assignments were deployed. Then we deleted the resource group. All resources under that resource group, including the AKS cluster, VNET, etc., were deleted successfully. The role assignment on the VNET was cleaned up as well, but the role assignment on the ACR was still there. Role assignments before and after resource group d...

Azure Blueprints - how to handle an ARM parameter file?

Problem statement: When you deploy an ARM template, it is normal to provide a parameter file, in which you put custom values for the parameters used during the deployment. When you add an ARM deployment artifact to a blueprint, unfortunately you can't use a parameter file in the Azure Portal - there is simply no field in the UI to specify one, as shown in the screenshot below. Alternatively, you can use the command line to add an ARM template deployment artifact to a blueprint, which does allow you to specify a parameter file. E.g. use the PowerShell commands below to insert an ARM template deployment of a Log Analytics workspace. $bpDefinition = Get-AzBlueprint -SubscriptionId '<sub Id>' -Name '<blueprint name>' -Version '<blueprint version number>' New-AzBlueprintArtifact -Blueprint $bpDefinition -Type TemplateArtifact -Name 'la-workspace' -TemplateFile .\la-workspace-deploy.json -TemplateParame...

Auto-installing NVIDIA device plug-in on GPU nodes only

Problem Statement: On a Kubernetes cluster, before the GPUs in the nodes can be used, you need to deploy a DaemonSet for the NVIDIA device plugin. This DaemonSet runs a pod on each node to provide the required drivers for the GPUs. E.g. for an Azure AKS cluster, the official Azure documentation describes how to install the NVIDIA device plug-in. The YAML manifest is in the nvidia-device-plugin-ds.yaml file. Once this file is applied, a pod providing the required driver runs automatically whenever a GPU node is added, whether the cluster is scaled manually or auto-scaled. But it may also run the pod on a node that has no GPUs at all, since the YAML manifest doesn't constrain the pods to run only on GPU nodes. This wastes resources on non-GPU nodes. Solution: This is where "nodeSelector" and "affinity" can help. "nodeSelector" provides a very simple way to constrain pods to nodes with particular labels. The "affinity/anti-affinity" feat...
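To make the two options concrete, here is a hedged sketch that patches a DaemonSet manifest, held as a plain Python dictionary, so that its pods are scheduled only on labelled nodes. The label `accelerator=nvidia` and the function name are assumptions for illustration; use whatever label your GPU nodes actually carry. The field names (`nodeSelector`, `affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution`) are the standard Kubernetes pod spec fields.

```python
import copy
import json

# Hypothetical label assumed to be present on GPU nodes only.
GPU_LABEL_KEY = "accelerator"
GPU_LABEL_VALUE = "nvidia"


def restrict_to_gpu_nodes(daemonset, key=GPU_LABEL_KEY, value=GPU_LABEL_VALUE,
                          use_affinity=False):
    """Return a copy of the manifest whose pods only run on labelled nodes."""
    patched = copy.deepcopy(daemonset)
    pod_spec = (patched.setdefault("spec", {})
                       .setdefault("template", {})
                       .setdefault("spec", {}))
    if use_affinity:
        # More expressive form; note this replaces any existing affinity block.
        pod_spec["affinity"] = {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [{
                        "matchExpressions": [
                            {"key": key, "operator": "In", "values": [value]},
                        ],
                    }],
                },
            },
        }
    else:
        # Simplest form: every listed label must match exactly.
        pod_spec.setdefault("nodeSelector", {})[key] = value
    return patched


if __name__ == "__main__":
    # Skeleton manifest for demonstration; the real one has more fields.
    manifest = {"kind": "DaemonSet", "spec": {"template": {"spec": {}}}}
    print(json.dumps(restrict_to_gpu_nodes(manifest), indent=2))
```

The same fields can of course be written directly into the YAML manifest; the dictionary form is used here only to show where they sit in the pod template.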