AZ-400: Designing and Implementing Microsoft DevOps Solutions – Preparation Notes

During my preparation for the AZ-400 exam, I gathered a bunch of notes that summarize specific topics and emphasize important points. They are in completely random order.

These notes have also been posted as a Twitter thread.

P.S. I did pass AZ-500 in April 2021. Here is the link to my Credly account.

Used Resources

My Preparation Notes

Web application logs
For Windows apps, file system log files are stored in a virtual drive that is associated with your Web app. This drive is addressable as D:\Home, and includes a LogFiles folder; within this folder are one or more subfolders.

For Linux Web apps, log messages are stored in Docker log files.

All Azure Web apps have an associated Source Control Management (SCM) service site. This site runs the Kudu service, and other Site Extensions; it is Kudu that manages deployment and troubleshooting for Azure Web Apps, including options for viewing and downloading log files.

A resource can be a member of only one resource group.

Resource groups can’t be nested.

Best practices for organizing resource groups:

  • Consistent naming convention

Tags are name/value pairs of text data that you can apply to resources and resource groups.

Tags allow you to associate custom details with your resource, in addition to the standard Azure properties a resource has. Common examples include:

  • department (like finance, marketing, and more)
  • environment (prod, test, dev)
  • cost center
  • life cycle and automation (like shutdown and startup of virtual machines)

A resource can have up to 50 tags.

The tag name is limited to 512 characters for all types of resources except storage accounts, which have a limit of 128 characters.

The tag value is limited to 256 characters for all types of resources.
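
The limits above are easy to check before a deployment. Here is a minimal Python sketch. The helper name `validate_tags` and the plain-dict input are my own assumptions for illustration; this is not part of any Azure SDK.

```python
# Limits taken from the notes above.
MAX_TAGS = 50
MAX_NAME_LEN = 512
MAX_NAME_LEN_STORAGE = 128  # storage accounts have a shorter name limit
MAX_VALUE_LEN = 256


def validate_tags(tags, is_storage_account=False):
    """Return a list of violations of the tag limits (hypothetical helper)."""
    problems = []
    name_limit = MAX_NAME_LEN_STORAGE if is_storage_account else MAX_NAME_LEN
    if len(tags) > MAX_TAGS:
        problems.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    for name, value in tags.items():
        if len(name) > name_limit:
            problems.append(f"tag name too long: {len(name)} > {name_limit}")
        if len(value) > MAX_VALUE_LEN:
            problems.append(f"tag value too long: {len(value)} > {MAX_VALUE_LEN}")
    return problems
```

An empty list means the tag set fits within the documented limits; it does not tell you whether the resource type supports tags at all.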

Tags aren’t inherited from parent resources.

Not all resource types support tags, and tags can’t be applied to classic resources.

With the code in source control, App Center will build the app for iOS and Android and run integrated UI tests to ensure the app meets expectations.

The resulting app can be deployed automatically to public app stores, such as the Apple App Store or Google Play Store.

Creating an App Center account is free.

A common pattern is to externalize your state to another service, like Azure Cache for Redis or SQL Database, which makes your web servers stateless.

Traffic Manager is a DNS-based load balancer that you can use to distribute traffic within and across Azure regions.

When it comes to monitoring and analytics on Azure, we can bundle services into three specific areas of focus:

  • Core monitoring
  • Deep infrastructure monitoring
  • Deep application monitoring

You can use Azure DevTest Labs to deploy VMs with all of the correct tools and repositories that your developers need.

Testing should occur on both application code and infrastructure code, and they should both be subject to the same quality controls.

Use Azure Pipelines for automated testing and Azure Test Plans for manual testing.

Azure App Service lets you enable Application Insights for an application without adding the SDK to your code. This feature, called runtime instrumentation, doesn’t offer deep insight into your app the way the SDK can.

Runtime instrumentation is also an Azure-specific feature and is available only for Windows-based web apps.

The best practice is to use a different instrumentation key and Application Insights resource for each environment in which your application runs to prevent unrelated telemetry from being grouped together.

It’s this realization – that systems are comprised of individual mechanisms, each of which we can understand – that makes monitoring both feasible and practical.

Logs record events over time, while metrics represent states of being or service levels, often at the present time.

Trace-maintenance platforms represent a category of performance-management tool that collects data about the low-level service calls between highly distributed services and functions, especially in containerized environments orchestrated by Kubernetes.

Until each facet of every stage of service delivery can be monitored, any optimum service level is a wild guess.

The USE method (Utilization, Saturation, Errors) names the most common correlation applied when evaluating the status of a solution.

The three components of USE are defined as follows:

  • Utilization – A level, often expressed as a percentage, representing the time over a given interval a resource is busy rather than idle
  • Saturation – A determination of how many requests the resource processed over the same interval, often coupled with a measurement of the size of the queue of unprocessed requests during that time
  • Errors – The number of incidents of unhandled exceptions and unfulfilled requests during the same period
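
To make the three USE components concrete, here is a small Python sketch for one resource over one interval. The `ResourceSample` shape and its field names are assumptions of mine, not something a monitoring product exposes in this form.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """Raw counters for one resource over one interval (hypothetical shape)."""
    interval_seconds: float   # length of the observation window
    busy_seconds: float       # time the resource was busy rather than idle
    requests_processed: int   # requests handled during the window
    queued_requests: int      # unprocessed requests waiting in the queue
    errors: int               # unhandled exceptions and unfulfilled requests


def use_summary(sample):
    """Summarize Utilization, Saturation and Errors for a single interval."""
    return {
        "utilization_percent": 100.0 * sample.busy_seconds / sample.interval_seconds,
        "saturation": {
            "processed": sample.requests_processed,
            "queued": sample.queued_requests,
        },
        "errors": sample.errors,
    }
```

For example, a resource that was busy for 45 of 60 seconds has a utilization of 75 percent.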

The RED method focuses entirely on these three factors relating to request responsiveness:

  • Rate – The number of requests a service processes over a given interval (usually one second)
  • Errors – The number of failed requests in that same interval
  • Duration – The average time a service consumes in responding to a request before rendering a response
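
The RED factors can be computed from a list of request records. This is a minimal Python sketch; the `Request` record and the `red_summary` function are hypothetical names I picked for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request (hypothetical record)."""
    duration_ms: float
    failed: bool


def red_summary(requests, interval_seconds):
    """Compute Rate, Errors and Duration for one observation window."""
    count = len(requests)
    failures = sum(1 for r in requests if r.failed)
    # Average duration; 0.0 when the window saw no requests at all.
    avg_duration = sum(r.duration_ms for r in requests) / count if count else 0.0
    return {
        "rate_per_second": count / interval_seconds,
        "errors": failures,
        "avg_duration_ms": avg_duration,
    }
```

In practice you would often track percentiles as well as the average, since an average hides slow outliers.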

Remediation planning is the process by which you define how problems uncovered by monitoring are mitigated and resolved.

For enterprises that use it in performance monitoring, a KPI is a quantifiable value that represents some aspect of performance as it pertains to at least two of the following:

  • System health
  • Relative progress in meeting business objectives
  • End-user satisfaction with the system
  • Efficiency of the IT department in resolving issues

To that end, a process emerges for any organization to develop KPIs for its own internal use:

  • Gather disparate teams together. No one team should be delegated the task of setting priorities and objectives without consulting the other teams.
  • Collectively set business priorities. “Really fast Web page load times” is a perennial goal for information systems, but this will always be the case. What are the most pertinent goals of the organization, to which its information system plays a directly contributing role?
  • Quantify business objectives parameters. Determine which of these priorities can be verifiably monitored and measured digitally.
  • Establish business targets. Write the formulas for the relationships between measurable, observable factors, and the optimum values for those formulas, in a simple, mathematical way that all stakeholders can understand.
  • Integrate the ticketing system or whatever automated functions are involved in addressing problem issues, so that system improvement projects for business objectives purposes may co-exist with performance and software problems.
  • Dedicate the performance monitoring platform to the task of gathering the pertinent metrics that pertain to the established business targets. Where applicable, enable dashboards to continually report these metrics in simple, graphical forms.

Use Azure Event Grid to handle events within your systems.

Continuously monitoring your infrastructure helps you respond appropriately and more effectively to issues. It also helps you gain better insight, and you’ll learn from the issues that affect your infrastructure. You can strengthen your protection and build an improved infrastructure.

Azure Monitor is the service for collecting, combining, and analyzing data from different sources.

Security Center gives detailed analyses of different components of your environment. These components include data security, network security, identity and access, and application security.

You use Application Insights if:

  • You want to analyze and address issues and problems that affect your application’s health.
  • You want to improve your application’s development lifecycle.
  • You want to analyze users’ activities to help understand them better.

Availability tests allow you to check the health of your application from different geographic locations.

Application Insights works with Azure Pipelines. Use them together to improve your development lifecycle. When an Azure Pipelines release pipeline receives an alert from Application Insights that something went wrong, it can stop the deployment. It then rolls back the deployment until the issue that caused the alert is resolved.

A workspace can be used with multiple subscriptions. You can gather data from machines across multiple subscriptions, and analyze it together from one central location.

Without Security Center, you couldn’t identify and address risks and threats to the infrastructure.

Without Application Insights, you couldn’t analyze and address issues that affect the health of your applications.

Without Sentinel, you wouldn’t have a detailed overview of the security and health of the entire organization.

Without Monitor, you wouldn’t have a single solution to unify the various services – and query and analyze data from one place.

The Azure managed identity is a free feature that’s included with Azure Active Directory (Azure AD). You can use this feature to authenticate an identity to any Azure service that supports Azure AD.

A user-assigned managed identity is created as a standalone Azure resource. It’s independent of any app. When a user-assigned identity is provisioned, Azure creates a service principal just as it does for a system-assigned identity. You can assign it to more than one application.

With Azure AD B2B, you don’t take on the responsibility of managing and authenticating the credentials and identities of partners. Your partners can collaborate with you even if they don’t have an IT department. For example, you can collaborate with a contractor who only has a personal or business email address and no identity management solution managed by an IT department.

The managed identity feature solves the credential problem by granting an automatically managed identity. You use this service principal to authenticate to Azure services.

You enable system-assigned identity directly on an Azure service instance, such as a VM. When you enable that identity, Azure creates a service principal through Azure Resource Manager.

A resource can have only one system-assigned managed identity.

Azure Key Vault is a centralized cloud service for storing application secrets such as encryption keys, certificates, and server-side tokens. Key Vault helps you control your applications’ secrets by keeping them in a single central location and providing secure access, permissions control, and access logging.

Azure Key Vault helps safeguard cryptographic keys and secrets that cloud applications and services use. Key Vault streamlines the key management process and enables you to maintain control of keys that access and encrypt your data. Developers can create keys for development and testing in minutes, and then migrate them to production keys. Security administrators can grant (and revoke) permission to keys, as needed.

There is no support for anonymous access to a Key Vault.

Developers will only need Get and List permissions to a development-environment vault.

For apps, often only Get permissions are required as they will just need to retrieve secrets.

Azure API Management (APIM) is made up of the following components:

  • API gateway
  • Azure portal
  • Developer portal

The term security posture refers to cybersecurity policies and controls, as well as how well you can predict, prevent, and respond to security threats.

When you download sign-in log records, you’re limited to the most recent 250,000 records, based on the filter criteria that you’ve applied.

Azure Policy also integrates with Azure DevOps by applying any continuous integration and delivery pipeline policies that apply to the pre-deployment and post-deployment phases of your applications.

Azure Policy also includes initiatives that support regulatory compliance standards such as HIPAA and ISO 27001.

Instead of having to configure features like Azure Policy for each new subscription, with Azure Blueprints you can define a repeatable set of governance tools and standard Azure resources that your organization requires. In this way, development teams can rapidly build and deploy new environments with the knowledge that they’re building within organizational compliance with a set of built-in components that speed the development and deployment phases.

Once the incident has been resolved, it’s important to follow up and benefit from the experience.

You don’t work “on” or “with” a system; you work in the system.

Humans make mistakes. However, human error is not a cause; it’s a symptom. When human error is deemed to be the reason for a failure, people stop there instead of further analyzing the incident.

All incidents have one thing in common: they can provide valuable learning experiences.

The Application Dashboard link in Application Insights can be used to automatically generate a dashboard that has most of the key items that you’ll need as a starting point. Note that it doesn’t include Azure Service Health, so you should pin Service Health to your dashboard yourself; that way you can check whether the problem is with your systems or with the cloud service itself.

The Application Map in Application Insights can be used to drill into exactly what’s going on to cause the issues. You can follow the breadcrumbs to find the cause of the error (for example, a malformed URL).

Objectives & Key Results (OKRs) is a goal-setting framework designed to connect strategic goals set by leadership with the day-to-day activities of execution teams.

In Agile methodology, which uses Continuous Planning principles, time is fixed to meet business objectives. The only thing that is negotiable is scope.

There are six principles of Continuous Planning:

  1. Value simplicity
  2. The manifesto for agile software development
  3. Design thinking
  4. Iterative and incremental development
  5. Lean management
  6. Estimation accuracy

People who work in creative endeavors don’t need “beer in the break room” to motivate them. Creative people instead need mastery, autonomy, and purpose.

All improvement requires change, but not all change is improvement.

Measure impact, not activity!

Azure Advisor identifies unused or underutilized resources and recommends the ones that you can remove.

Use autoscaling to dynamically adjust your compute resources based on the metrics you collect.
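
As an illustration of metric-based scaling, here is a simple threshold rule in Python. The function name, the CPU thresholds and the instance limits are all assumptions for this sketch; real autoscale rules in Azure are configured on the service, not written like this.

```python
def desired_instances(current, avg_cpu_percent,
                      scale_out_at=70.0, scale_in_at=30.0,
                      minimum=1, maximum=10):
    """Return the instance count a simple threshold rule would ask for."""
    if avg_cpu_percent > scale_out_at:
        target = current + 1   # add one instance under high load
    elif avg_cpu_percent < scale_in_at:
        target = current - 1   # remove one instance when mostly idle
    else:
        target = current       # stay put inside the comfortable band
    # Never leave the configured bounds.
    return max(minimum, min(maximum, target))
```

Keeping a gap between the scale-out and scale-in thresholds avoids flapping, where the system keeps adding and removing the same instance.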

Synthetic transactions are predictable tests that enable you to compare results from release to release.

InnerSource is the practice of applying open source patterns to projects with a limited audience.

If your project has .github/ISSUE_TEMPLATE.md, anytime a user starts the process of creating an issue, they will see this content.

Making the build and test steps separate will make it easier to understand the log.

The best commit messages complete the sentence, “If you apply this commit, you will …”

A continuous integration (CI) build is a build that runs when you push a change to a branch.

A pull request (PR) build is a build that runs when you open a pull request or when you push additional changes to an existing pull request.

If your branch is for working on a new feature, you might use feature/<branch name>.

For a bug fix, you could use bugfix/<bug#>

A badge is part of Microsoft Azure Pipelines. It has methods you can use to add an SVG image that shows the status of the build on your GitHub repository.

It is important to note that DevOps and SRE are two different parallel attempts to address the same challenges. SRE is not the next evolutionary step after DevOps. SRE was not created to be “the future of DevOps.”

An error budget is the difference between the service’s potential perfect reliability and its desired reliability.
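
A worked example helps here. The Python sketch below turns a reliability target into an error budget; the 99.9 percent target and the 30-day window in the usage note are illustrative numbers of mine, not values from the course.

```python
def error_budget(slo_percent, period_minutes):
    """Return the error budget as (fraction of the period, minutes)."""
    # Perfect reliability (100%) minus the desired reliability.
    budget_fraction = (100.0 - slo_percent) / 100.0
    return budget_fraction, budget_fraction * period_minutes
```

With a 99.9 percent target over 30 days (43,200 minutes), the budget is 0.1 percent of the period, or about 43.2 minutes of allowed unreliability.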

Azure Monitor supports dimensions, which enable monitoring data to be supplied from multiple target instances.

Smart groups enable you to address a group of alerts instead of each alert individually.

Using smart groups can reduce alert noise by more than 90 percent.

Remove noisy alerts. Over-monitoring is a harder problem to solve than under-monitoring.

Classify the problem into one of these categories:

  • Availability and basic functionality.
  • Latency.
  • Correctness.
  • Feature-specific problems.

Monitoring for your users is also called symptom-based monitoring.
In general, users care about:

  • Basic availability and correctness
  • Latency
  • Completeness, freshness, and durability
  • Uptime
  • Features

Having a weekly review of all triggered on-call alerts and analyzing quarterly alert statistics can help you to see patterns that are lost when focusing on individual alerts.

A warning sign: your alert setup is more complex than the problems it is trying to detect.

Even if the load is predictable and steadily increasing as the popularity of the service increases, many cloud administrators choose to scale horizontally rather than vertically.

Well-designed applications should ideally use service APIs to query and discover resources and connect to them in a dynamic fashion.

Make sure that administrators cannot directly log in to a critical resource from the internet without visiting an internal launchpad.

A fault-tolerant system has the ability to perform its function even in the presence of failures in the system.

Idle resources should also be flagged and terminated (based on certain rules) by the monitoring system.

Commonly used tags specify the owner (user or group) of a particular resource, the environment to which it belongs (for example, production, backup, staging, and testing), the cost center in charge of paying the bill, etc.

Instead of trying to resolve all of the sources that inflate the latency tail, cloud applications must be designed to be tail tolerant.

A template enables you to define common build tasks one time and reuse those tasks multiple times.
You call a template from the parent pipeline as a build step. You can pass parameters into a template from the parent pipeline.

Unit test guidelines:

  • Don’t test for the sake of testing
  • Keep your tests short
  • Ensure that your tests are repeatable
  • Keep your tests focused
  • Choose the right granularity

One reason to create a package instead of duplicating code is to prevent drift.

Semantic Versioning is a popular versioning scheme. Here’s the format:
Major.Minor.Patch[-Suffix]
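
The format can be parsed with a short regular expression. This Python sketch is deliberately simplified: `parse_version` is a hypothetical helper, and it covers only the Major.Minor.Patch[-Suffix] shape shown above, not the full Semantic Versioning grammar (for example, it ignores build metadata).

```python
import re

# Matches Major.Minor.Patch with an optional -Suffix part.
VERSION_RE = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:-(?P<suffix>[0-9A-Za-z.-]+))?$"
)


def parse_version(text):
    """Split a version string into (major, minor, patch, suffix)."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a Major.Minor.Patch[-Suffix] version: {text!r}")
    return (
        int(match["major"]),
        int(match["minor"]),
        int(match["patch"]),
        match["suffix"],  # None when there is no suffix
    )
```

Converting the numeric parts to integers matters: compared as strings, "1.10.0" would sort before "1.9.0".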

Continuous Delivery (CD) helps software teams deliver reliable software updates to their customers at a rapid cadence. CD also helps ensure that both customers and stakeholders have the latest features and fixes quickly.

Azure DevOps also provides information as an OData feed. Use this feed to publish reports and notifications to systems such as Power BI, Microsoft Teams, or Slack.

In YAML, you use the pipe (|) syntax to define a string that spans multiple lines.

Azure Stack Hub is a hybrid cloud platform that enables you to use Azure services from your company’s or service provider’s datacenter.

Smoke testing verifies the most basic functionality of your application or service.

Unit testing verifies the most fundamental components of your program or library, such as an individual function or method.

Integration testing verifies that multiple software components work together to form a complete system.

Regression testing helps determine whether code, configuration, or other changes affect the software’s overall behavior.

Sanity testing involves testing each major component of a piece of software to verify that the software appears to be working and can undergo more thorough testing.

You can use a capture-and-replay system to automatically build your UI tests.

Usability testing is a form of manual testing that verifies an application’s behavior from the user’s perspective.

User acceptance testing (UAT), like usability testing, focuses on an application’s behavior from the user’s perspective.

If you don’t specify dependencies, jobs within the stage can run in any order or run in parallel.

The goal of performance testing is to improve the speed, scalability, and stability of an application.

Security testing ensures that applications are free from vulnerabilities, threats, and risks.

XSLT stands for XSL Transformations, or eXtensible Stylesheet Language Transformations.

A snowflake is a unique configuration that can’t be reproduced automatically, and is typically a result of configuration drift.

Lead time measures the total time elapsed from the creation of work items to their completion.

Cycle time measures the time it takes for your team to complete work items once they begin actively working on them.
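
The difference between lead time and cycle time is easiest to see with dates. Here is a minimal Python sketch; the function names and the sample timestamps in the usage note are assumptions for illustration.

```python
from datetime import datetime


def lead_time(created, completed):
    """Total time from work item creation to completion."""
    return completed - created


def cycle_time(started, completed):
    """Time from when the team starts active work to completion."""
    return completed - started
```

For a work item created on 1 April, started on 5 April and completed on 8 April, `lead_time(datetime(2021, 4, 1), datetime(2021, 4, 8))` is 7 days, while `cycle_time(datetime(2021, 4, 5), datetime(2021, 4, 8))` is 3 days. The 4 days in between are time the item spent waiting in the backlog.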

Burndown charts focus on remaining work within a specific time period.

PMD is a source code analyzer. It finds common programming flaws like unused variables, empty catch blocks, unnecessary object creation, and so forth.
There is an Apache Maven PMD Plugin which allows you to automatically run the PMD code analysis tool on your project’s source code and generate a site report with its results.

WhiteSource provides WhiteSource Bolt, a lightweight open source security and management solution developed specifically for integration with Azure DevOps and Azure DevOps Server.

SonarCloud is a cloud service offered by SonarSource and based on SonarQube. SonarQube is a widely adopted open source platform to inspect continuously the quality of source code and detect bugs, vulnerabilities and code smells in more than 20 different languages.

Multistage builds are useful to anyone who has struggled to optimize Dockerfiles while keeping them easy to read and maintain.

SonarQube is a set of static analyzers that can be used to identify areas of improvement in your code. It allows you to analyze the technical debt in your project and keep track of it in the future. With Maven and Gradle build tasks, you can run SonarQube analysis with minimal setup in a new or existing Azure DevOps build.

Canary deployment
With canary deployment, you deploy new application code to a small part of the production infrastructure.
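
One common way to expose only a small part of production to the new code is to route a fixed percentage of users to it. The Python sketch below shows that idea; the hashing rule and the `routes_to_canary` name are my own assumptions, and real platforms usually do this at the load balancer or gateway.

```python
import hashlib


def routes_to_canary(user_id, canary_percent):
    """Decide whether a user lands on the canary (hypothetical routing rule)."""
    # Hash the user id so the same user always gets the same answer.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket 0..99
    return bucket < canary_percent
```

Because the decision is derived from the user id rather than chosen at random per request, a user does not bounce between the old and new version while the canary is running.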

In a rolling deployment, an application’s new version gradually replaces the old one.

A blue/green deployment is a change management strategy for releasing software code. Blue/green deployments, which may also be referred to as A/B deployments, require two identical hardware environments that are configured exactly the same way.

Use the Azure SQL Database Deployment task in a build or release pipeline to deploy to Azure SQL DB using a DACPAC or run scripts using SQLCMD.

Upstream sources enable you to manage all of your product’s dependencies in a single feed.

By default, Azure DevOps Server uses TCP Port 8080.

The basic idea behind Continuous Assurance (CA) is to set up the ability to check for “drift” from what is considered a secure snapshot of a system.
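
The drift check itself boils down to comparing the current state with a known-good snapshot. This is a minimal Python sketch of that comparison; `find_drift`, the sentinel value and the setting names in the usage note are hypothetical and not taken from any Continuous Assurance tooling.

```python
MISSING = "<missing>"  # sentinel for a setting absent on one side


def find_drift(baseline, current):
    """Return settings whose current value differs from the secure baseline."""
    drift = {}
    # Look at every setting that exists on either side.
    for key in baseline.keys() | current.keys():
        expected = baseline.get(key, MISSING)
        actual = current.get(key, MISSING)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift
```

For a baseline of `{"https_only": True, "min_tls": "1.2"}` and a current state of `{"https_only": False, "min_tls": "1.2", "ftp_enabled": True}`, the result flags `https_only` as changed and `ftp_enabled` as a setting that was not in the snapshot.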

Feature isolation is a special derivation of the development isolation, allowing you to branch one or more feature branches from main or from your dev branches.

The Azure Artifacts Credential Provider automates the acquisition of credentials needed to restore NuGet packages as part of your .NET development workflow.

GitHub App uses the Azure Pipelines identity.

Clean up unnecessary files and optimize the local repository:

git gc --aggressive

Prune all unreachable objects from the object database:

git prune

Hosted pool (Azure Pipelines only): The Hosted pool is the built-in pool that is a collection of Microsoft-hosted agents.

Visual Studio Codespaces is built to accommodate the widest variety of projects or tasks, including GitHub and integrating debugging.

Azure Data Explorer is a highly scalable and secure analytics service that enables you to do rich exploration of structured and unstructured data for instant insights.

Use a variable group to store values that you want to control and make available across multiple pipelines.

The develop branch contains pre-production code. When features are finished, they are merged into develop.

Deploy latest and cancel the others: Use this option if you are producing builds faster than releases, and you only want to deploy the latest build.

You can manage technical debt with SonarQube and Azure DevOps.

Yeoman makes it easy to create Terraform modules.

Terratest provides a collection of helper functions and patterns for common infrastructure testing tasks, like making HTTP requests and using SSH to access a specific virtual machine.

ACR Tasks supports automated container image builds when a container’s base image is updated, such as when you patch the OS or application framework in one of your base images.

Thanks a lot for reading.
