Background
As SLS alerting is used more deeply, some users' action policies become particularly complicated to configure. Sometimes the configuration can be simplified by creating multiple action policies, but in some scenarios that is not possible. For example, users who want to manage the alerts of their various monitoring systems through SLS use the SLS open alert function. In this case, users generally create only one open alert application per monitoring system, and that application can correspond to only one action policy. As the number of cases that require dynamic alert dispatch grows, the action policy expands dramatically, which leads to the following problems:
- The action policy cannot be saved from the console
- The console loads too slowly when the policy is edited
- Alert delay increases
To address these problems, this article introduces three optimization schemes.
Scheme comparison
| | Use alert policies to split action policies | Use the SDK to compress action policy content | Use dynamic recipients |
| --- | --- | --- | --- |
| Applicable scenario | You are familiar with alert policies, and your alerts have distinct label and annotation characteristics. | You are proficient with the alert SDK and familiar with the alert-related DSL syntax. | You have your own enterprise user management system, or your dispatch logic cannot be expressed in an action policy. |
| Advantages | Clear management and less error-prone. | Can greatly simplify action policies. | Simple configuration on the SLS side. |
| Disadvantages | Tedious configuration. | High learning cost and error-prone. | You must implement a webhook service that dynamically assigns recipients, and only the SMS, voice, and email notification channels are supported. |
Use alert policies to split action policies
When you configure the routing and merging of an alert policy, you can send alerts to different group merging configurations based on alert information, and each group merging configuration can select a different action policy. If you manually set the rest of each group merging configuration to match the default alert policy, the action policy is split without changing any other behavior. (In the group merging of the default alert policy, the merge base is Custom, the alert attributes are the alert rule ID and the project where the rule resides, the alert labels are all selected, the first wait is 1 second, the change wait is 15 seconds, and the repeat wait is 1 minute.)
As shown in the figure below, if a single action policy is used, it must handle both the case where the appName label is app0 and the case where it is app1. After splitting as shown in the figure below, action policy 0 only needs to handle appName app0 and action policy 1 only needs to handle appName app1, which completes the split.
Finally, when you create an alert monitoring rule or an open alert application, select the alert policy created above.
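The routing idea above can be modeled in a few lines of Go. This is a conceptual sketch only, not SLS code: the `mergeGroup` type, the `selectActionPolicy` function, and the policy IDs are hypothetical names chosen for illustration, and in SLS the routing is configured in the alert policy rather than written by hand.

```go
package main

import "fmt"

// mergeGroup models one group merging branch of an alert policy (hypothetical
// type): alerts whose label matches are handed to the named action policy.
type mergeGroup struct {
	labelKey       string
	labelValue     string
	actionPolicyID string
}

// selectActionPolicy returns the action policy of the first matching group,
// or the fallback ID if no group matches.
func selectActionPolicy(groups []mergeGroup, labels map[string]string, fallback string) string {
	for _, g := range groups {
		if labels[g.labelKey] == g.labelValue {
			return g.actionPolicyID
		}
	}
	return fallback
}

func main() {
	groups := []mergeGroup{
		{"appName", "app0", "action-policy-0"},
		{"appName", "app1", "action-policy-1"},
	}
	alertLabels := map[string]string{"appName": "app1"}
	fmt.Println(selectActionPolicy(groups, alertLabels, "default-policy"))
}
```

Because each branch owns exactly one label value, every action policy only has to describe the notifications for its own application.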
Use the SDK to compress action policy content
When an action policy is configured in the SLS console, the console also saves UI information for each node, so the stored configuration becomes very large and can easily exceed the size limit of the resource data, which causes the save from the console to fail. If the action policy is managed through the SDK instead, the additional UI information from the console is omitted and the action policy is much smaller. For example, the following code creates an action policy.
    package main

    import (
    	"fmt"

    	sls "github.com/aliyun/aliyun-log-go-sdk"
    )

    var (
    	// The service endpoint of Log Service. Resources must be created in the Heyuan region.
    	endpoint = "cn-heyuan.log.aliyuncs.com"
    	// Alibaba Cloud AccessKey. For more information, see Access Keys.
    	// The AccessKey of an Alibaba Cloud account has access to all APIs, which is very risky.
    	// It is strongly recommended that you create and use a RAM user for API access or daily operations.
    	accessKeyId     = ""
    	accessKeySecret = ""
    	// Create a Log Service client.
    	client = sls.CreateNormalInterface(endpoint, accessKeyId, accessKeySecret, "")
    )

    func main() {
    	actionPolicy := &sls.ResourceActionPolicy{
    		ActionPolicyId:   "test-action-policy",
    		ActionPolicyName: "Test Action Policy",
    		// The primary policy script is written in the alert DSL.
    		PrimaryPolicyScript: "if alert.labels.appName == \"app0\":\n" +
    			"    fire(type=\"sms\", users=[\"user1\"], groups=[], oncall_groups=[], receiver_type=\"static\", external_url=\"\", external_headers={}, template_id=\"sls.builtin.cn\", check_quota=\"true\", period=\"any\")\n" +
    			"    stop()\n" +
    			"if alert.labels.appName == \"app1\":\n" +
    			"    fire(type=\"email\", users=[\"user2\"], groups=[], oncall_groups=[], receiver_type=\"static\", external_url=\"\", external_headers={}, template_id=\"sls.builtin.cn\", check_quota=\"true\", period=\"any\")\n" +
    			"    stop()\n" +
    			"fire(type=\"webhook_integration\", integration_type=\"dingtalk\", webhook_id=\"user3\", template_id=\"sls.builtin.cn\", period=\"any\")",
    		SecondaryPolicyScript:       "",
    		EscalationStartTimeout:      "10m",
    		EscalationInprogressEnabled: false,
    		EscalationInprogressTimeout: "30m",
    		EscalationEnabled:           true,
    		EscalationTimeout:           "1h",
    	}
    	// Action policies are stored as resource records.
    	record := &sls.ResourceRecord{
    		Id:    actionPolicy.ActionPolicyId,
    		Tag:   actionPolicy.ActionPolicyName,
    		Value: sls.JsonMarshal(actionPolicy),
    	}
    	err := client.CreateResourceRecord("sls.alert.action_policy", record)
    	fmt.Println("[create action policy]", err)
    }
Expanded, the DSL script of the primary action policy (PrimaryPolicyScript) is as follows:
    if alert.labels.appName == "app0":
        fire(type="sms", users=["user1"], groups=[], oncall_groups=[], receiver_type="static", external_url="", external_headers={}, template_id="sls.builtin.cn", check_quota="true", period="any")
        stop()
    if alert.labels.appName == "app1":
        fire(type="email", users=["user2"], groups=[], oncall_groups=[], receiver_type="static", external_url="", external_headers={}, template_id="sls.builtin.cn", check_quota="true", period="any")
        stop()
    fire(type="webhook_integration", integration_type="dingtalk", webhook_id="user3", template_id="sls.builtin.cn", period="any")
After the action policy is created, click Edit in the console to view it, as shown in the figure below. An action policy created through the SDK carries no UI information, but it still runs normally.
The equivalent action policy with UI information is shown in the figure below.
Use dynamic recipients
SLS provides a dynamic recipient function, which sets the recipients of alert notifications dynamically through a webhook service. The webhook service can not only return the contacts to notify according to the SLS user model, but also dispatch alerts dynamically, which gives it the same capabilities as an action policy. In addition, action policies cannot dispatch on certain special content (such as the fire_results field of an alert), so in that case you must use dynamic recipients.
As shown in the figure below, after dynamic recipients are used, the action policy needs only one action node and becomes concise.
This article is the original content of Alibaba Cloud and may not be reproduced without permission.