SLS Machine Learning Best Practices: Time Series Anomaly Detection

The SLS platform provides machine learning functions for time series anomaly detection, which help users inspect and analyze their metrics more efficiently. The full list of functions is documented at: https://help.aliyun.com/document_detail/93210.html
By combining these functions we can build an inspection chart. The rest of this article breaks down, step by step, how to get to that result:

  • The most complex inspection SQL looks like this:
* |
SELECT res.name AS INSTANCE
FROM
  (SELECT ts_anomaly_filter(INSTANCE, ts, ds, preds, probs, cast(5 AS bigint), cast(1 AS bigint)) AS res
   FROM
     (SELECT INSTANCE,
             res[1] AS ts,
             res[2] AS ds,
             res[3] AS preds,
             res[4] AS uppers,
             res[5] AS lowers,
             res[6] AS probs
      FROM
        (SELECT INSTANCE,
                array_transpose(ts_predicate_arma(TIME, value, 5, 1, 1, 1, 1, TRUE)) AS res
         FROM
           (SELECT (TIME/1000) AS TIME,
                   labels['instance'] AS INSTANCE,
                   value
            FROM
              (SELECT promql_query_range('1 - avg(irate(node_cpu_seconds_total{instance=~".*",mode="idle"}[10m])) by (instance) ', '10m') AS t
               FROM metrics)
            ORDER BY TIME ASC)
         GROUP BY INSTANCE)))

Let's take this SQL apart and see how the result is built up one step at a time.

  • First, we fetch the objects we want to check:
* |
SELECT (TIME/1000) AS TIME,
       labels['instance'] AS INSTANCE,
       value
FROM
  (SELECT promql_query_range('1 - avg(irate(node_cpu_seconds_total{instance=~".*",mode="idle"}[10m])) by (instance) ', '10m') AS t
   FROM metrics)

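Here we use PromQL in SLS to fetch a CPU metric for each of the N monitored instances at a 10-minute step. The expression takes the idle-mode rate and subtracts it from 1, so the resulting value is CPU utilization rather than the raw idle figure. To make the data easier to read, we can visualize the series with a flow chart.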
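
To make the shape of this intermediate result concrete, here is a minimal Python sketch of the same reshaping done outside SLS. The sample rows, the instance names, and the helper `to_series` are hypothetical, and the sketch assumes that TIME arrives in milliseconds, which the division by 1000 in the query suggests.

```python
from collections import defaultdict

# Made-up rows in the shape the query selects: (TIME in ms, labels, value).
rows = [
    (1600000600000, {"instance": "10.0.0.1:9100"}, 0.22),
    (1600000000000, {"instance": "10.0.0.1:9100"}, 0.20),
    (1600000000000, {"instance": "10.0.0.2:9100"}, 0.55),
]


def to_series(rows):
    """Group flat rows into one time-ordered list of (seconds, value) per instance."""
    series = defaultdict(list)
    for time_ms, labels, value in rows:
        # Mirrors (TIME/1000) AS TIME and labels['instance'] AS INSTANCE.
        series[labels["instance"]].append((time_ms // 1000, value))
    # Sort each instance's points by timestamp so the series is in time order.
    return {name: sorted(points) for name, points in series.items()}


series = to_series(rows)
for name, points in series.items():
    print(name, points)
```

Each key of `series` corresponds to one of the N lines that the following steps run detection on.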

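  • Next, we run anomaly detection on the N series we fetched. SLS provides anomaly detection functions that also work with GROUP BY, so all of the series can be inspected conveniently in a single query: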
* |
SELECT INSTANCE,
       ts_predicate_arma(TIME, value, 5, 1, 1, 1.0, 1.0, TRUE)
FROM
  (SELECT (TIME/1000) AS TIME,
          labels['instance'] AS INSTANCE,
          value
   FROM
     (SELECT promql_query_range('1 - avg(irate(node_cpu_seconds_total{instance=~".*",mode="idle"}[10m])) by (instance) ', '10m') AS t
      FROM metrics))
GROUP BY INSTANCE

With this SQL we can easily run anomaly detection on all N series. The result is a table whose first column is the instance and whose second column holds the detection result for that instance's series. But how do we work with such a complex result?
For ts_predicate_arma, SLS provides functions that parse and transform the model output. We start by transposing the arrays in the detection result.

* |
SELECT INSTANCE,
       array_transpose(ts_predicate_arma(TIME, value, 5, 1, 1, 1.0, 1.0, TRUE)) AS res
FROM
  (SELECT (TIME/1000) AS TIME,
          labels['instance'] AS INSTANCE,
          value
   FROM
     (SELECT promql_query_range('1 - avg(irate(node_cpu_seconds_total{instance=~".*",mode="idle"}[10m])) by (instance) ', '10m') AS t
      FROM metrics))
GROUP BY INSTANCE
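
The effect of the transpose can be illustrated with a short Python sketch. It assumes that the detection output for one instance is a list of per-point rows whose fields appear in the order the article later names them (ts, ds, preds, uppers, lowers, probs). The numbers are made up and `transpose` is a hypothetical helper, not the SLS function itself.

```python
# Made-up detection output for one instance: one row per point, assumed to be
# ordered as (ts, ds, preds, uppers, lowers, probs).
rows = [
    [1600000000, 0.20, 0.21, 0.30, 0.10, 0.0],
    [1600000600, 0.22, 0.21, 0.30, 0.10, 0.0],
    [1600001200, 0.95, 0.22, 0.31, 0.11, 1.0],
]


def transpose(matrix):
    """Turn a list of equal-length rows into a list of columns."""
    return [list(column) for column in zip(*matrix)]


res = transpose(rows)

# Python lists are 0-based; the SQL arrays in the article are 1-based,
# so res[0] here plays the role of res[1] in the query.
ts, ds, preds, uppers, lowers, probs = res
print(ts)
print(probs)
```

After the transpose, each field is a single array covering the whole series, which is the layout the later filtering step takes as input.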

With array_transpose applied, the function result is in a usable shape. We then unnest it by indexing the transposed array into named columns for further processing.

* |
SELECT INSTANCE,
       res[1] AS ts,
       res[2] AS ds,
       res[3] AS preds,
       res[4] AS uppers,
       res[5] AS lowers,
       res[6] AS probs
FROM
  (SELECT INSTANCE,
          array_transpose(ts_predicate_arma(TIME, value, 5, 1, 1, 1.0, 1.0, TRUE)) AS res
   FROM
     (SELECT (TIME/1000) AS TIME,
             labels['instance'] AS INSTANCE,
             value
      FROM
        (SELECT promql_query_range('1 - avg(irate(node_cpu_seconds_total{instance=~".*",mode="idle"}[10m])) by (instance) ', '10m') AS t
         FROM metrics))
   GROUP BY INSTANCE)

This gives us, for each instance, the columns ts, ds, preds, uppers, lowers, and probs.
From this result we need to pick out the anomalies we care about. The ts_anomaly_filter function handles this step. For details, see the documentation: https://help.aliyun.com/document_detail/93210.html
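
As a rough mental model of this filtering step, the Python sketch below keeps only the instances that show an anomaly among their most recent points. It is a conceptual illustration, not the SLS implementation. The `Series` type, the `filter_anomalous` helper, and the sample data are hypothetical, and the reading of the two trailing arguments in the article's call (5 as the number of recent points to watch, 1 as "upward anomalies only") is an assumption to check against the documentation linked above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Series:
    name: str            # instance label
    ds: List[float]      # observed values, oldest first
    preds: List[float]   # predicted values
    probs: List[float]   # per-point anomaly probability


def filter_anomalous(series_list, n_watch=5, direction=1):
    """Return names of series with an anomaly among the last n_watch points.

    direction=1 keeps points above the prediction, -1 points below it, and
    0 keeps both. This interpretation is an assumption, not SLS behaviour.
    """
    hits = []
    for s in series_list:
        recent = list(zip(s.ds, s.preds, s.probs))[-n_watch:]
        for observed, predicted, prob in recent:
            if prob <= 0:
                continue  # point was not flagged as anomalous
            delta = observed - predicted
            if (direction == 0
                    or (direction > 0 and delta > 0)
                    or (direction < 0 and delta < 0)):
                hits.append(s.name)
                break
    return hits


# Made-up examples: a recent spike, an old spike, and a recent drop.
spike = Series("node-a", ds=[0.2] * 5 + [0.9], preds=[0.2] * 6, probs=[0.0] * 5 + [1.0])
stale = Series("node-b", ds=[0.9] + [0.2] * 5, preds=[0.2] * 6, probs=[1.0] + [0.0] * 5)
drop = Series("node-c", ds=[0.2] * 5 + [0.01], preds=[0.2] * 6, probs=[0.0] * 5 + [1.0])

print(filter_anomalous([spike, stale, drop]))
```

Under these assumptions only `node-a` is reported: the anomaly on `node-b` falls outside the five most recent points, and the one on `node-c` points downward.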
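That covers everything in the complex SQL we started with. Once we have the table result, we can use the drill-down configuration in SLS to carry out the follow-up analysis. It can be configured as follows: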
Configure the DrillDown action so that the data can be visualized and explored interactively.
With this in place, selecting a row jumps to the corresponding view.