40

Elasticsearch SQL 用法详解

 5 years ago
source link: https://mp.weixin.qq.com/s/mJjoTskWyT9CM-O_nCeDUQ?amp%3Butm_medium=referral
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

yIv6BfB.gif

本文详细介绍了不同版本中 Elasticsearch SQL的使用方法,总结了实际中常用的方法和操作,并给出了几个具体例子。

上篇文章回顾:看示例学awk

一、5.x中ES-SQL用法

Elasticsearch 5.x版本中,SQL功能还没有集成到Elasticsearch源码中,需要下载第三方插件后才能使用,配置过程如下: 

1.安装ES-SQL依赖node npm 

ES-SQL 5.x版本以后,安装需要依赖node和npm,先安装node和npm,安装后在检查node及npm的安装,命令如下: 

yum -y install nodejs npm 
node -v  
npm -v

2.下载ES-SQL并安装 

然后切换到ES的根目录下,执行如下命令,下载并安ES-SQL插件: 

./bin/elasticsearch-plugin install https://github.com/NLPchina/elasticsearch-sql/releases/download/5.6.3.0/elasticsearch-sql-5.6.3.0.zip

离线包安装可以执行: 

./bin/elasticsearch-plugin install file:/elasticsearch-sql-5.6.3.0.zip

3.重启ES服务 

执行完上述三步,你就可以使用SQL探索数据了,以kibana中的使用为例:

UrABNzF.jpg!web

二、6.4 Elasticsearch SQL用法

1

Elasticsearch SQL支持的数据类型

首先我们看下Elasticsearch SQL和标准SQL中数据类型的对应关系:

n26femY.jpg!web

2

Elasticsearch SQL的使用方式

Elasticsearch SQL支持三种client: REST Interface, command-line,JDBC

2.1 REST Interface

MrQbYjB.jpg!web

建议先在kibana中测试(可以一次执行多个SQL),查询通过之后把查询copy到项目中进行测试。

6.3+ Elasticsearch SQL有个非常实用的功能,就是可以用translate api把SQL语句翻译成ES DSL语句,对于学习DSL感到头痛的同学有福啦。

veqyAnn.jpg!web

2.2 command-line

命令行界面的进入方式:

./elasticsearch-sql-cli  IP:PORT(本机ip和es的端口)

进入后的界面如下:

zIj2Mzi.png!web

命令行一般作为SQL测试时使用。

2.3 JDBC

该组件为X-Pack中的收费组件,感兴趣的同学可以参考官方文档: https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-jdbc.html

3

常用SQL语句

注意:查询单个索引名一定要用""引上,否则会报错

* 查看当前用户所有的索引: “SHOW TABLES;”

YVzIj2y.jpg!web

精准查询某个索引:“SHOW TABLES LIKE ‘indexname’;”

R7BV7v6.jpg!web

通配符查询某些索引:“SHOW TABLES LIKE ‘ ’;”

aqUzAfY.jpg!web

NjyE3qR.jpg!web

RfIZZbA.jpg!web

*查看某个索引结构: “DESCRIBE table;” 或者 “DESC table;”

MBF7zau.jpg!web

上面两个命令都是“SHOW COLUMNS [ FROM | IN ] ? table”命令的别名

Q3mQjqy.jpg!web

*查看函数: “SHOW FUNCTIONS [ LIKE? pattern? ]?” 

精准查询某个函数:

I3YVjuB.jpg!web

通配符查询某些函数:

RJBVR3f.jpg!web

BnyYjyE.jpg!web

查看所有函数:

IJrAz2Z.jpg!web

常用的聚合函数: 

SELECT MIN(value_1) min, MAX(value_1) max, AVG(value_1) avg,SUM(value_1) sum,COUNT(*) count,COUNT(DISTINCT value_1) dictinct_count FROM "micloud_es_sink_zhouyongbo_test-2018.10.19”;

SELECT 语句的语法排序如下:

SELECT select_expr [, ...]
[ FROM table_name ]  
[ WHERE condition ] 
[ GROUP BY grouping_element [, ...] ]
[ HAVING condition]
[ ORDER BY expression [ ASC | DESC ] [, ...] ]
[ LIMIT [ count ] ]

*限定返回数据的条数: “limit” 

SELECT * FROM "micloud_es_sink_zhouyongbo_test-2018.10.19” limit 10 ;

注意SQL中的limit比fetch_size中的优先级高,例如下面的例子返回的是5条 :

{ 
  "query": "SELECT * FROM "micloud_es_sink_zhouyongbo_test-2018.10.19” limit 5", 
  "fetch_size":10 
}

*排序: “order by + 字段名字 + asc/desc”

SELECT * FROM "micloud_es_sink_zhouyongbo_test-2018.10.19” ORDER BY value_1 asc/desc;

根据多个字段排序: 

select city c,value_1 + 1 vp from "micloud_es_sink_zhouyongbo_test-2018.10.19" group by c,vp order by c desc,vp asc;

i26r6nQ.jpg!web

*WHERE根据条件查询:

WHERE后面跟ES复杂数据类型: 

SELECT first_name FROM index WHERE first_name.raw = ‘John’  ;

WHERE后面跟多个查询条件: 

SELECT * FROM micloud_es_sink_zhouyongbo_test* where city=‘北京' and value_1=8 ORDER BY value_1 desc ;

*group by分组查询: 

根据单个字段分组查询:

select city,count(city) as count_city,sum(value_1) as count_value_1 from "micloud_es_sink_zhouyongbo_test-2018.10.19" group by city;

juUbaqq.jpg!web

根据多个字段分组查询: 

select city,count(city) count_city,sum(value_1) count_value_1 from "micloud_es_sink_zhouyongbo_test-2018.10.19" group by city,value_1;

QVviIzZ.jpg!web

对于比较长的字段,也可以对该字段声明别名,并对别名进行分组查询,声明字段别名的“as”可省略: 

select city c,count(city) count_city,sum(value_1) count_value_1 from "micloud_es_sink_zhouyongbo_test-2018.10.19" group by c,value_1;

RRrIFb2.jpg!web

还可以对某字段进行计算,然后按照计算结果分组查询: 

select city c,value_1 + 1 vp from "micloud_es_sink_zhouyongbo_test-2018.10.19" group by c,vp;

rmyM73b.jpg!web

*HAVING过滤分组结果(ES-SQL引擎同样会在分组之后计算HAVING语句):

Select city c,count(*) count from "micloud_es_sink_zhouyongbo_test-2018.10.19" group by c having count > 53834;

uEZZfu2.png!web

*查询嵌套类型:

select * from zhouyongbo_test04 where love.kaishu=‘鲁公’;

FNRNrei.png!web

*用通配符查询多个索引:

Mry2aaa.jpg!web

注意被查询索引必须有相同的mapping,否则会有如下报错:

常用的方法和操作汇总:

*比较操作: 

Equality (=) 

select * from "micloud_es_sink_zhouyongbo_test-2018.10.19" where value_1 = 6 limit 5;

Inequality (<> or != or <=>) 

select * from "micloud_es_sink_zhouyongbo_test-2018.10.19" where value_1 <> 6 limit 5;

Comparison (<, <=, >, >=) 

select * from "micloud_es_sink_zhouyongbo_test-2018.10.19" where value_1 >= 6 limit 5;

BETWEEN

select * from "micloud_es_sink_zhouyongbo_test-2018.10.19" where value_1 between 6 and 8 limit 5;

IS NULL/IS NOT NULL

select * from "micloud_es_sink_zhouyongbo_test-2018.10.19" where value_1 is not NULL limit 5;

*逻辑操作:

AND

select * from "micloud_es_sink_zhouyongbo_test-2018.10.19" where value_1 > 5 and value_1 < 7 limit 5;

OR

select * from "micloud_es_sink_zhouyongbo_test-2018.10.19" where value_1 = 5 or value_1 = 7 limit 5;

NOT

select * from "micloud_es_sink_zhouyongbo_test-2018.10.19" where not value_1 > 5 limit 5;

*数学运算操作: 

Add (+)

select  1 + 1 as x;

Subtract (infix -) 

select  1 - 1 as x;

Negate (unary -) 

select  - 1 as x;

Multiply (*) 

select  6 * 6 as x;

Divide (/) 

select  30 / 5 as x;

Modulo or Reminder(%) 

select  30 % 7 as x;

*数学函数:(分为通用函数和三角函数两部分 ):

通用函数: 

ABS:求数字的绝对值 

select ABS(value_1) from "micloud_es_sink_zhouyongbo_test-2018.10.19" limit 5;

CBRT:求数字的立方根,返回double

select value_1 v,CBRT(value_1) cbrt from "micloud_es_sink_zhouyongbo_test-2018.10.19" limit 5;

CEIL:返回大于或者等于指定表达式最小整数(double)

select value_1 v,CEIL(value_1) from "micloud_es_sink_zhouyongbo_test-2018.10.19" limit 5;

CEILING:等同于CEIL

select value_1 v,CEILING(value_1) from "micloud_es_sink_zhouyongbo_test-2018.10.19" limit 5;

E:返回自然常数e(2.718281828459045)

select  value_1,E(value_1)  from  "micloud_es_sink_zhouyongbo_test-2018.10.19"  limit  5;

ROUND:四舍五入精确到个位

select ROUND(-3.14);

FLOOR:向下取整

select FLOOR(3.14);

LOG:计算以2为底的自然对数

select LOG(4);

LOG10:计算以10为底的自然对数

select LOG10(100);

SQRT:求一个非负实数的平方根

select  SQRT(9);

EXP:此函数返回e(自然对数的底)的X次方的值

select  EXP(3);

EXPM1:返回e x  -1

select  EXPM1(3);

三角函数:

DEGREES:返回X从弧度转换为度值

select DEGREES(x);

RADIANS:返回X从度转换成弧度的值

select RADIANS(x);

SIN:返回X的正弦

select SIN(x);

COS:返回X,X值是以弧度给出的余弦值

select COS(角度);

TAN:返回参数X,表示以弧度的切线值

select TAN(角度);

ASIN:返回X的反正弦,X的值必须在-1至1范围内,返回NULL

select ASIN(x);

ACOS:返回X的反正弦,X值必须-1到1之间范围否则将返回NULL

select ACOS(x);

ATAN:返回X的反正切

select ATAN(x);

SINH:返回X的双曲正弦值

select SINH(x);

COSH:返回X的双曲余弦值

select COSH(x);

*日期和时间处理相关方法:

YEAR:

SELECT YEAR(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS year;

MONTH_OF_YEAR() or MONTH():

SELECT MONTH(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS month;

WEEK_OF_YEAR() or WEEK():

SELECT WEEK(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS week;

DAY_OF_YEAR() or DOY(),效果等同于EXTRACT(<datetime_function> FROM <expression>):

SELECT DOY(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS day;

DAY_OF_MONTH(), DOM(), or DAY():

SELECT DAY(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS day;

DAY_OF_WEEK() or DOW():

SELECT DOW(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS day;

HOUR_OF_DAY() or HOUR():

SELECT HOUR(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS hour;

MINUTE_OF_DAY():

SELECT MINUTE_OF_DAY(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS minute;

MINUTE_OF_HOUR() or MINUTE():

SELECT MINUTE(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS minute;

SECOND_OF_MINUTE() or SECOND():

SELECT SECOND(CAST('2018-10-23T16:59:27Z' AS TIMESTAMP)) AS second;

如上就是6.4 Elasticsearch SQL支持的主要用法了,如果在优化SQL语句之后还不满足查询需求,可以拿SQL和DSL混用,ES会先根据SQL进行查询,然后根据DSL语句对SQL的执行结果进行二次查询,下面是个小例子:

POST /_xpack/sql?format=txt 
{ 
    "query": "SELECT * FROM library ORDER BY page_count DESC", 
    "filter": { 
        "range": { 
            "page_count": { 
                "gte" : 100, 
                "lte" : 200 
            } 
        } 
    }, 
    "fetch_size": 5
}

这个查询就会先根据“query”后面的SQL进行查询,然后用执行“filter”和“fetch_size” DSL语法对查询结果进行过滤,进而返回最终结果。

参考文档: 

6.4.0 Elasticsearch SQL新特性简介:

https://www.elastic.co/cn/products/stack/elasticsearch-sql  

6.4.0 Elasticsearch SQL使用文档:

https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-sql.html

BRfAnui.jpg!web


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK