王浩天 committed on 2022-05-16 17:36: February to May 2022

Elasticsearch

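1. Installing and configuring Elasticsearch

Download: https://www.elastic.co/cn/downloads/elasticsearch

After downloading, extract the archive, go to the bin directory, and run elasticsearch.bat to start a local ES service.

Open http://localhost:9200/ in a browser to check whether the ES service started successfully.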
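
The same check can be scripted. The following is a minimal sketch using only the Python standard library. It assumes the node answers plain HTTP at the default http://localhost:9200/ without authentication; the function names `fetch_root` and `summarize_root` are illustrative and not part of any Elasticsearch library.

```python
import json
import urllib.request


def summarize_root(info):
    """Build a one-line summary from the JSON document returned by GET /."""
    version = info.get("version", {}).get("number", "unknown")
    return f"node={info.get('name')} cluster={info.get('cluster_name')} version={version}"


def fetch_root(url="http://localhost:9200/", timeout=5):
    """Fetch and decode the root endpoint.

    Raises urllib.error.URLError if nothing is listening at the URL.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `print(summarize_root(fetch_root()))` should print the node name, cluster name, and version; a `URLError` suggests the service did not start or is listening on a different port.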

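Configuring Elasticsearch

Configuration file: \config\elasticsearch.yml

In this file you can change settings such as the port number (http.port) and, in version 8.0.0, whether security is enabled (xpack.security.enabled).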

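2. Kibana

Kibana is an open-source analytics and visualization platform for Elasticsearch. It is used to search, view, and interact with data stored in Elasticsearch indices, and it supports advanced data analysis and presentation through a variety of charts.

Download: https://www.elastic.co/cn/downloads/kibana

Note: the Kibana version must match the Elasticsearch version; otherwise errors may occur.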
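
A quick way to guard against a mismatch is to compare the two version strings before installing. This is a minimal sketch that assumes versions are written as "major.minor.patch" (for example "7.14.0"); `parse_version` and `versions_match` are hypothetical helper names, and the check only tests for exact equality, as the note above recommends, rather than encoding Elastic's official compatibility rules.

```python
def parse_version(text):
    """Turn "7.14.0" into (7, 14, 0); a suffix such as "-SNAPSHOT" is ignored."""
    core = text.strip().split("-")[0]
    return tuple(int(part) for part in core.split("."))


def versions_match(es_version, kibana_version):
    """Return True when both version strings name exactly the same release."""
    return parse_version(es_version) == parse_version(kibana_version)


print(versions_match("7.14.0", "7.14.0"))  # True
print(versions_match("7.14.0", "8.0.0"))   # False
```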

Configuring Kibana

Configuration file: \config\kibana.yml. The relevant settings are commented out by default:

# Kibana is served by a back end server. This setting specifies the port to use.
#server.port: 5601

# The URLs of the Elasticsearch instances to use for all your queries.
# elasticsearch.hosts: ["http://localhost:9200"]
 
 # If your Elasticsearch is protected with basic authentication, these settings provide
# the username and password that the Kibana server uses to perform maintenance on the Kibana
# index at startup. Your Kibana users still need to authenticate with Elasticsearch, which
# is proxied through the Kibana server.
# elasticsearch.username: "elastic"
# elasticsearch.password: "pass"

After editing, with the Elasticsearch address and credentials uncommented and filled in:

# Kibana is served by a back end server. This setting specifies the port to use.
#server.port: 5601

# The URLs of the Elasticsearch instances to use for all your queries.
 elasticsearch.hosts: "http://172.18.20.7:9200"
 
 # If your Elasticsearch is protected with basic authentication, these settings provide
# the username and password that the Kibana server uses to perform maintenance on the Kibana
# index at startup. Your Kibana users still need to authenticate with Elasticsearch, which
# is proxied through the Kibana server.
 elasticsearch.username: "elastic"
 elasticsearch.password: "123456"

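Note: when uncommenting a setting you do not need to delete the space that was already there, as long as every uncommented setting keeps the same leading indentation (YAML requires sibling keys to be aligned).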

3. Basic ES operations with Kibana

Open the Kibana UI and click "Dev Tools".

(1) Create an index

PUT test_index

test_index is a user-chosen index name.
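
Outside Dev Tools, the same operation is a plain HTTP PUT against the node. The sketch below uses only the Python standard library and assumes an unsecured local node at http://localhost:9200; `build_create_index_request` is an illustrative name. It builds the request without sending it.

```python
import urllib.request


def build_create_index_request(index, base_url="http://localhost:9200"):
    """Build (without sending) the HTTP request behind `PUT <index>`."""
    return urllib.request.Request(f"{base_url}/{index}", method="PUT")


req = build_create_index_request("test_index")
print(req.get_method(), req.full_url)  # PUT http://localhost:9200/test_index
```

Passing `req` to `urllib.request.urlopen` would actually create the index, provided the node is reachable and the index does not exist yet.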

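(2) Query

Query all documents:

GET test_index/_search

The same search with an explicit request body, sorted by datetime in descending order (the request can be sent as either GET or POST):

GET test_index/_search
{
  "query": {
    "match_all": { }
  },
  "sort": [
    {
      "datetime": "desc"
    }
  ]
}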
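
To show how the request body is assembled, here is a hedged standard-library sketch that builds the same sorted match_all search as an HTTP request. It assumes an unsecured node at http://localhost:9200, and `build_search_request` is a hypothetical helper; nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_search_request(index, sort_field=None, order="desc",
                         base_url="http://localhost:9200"):
    """Build (without sending) a match_all search, optionally sorted by one field."""
    body = {"query": {"match_all": {}}}
    if sort_field is not None:
        body["sort"] = [{sort_field: order}]
    return urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("test_index", sort_field="datetime")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```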

View the mapping:

GET test_index/_mapping

(3) Add a field:

The field is named url and has type text.

POST test_index/_mapping
{
	"properties":{
		"url":{
			"type":"text"
		}
	}
}
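
When several fields need to be added, the mapping body can be generated instead of typed by hand. This is a minimal sketch; `build_field_mapping` is a hypothetical helper, and the client call in the trailing comment assumes the 7.x Python client is installed and connected.

```python
def build_field_mapping(fields):
    """Build a mapping body from {field_name: field_type} pairs."""
    return {"properties": {name: {"type": ftype} for name, ftype in fields.items()}}


mapping_body = build_field_mapping({"url": "text"})
print(mapping_body)  # {'properties': {'url': {'type': 'text'}}}

# With the 7.x Python client, this body could be sent as:
#   Elasticsearch().indices.put_mapping(index="test_index", body=mapping_body)
```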

(4) Bulk-update documents:

Set the url field of every document to "http://tianya.com".

POST test_index/_update_by_query
{
  "query": {
    "match_all": {}
  },
  "script": {
    "source": "ctx._source.url = params.last",
    "lang": "painless",
    "params": {
      "last":"http://tianya.com"
    }
  }
}

source holds the script to run, painless is the scripting language, and last is a parameter name whose value is given under params.
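
The request body above can be generalized to any field and value. The sketch below is one possible way to do that; `build_set_field_body` is a hypothetical helper, and the parameter is named `value` here instead of `last`. The new value travels through params rather than being pasted into the script source, which keeps the script text the same from call to call.

```python
def build_set_field_body(field, value):
    """Build an _update_by_query body that sets `field` to `value` on every document."""
    if not field.isidentifier():
        # The field name is interpolated into the script source,
        # so only plain identifiers are accepted in this sketch.
        raise ValueError(f"unsupported field name: {field!r}")
    return {
        "query": {"match_all": {}},
        "script": {
            "source": f"ctx._source.{field} = params.value",
            "lang": "painless",
            "params": {"value": value},
        },
    }


body = build_set_field_body("url", "http://tianya.com")
print(body["script"]["source"])  # ctx._source.url = params.value

# With the 7.x Python client (assumed installed and connected), this could be sent as:
#   Elasticsearch().update_by_query(index="test_index", body=body)
```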

4. Working with Elasticsearch from Python

Install the elasticsearch package:

pip install elasticsearch==7.14.0

Install the client version that matches your ES server.

import json
import os
from time import time

from elasticsearch import Elasticsearch, helpers


def generateESInfo(info, esIndex):
    """Wrap one parsed record as a bulk action for helpers.bulk."""
    return {
        "_index": esIndex,
        # If _id is omitted, Elasticsearch generates one on upload.
        "_source": {
            "title": info['title'],
            "content": info['content'],
            "recordDate": info['recordDate'],
            "createDate": int(time()),
            "status": [0],
            "source": info['source'],
            "url": info['url'],
            "originData": info
        }
    }


def importLocalData(dataFile):
    esClient = Elasticsearch()  # With no arguments, the 7.x client connects to the local ES service
    # To connect to a remote server instead:
    # esClient = Elasticsearch(hosts=[{"host": "172.18.20.7", "port": 9200}], http_auth=('elastic', '123456'))
    esIndex = "myindex"  # Index name
    batchSize = 200  # Documents per bulk request

    actions = []
    count = 0
    with open(dataFile, "r", encoding="utf8") as robj:
        for line in robj:
            if not line.strip():
                continue  # Skip blank lines, which json.loads would reject
            info = json.loads(line)
            actions.append(generateESInfo(info, esIndex))
            if len(actions) == batchSize:
                try:
                    helpers.bulk(esClient, actions)
                    actions = []
                    count += batchSize
                    print(count)
                except Exception as e:
                    # Stop at the first failed batch and report how many documents were uploaded
                    print(e)
                    print(count)
                    return
    # Upload whatever is left after the last full batch
    if len(actions) > 0:
        helpers.bulk(esClient, actions)
        count += len(actions)
        print(count)


# Upload the local file forum_data2.json to ES
importLocalData(os.path.join(os.path.abspath("."), "forum_data2.json"))
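
The script above expects the input file to hold one JSON object per line, each with the keys title, content, recordDate, source, and url. The sketch below illustrates that layout and the batching logic without contacting Elasticsearch. The sample records (including the recordDate format) and the helper name `iter_batches` are hypothetical, since the real forum_data2.json is not shown in these notes.

```python
import json
import os
import tempfile

# Hypothetical sample records with the keys that generateESInfo reads.
records = [
    {"title": "t1", "content": "c1", "recordDate": "2022-05-16",
     "source": "forum", "url": "http://example.com/1"},
    {"title": "t2", "content": "c2", "recordDate": "2022-05-16",
     "source": "forum", "url": "http://example.com/2"},
    {"title": "t3", "content": "c3", "recordDate": "2022-05-16",
     "source": "forum", "url": "http://example.com/3"},
]


def iter_batches(path, batch_size):
    """Yield lists of parsed records, at most batch_size per list."""
    batch = []
    with open(path, "r", encoding="utf8") as fobj:
        for line in fobj:
            if not line.strip():
                continue  # ignore blank lines
            batch.append(json.loads(line))
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    # Write one JSON object per line, the layout the import script reads.
    with open(path, "w", encoding="utf8") as fobj:
        for rec in records:
            fobj.write(json.dumps(rec, ensure_ascii=False) + "\n")
    batch_sizes = [len(batch) for batch in iter_batches(path, 2)]

print(batch_sizes)  # [2, 1]
```

Each yielded batch corresponds to one `helpers.bulk` call in the import script, with the last batch carrying the leftover records.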

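5. Basic concepts

Main concepts of the Elasticsearch data model, compared with the relational database MySQL:

| MySQL | Elasticsearch |
| --- | --- |
| Database | Index |
| Table | Type |
| Row | Document |
| Column | Field |
| Schema | Mapping |
| Index | Everything is indexed |

Note that mapping types (Type) are deprecated in Elasticsearch 7.x and removed in 8.0, so in recent versions an index holds documents directly.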